Re: [HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-02-02 Thread David Steele
On 2/2/16 1:01 PM, Corey Huinker wrote:
> Doh, I left that comment to myself in there. :)

If I had a dime for every time I've done that...

> The corresponding structs in timestamp.c and int.c have no comment, so
> suggestions of what should be there are welcome. In the interim I put in
> this:
> 
> /* state for generate_series_date(date,date,[step]) */

I think that's fine -- whoever commits it may feel otherwise.

> Do you have any insight as to why the documentation test failed?
> 
> In the mean time, here's the updated patch.

I didn't pass the docs on account of the wonky comment since I consider
code comments to be part of the docs.  The sgml docs build fine and look
good to me.

I'll retest and update the review accordingly.

-- 
-David
da...@pgmasters.net





[HACKERS] left, right, full sort merge join plan

2016-02-02 Thread CK Tan
Hi Hackers,

I am looking for some help in creating LEFT/RIGHT/FULL sort-merge-join.
Does anyone have a complete and reproducible script that would generate
those plans? Can I find it in the regression test suite? If not, how do you
exercise those code paths for QA purposes?
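
For concreteness, the kind of script I'm imagining looks roughly like this
(table names, sizes, and the enable_* toggles are just illustrative guesses,
not taken from any existing test):

  -- Rough sketch: steer the planner away from hash and nestloop joins so the
  -- merge-join paths are the ones exercised.
  SET enable_hashjoin = off;
  SET enable_nestloop = off;

  CREATE TABLE mj_a (id int, pad text);
  CREATE TABLE mj_b (id int, pad text);
  INSERT INTO mj_a SELECT g, 'a' FROM generate_series(1, 50000) g;
  INSERT INTO mj_b SELECT g, 'b' FROM generate_series(25000, 75000) g;
  ANALYZE mj_a;
  ANALYZE mj_b;

  EXPLAIN SELECT * FROM mj_a LEFT JOIN mj_b USING (id);  -- expect Merge Left Join
  EXPLAIN SELECT * FROM mj_b LEFT JOIN mj_a USING (id);  -- may appear as Merge Right Join
  EXPLAIN SELECT * FROM mj_a FULL JOIN mj_b USING (id);  -- expect Merge Full Join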

Thanks!

-cktan


Re: [HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-02-02 Thread Corey Huinker
Doh, I left that comment to myself in there. :)

The corresponding structs in timestamp.c and int.c have no comment, so
suggestions of what should be there are welcome. In the interim I put in
this:

/* state for generate_series_date(date,date,[step]) */


Extra linefeed after struct removed.

Do you have any insight as to why the documentation test failed?

In the mean time, here's the updated patch.
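
For anyone skimming the thread, the calls the patch is meant to enable look
roughly like this (dates and step are illustrative; the exact signature and
output are whatever the attached patch implements):

  -- Proposed two- and three-argument forms; the third argument is a step in days.
  SELECT * FROM generate_series(date '2016-02-01', date '2016-02-05');
  SELECT * FROM generate_series(date '2016-02-01', date '2016-03-31', 7);  -- weekly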


On Tue, Feb 2, 2016 at 11:41 AM, David Steele  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:tested, failed
>
> Everything looks good except for two minor issues:
>
> 1) There should be an informative comment for this struct:
>
> +/* Corey BEGIN */
> +typedef struct
> +{
> +   DateADT current;
> +   DateADT stop;
> +   int32   step;
> +} generate_series_date_fctx;
>
> 2) There's an extra linefeed after the struct.  Needed?
>
> Regards,
> -David
>
> The new status of this patch is: Waiting on Author
>
>


Attachment: v3.generate_series_date.patch



Re: [HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-02-02 Thread Corey Huinker
>
>
> > Do you have any insight as to why the documentation test failed?
> >
> > In the mean time, here's the updated patch.
>
> I didn't pass the docs on account of the wonky comment since I consider
> code comments to be part of the docs.  The sgml docs build fine and look
> good to me.
>
>
Ah, understood. I rebuilt the docs thinking there was a make check failure
I missed, then viewed the html in links (I develop on an AWS box) and saw
nothing untoward.


> I'll retest and update the review accordingly.
>

Thanks!


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Jim Nasby

On 2/2/16 10:34 AM, Joshua D. Drake wrote:

Auditing is a pretty security/enterprisey-related thing that could do
with the "officially considered part of the PostgreSQL project, standard
and ready for production" rubber-stamp that tends to go along with most
end-user/admin-oriented stuff shipped in the tarball.


Which is exactly why I think .Org needs an official "Extensions" project
which would completely eliminate these arguments. A project team
explicitly for vetting extensions.


Yeah, it's disappointing that PGXN doesn't seem to have really taken 
off. I'm sure a big part of that is the need for even SQL extensions to 
have server access, but I suspect part of it is because it's a separate 
project.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Patch: make behavior of all versions of the "isinf" function be similar

2016-02-02 Thread Tom Lane
Vitaly Burovoy  writes:
> On 1/31/16, Tom Lane  wrote:
>> 2. POSIX:2008 only requires that "The isinf() macro shall return a
>> non-zero value if and only if its argument has an infinite value."

> Ok, then I'll use "is_infinite" from "float.c".

Yeah, that's good.

> But why is the behavior of the functions (in "src/port/isinf.c") different?
> It is a bit confusing…

Probably the authors of those different implementations were making it
match the behavior of whatever isinf() they had locally.  I don't feel
a need to change isinf.c --- it should be zero-maintenance at this
point, especially if we suspect it is no longer used anywhere.

regards, tom lane




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Jim Nasby

On 2/2/16 5:00 AM, Simon Riggs wrote:

Since you've written the email here, I'd ask that you join our community
and use your knowledge and passion to make things happen.


+1. Kudos for speaking up in the first place, but it's clear that right 
now the biggest thing holding Postgres back is lack of reviewers, 
followed by lack of developers. If your company had put even 10% of what it 
would pay for Oracle or MSSQL licensing back into the community (either 
via direct employee involvement or by funding development) then auditing 
could have moved a lot faster than it did.


It would also help if the community better publicized ways that 
companies could give back.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Ununsed member in printQueryOpt

2016-02-02 Thread Tom Lane
"Dickson S. Guedes"  writes:
> I found the following code in src/bin/psql/print.h:155 (master 1d0c3b3f8a)
> bool    quote;      /* quote all values as much as possible */
> But I didn't find any other references to that "quote" and, after
> removing that line,
> the code still compiles without any error/warning.

Yeah, you're right, this seems to be dead code.  Removed.
Thanks for noticing!

regards, tom lane




Re: [HACKERS] a raft of parallelism-related bug fixes

2016-02-02 Thread Robert Haas
On Mon, Oct 19, 2015 at 12:02 PM, Robert Haas  wrote:
> On Sat, Oct 17, 2015 at 9:16 PM, Andrew Dunstan  wrote:
>> If all that is required is a #define, like CLOBBER_CACHE_ALWAYS, then no
>> special buildfarm support is required - you would just add that to the
>> animal's config file, more or less like this:
>>
>>  config_env =>
>>  {
>>  CPPFLAGS => '-DGRATUITOUSLY_PARALLEL',
>>  },
>>
>> I try to make things easy :-)
>
> Wow, that's great.  So, I'll try to rework the test code I posted
> previously into something less hacky, and eventually add a #define
> like this so we can run it on the buildfarm.  There's a few other
> things that need to get done before that really makes sense - like
> getting the rest of the bug fix patches committed - otherwise any
> buildfarm critters we add will just be permanently red.

OK, so after a bit more delay than I would have liked, I now have a
working set of patches that we can use to ensure automated testing of
the parallel mode infrastructure.  I ended up doing something that
does not require a #define, so I'll need some guidance on what to do
on the BF side given that context.  Please find attached three
patches, two of them for commit.

group-locking-v1.patch is a vastly improved version of the group
locking patch that we discussed, uh, extensively last year.  I realize
that there was a lot of doubt about this approach, but I still believe
it's the right approach, I have put a lot of work into making it work
correctly, I don't think anyone has come up with a really plausible
alternative approach (except one other approach I tried which turned
out to work but with significantly more restrictions), and I'm
committed to fixing it in whatever way is necessary if it turns out to
be broken, even if that amounts to a full rewrite.  Review is welcome,
but I honestly believe it's a good idea to get this into the tree
sooner rather than later at this point, because automated regression
testing falls to pieces without these changes, and I believe that
automated regression testing is a really good idea to shake out
whatever bugs we may have in the parallel query stuff.  The code in
this patch is all mine, but Amit Kapila deserves credit as co-author
for doing a lot of prototyping (that ended up getting tossed) and
testing.  This patch includes comments and an addition to
src/backend/storage/lmgr/README which explain in more detail what this
patch does, how it does it, and why that's OK.

force-parallel-mode-v1.patch is what adds the actual infrastructure
for automated testing.  You can set force_parallel_mode=on to force
queries to be run in a worker whenever possible; this can help test
whether your user-defined functions have been erroneously labeled as
PARALLEL SAFE.  If they error out or misbehave with this setting
enabled, you should label them PARALLEL RESTRICTED or PARALLEL UNSAFE.
If you set force_parallel_mode=regress, then some additional changes
intended specifically for regression testing kick in; those changes
are intended to ensure that you get exactly the same output from
running the regression tests with the parallelism infrastructure
forcibly enabled that you would have gotten anyway.  Most of this code
is mine, but there are also contributions from Amit Kapila and Rushabh
Lathia.
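
To make that labeling concrete, correcting a mislabeled function looks
something like this (the function names below are invented for illustration;
the PARALLEL markings are the ones referred to above):

  -- A function found to misbehave when pushed into a worker gets pulled back
  -- to the leader, or marked unsafe outright.
  ALTER FUNCTION my_session_state_fn() PARALLEL RESTRICTED;
  ALTER FUNCTION my_table_writing_fn() PARALLEL UNSAFE;

  -- A pure function can be declared safe so parallel plans stay available.
  CREATE FUNCTION square(x int) RETURNS int
      LANGUAGE sql IMMUTABLE PARALLEL SAFE
      AS 'SELECT x * x';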

With both of these patches, you can create a file that says:

force_parallel_mode=regress
max_parallel_degree=2

Then you can run: make check-world TEMP_CONFIG=/path/to/aforementioned/file

If you do, you'll find that while the core regression tests pass
(whee!) the pg_upgrade regression tests fail (oops) because of a
pre-existing bug in the parallelism code introduced by neither of
these two patches.  I'm not exactly sure how to fix that bug yet - I
have a couple of ideas - but I think the fact that this test code
promptly found a bug is good sign that it provides enough test
coverage to be useful.  Sticking a Gather node on top of every query
where it looks safe just turns out to exercise a lot of things: the
code that decides whether it's safe to put that Gather node, the code
to launch and manage parallel workers, the code those workers
themselves run, etc.  The point is just to force as much of the
parallel code to be used as possible even when it's not expected to
make anything faster.

test-group-locking-v1.patch is useful for testing possible deadlock
scenarios with the group locking patch.  It's not otherwise safe to
use this, like, at all, and the patch is not proposed for commit.
This patch is entirely by Amit Kapila.

In addition to what's in these patches, I'd like to add a new chapter
to the documentation explaining which queries can be parallelized and
in what ways, what the restrictions are that keep parallel query from
getting used, and some high-level details of how parallelism "works"
in PostgreSQL from a user perspective.  Things will obviously change
here as we get more capabilities, but I think 

Re: [HACKERS] [POC] FETCH limited by bytes.

2016-02-02 Thread Robert Haas
On Wed, Jan 27, 2016 at 11:19 PM, Corey Huinker  wrote:
> The possible tests might be:
> - creating a server/table with fetch_size set
> - altering an existing server/table to have fetch_size set
> - verifying that the value exists in pg_foreign_server.srvoptions and
> pg_foreign_table.ftoptions.
> - attempting to set fetch_size to a non-integer value

I'd add a test that does one of the first two and skip the others.
I'm not wedded to that exact thing; that's just a suggestion.
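
Roughly, a single test along these lines would cover it (object names
invented; the fetch_size option is the one this patch adds, so treat this as
a sketch rather than a finished regression test):

  -- Set fetch_size at server creation, change it, then check the catalog.
  CREATE SERVER fetch_sz_srv FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (fetch_size '101');
  ALTER SERVER fetch_sz_srv OPTIONS (SET fetch_size '202');
  SELECT srvoptions FROM pg_foreign_server WHERE srvname = 'fetch_sz_srv';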

> Enjoy.

I'd enjoy it more if, heh heh, it compiled.

postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
foreach(lc, server->options)
^
../../src/include/nodes/pg_list.h:153:26: note: expanded from macro 'foreach'
for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
^
1 error generated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Michael Banck
On Tue, Feb 02, 2016 at 08:28:54AM -0800, Joshua D. Drake wrote:
> On 02/02/2016 07:31 AM, curtis.r...@gmail.com wrote:
> >It's not available in rpm or premade binary forms in its current
> >instantiation, not a big deal for me, but it raises the barrier to
> >entry.
> >
> >If it was packaged into an RPM, what would be the process to get it
> >added to postgresql's yum repositories?
> 
> Assuming it is not placed into core, we would need to work with the
> deb and rpm projects. Which I am sure we could (and would) do.

We are looking into packaging pgaudit for Debian.

However, then another question comes up: Should the 2nd Quadrant or the
Crunchy Data codebase be added to the distribution? Who gets to decide?

Alternatively, both could be added, but that will likely confuse users.


Michael

-- 
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax:  +49 (2161) 4643-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 5:16 PM, David Steele  wrote:
> This sort of confusion is one of the main reasons I pursued inclusion in
> core.

But that's exactly wrong.  When there is not agreement on one code
base over another, that's the time it is most important not to pick
one of them and arbitrarily privilege it over the others.  The *only* time
it's appropriate to move something that could just as well be an
extension into core is when (1) we think it's highly likely that users
will want that particular thing rather than some other thing and (2)
we think it's worth the burden of maintaining that code forever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 10:39 AM, Catalin Iacob  wrote:
> On Tue, Feb 2, 2016 at 3:48 PM, Pavel Stehule  wrote:
>> If we decided to break compatibility, then we have to do it explicitly. This
>> discussion can continue with the committer, but sending a patch with duplicately
>> defined functions makes no sense to me. Currently I am out of office, so I cannot
>> clean it up. 4.2 I should be able to work as usual.
>
> I think I didn't make myself clear so I'll try again.
>
> There are 2 options:
>
> 1. keep info etc. unchanged and add raise_info etc.
> 2. change behaviour of info etc. and of course don't add raise_*
>
> You already implemented 1. I said I think 2. is better and worth the
> compatibility break in my opinion. But the committer decides not me.
>
> Since 1. is already done, what I propose is: let's finish the other
> aspects of the patch (incorporate my docs updates and details in Error
> instead of SPIError) then I'll mark this ready for committer and
> summarize the discussion. I will say there that option 1. was
> implemented since it's backward compatible but that if the committer
> thinks option 2. is better we can change the patch to implement option
> 2. If you do the work for 2 now, the committer might still say "I want
> 1" and then you need to do more work again to go back to 1. Easier to
> just stay with 1 for now until we have committer input.

The eventual committer is likely to be much happier with this patch if
you guys have achieved consensus among yourselves on the best
approach.

(Disclaimer: The eventual committer won't be me.  I'm not a Python
guy.  But we try to proceed by consensus rather than committer diktat
around here, when we can.  Obviously the committer has the final say
at some level, but it's better if that power doesn't need to be
exercised too often.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread David Steele
On 2/2/16 4:50 PM, Michael Banck wrote:
> 
> We are looking into packaging pgaudit for Debian.
> 
> However, then another question comes up: Should the 2nd Quadrant or the
> Crunchy Data codebase be added to the distribution? Who gets to decide?

For my 2 cents I think that the version I submitted recently is better
for PostgreSQL >= 9.5:

https://github.com/pgaudit/pgaudit

For this version I started with 2ndQuadrant's excellent work, then spent
months refactoring, testing, and adding additional configuration options.
In addition, a number of reviewers found issues which were fixed and in
general it went through a trial by fire.  This version includes more
comprehensive documentation and extensive regression tests.

However, the original 2ndQuadrant version will happily work with
PostgreSQL 9.3 or 9.4:

https://github.com/2ndQuadrant/pgaudit

> Alternatively, both could be added, but that will likely confuse users.

This sort of confusion is one of the main reasons I pursued inclusion in
core.

-- 
-David
da...@pgmasters.net





[HACKERS] Why is hstore_to_json_loose not like hstore_to_jsonb_loose?

2016-02-02 Thread Tom Lane
hstore_to_json_loose() contains a heuristic that says that an hstore value
that looks like a JSON number should be treated as a number (and hence not
quoted).  That logic has an oversight in it, as per bug #13906, but it's
straightforward to fix.
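
For context, a quick illustration of what the heuristic decides (requires the
hstore extension; the exact edge-case behavior around bug #13906 is the point
under discussion, so the output shown is only the rough shape):

  -- Values that merely look numeric come out as unquoted JSON numbers in the
  -- "loose" variant; everything else stays a quoted string.
  CREATE EXTENSION IF NOT EXISTS hstore;
  SELECT hstore_to_json_loose('a=>1, b=>2.5, c=>abc'::hstore);
  -- roughly: {"a": 1, "b": 2.5, "c": "abc"}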

However, I noticed that hstore_to_jsonb_loose() has completely different
(and much uglier) code to make the identical decision.  Is there a good
reason for that?  It seems to have avoided the bug in IsValidJsonNumber,
but looking at it doesn't fill me with confidence that it has no other
bugs, and anyway it's not clear why these two functions should have
different edge-case behaviors about what is a number and what is not.
So I am very sorely tempted to replace hstore_to_jsonb_loose()'s heuristic
with IsValidJsonNumber().

Comments, objections?

regards, tom lane




Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-02 Thread Robert Haas
On Mon, Feb 1, 2016 at 9:28 AM, Fujii Masao  wrote:
> So what about the following plan?
>
> [first version]
> Add only synchronous_standby_num which specifies the number of standbys
> that the master must wait for before marking sync replication as completed.
> This version supports simple use cases like "I want to have two synchronous
> standbys".
>
> [second version]
> Add synchronous_replication_method: 'priority' and 'quorum'. This version
> additionally supports simple quorum commit case like "I want to ensure
> that WAL is replicated synchronously to at least two standbys from five
> ones listed in s_s_names".
>
> Or
>
> Add something like quorum_replication_num and quorum_standby_names, i.e.,
> the master must wait for at least q_r_num standbys from ones listed in
> q_s_names before marking sync replication as completed. Also the master
> must wait for sync replication according to s_s_num and s_s_names.
> That is, this approach separates 'priority' and 'quorum' to each parameters.
> This increases the number of GUC parameters, but ISTM less confusing, and
> it supports a bit complicated case like "there is one local standby and three
> remote standbys, then I want to ensure that WAL is replicated synchronously
> to the local standby and at least two remote ones", e.g.,
>
>   s_s_num = 1, s_s_names = 'local'
>   q_s_num = 2, q_s_names = 'remote1, remote2, remote3'
>
> [third version]
> Add the hooks for more complicated sync replication cases.

-1.  We're wrapping ourselves around the axle here and ending up with
a design that will not let someone say "the local standby and at least
one remote standby" without writing C code.  I understand nobody likes
the mini-language I proposed and nobody likes a JSON configuration
file either.  I also understand that either of those things would
allow ridiculously complicated configurations that nobody will ever
need in the real world.  But I think "one local and one remote" is a
fairly common case and that you shouldn't need a PhD in
PostgreSQLology to configure it.

Also, to be frank, I think we ought to be putting more effort into
another patch in this same area, specifically Thomas Munro's causal
reads patch.  I think a lot of people today are trying to use
synchronous replication to build load-balancing clusters and avoid the
problem where you write some data and then read back stale data from a
standby server.  Of course, our current synchronous replication
facilities make no such guarantees - his patch does, and I think
that's pretty important.  I'm not saying that we shouldn't do this
too, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread curtis . ruck
It's not available in rpm or premade binary forms in its current instantiation,
not a big deal for me, but it raises the barrier to entry.

If it was packaged into an RPM, what would be the process to get it added to 
postgresql's yum repositories?

February 2 2016 10:24 AM, "Joshua D. Drake"  wrote:
> On 02/02/2016 02:47 AM, José Luis Tallón wrote:
> 
>> On 02/02/2016 02:05 AM, Curtis Ruck wrote:
>>> [snip]
>>> 
>>> P.S., do you know what sucks, having a highly performant PostGIS
>>> database that works great, and being told to move to Oracle or SQL
>>> Server (because they have auditing). Even though they charge extra
>>> for Geospatial support (seriously?) or when they don't even have
>>> geospatial support (10 years ago). My customer would prefer to
>>> re-engineer software designed around PostgreSQL and pay the overpriced
>>> licenses, than not have auditing. I agree that their cost analysis is
>>> probably way off, even 10 years later, my only solution would be to
>>> move to Oracle, SQL Server, a NoSQL solution, or pay EnterpriseDB for
>>> their 2 year old version that doesn't have all the cool/modern jsonb
>>> support.
> 
> PostgreSQL has auditing. It is available now, just not in core. Postgis isn't 
> available in core
> either and it seems to do just fine.
> 
> JD
> 
> -- Command Prompt, Inc. http://the.postgres.company
> +1-503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Alvaro Herrera
Robert Haas wrote:

> The eventual committer is likely to be much happier with this patch if
> you guys have achieved consensus among yourselves on the best
> approach.
> 
> (Disclaimer: The eventual committer won't be me.  I'm not a Python
> guy.  But we try to proceed by consensus rather than committer-dictat
> around here, when we can.  Obviously the committer has the final say
> at some level, but it's better if that power doesn't need to be
> exercised too often.)

Actually I imagine that if there's no agreement between author and first
reviewer, there might not *be* a committer in the first place.  Perhaps
try to get someone else to think about it and make a decision.  It is
possible that some other committer is able to decide by themselves but I
wouldn't count on it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [POC] FETCH limited by bytes.

2016-02-02 Thread Corey Huinker
>
>
> postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
> foreach(lc, server->options)
>

Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on
that with the tests suggested.



> ^
> ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro
> 'foreach'
> for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
> ^


Didn't modify this file on this or any other work of mine. Possible garbage
from a git pull. Will investigate.


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 6:04 AM, Alvaro Herrera  wrote:
> Andreas Joseph Krogh wrote:
>> Which seems to indicate it has received a fair amount of testing and is quite
>> stable.
>> Hopefully it integrates into the 9.6 codebase without too much risk.
>
> Yes, yes, that's all very good, but we're nearing the closure of the 9.6
> development cycle and we only have one commitfest left.  If someone had
> lots of community brownie points because of doing lots of reviews of
> other people's patches, they might push their luck by posting this patch
> to the final commitfest.  But if that someone didn't, then it wouldn't
> be fair, and if I were the commitfest manager of that commitfest I would
> boot their patch to the 9.7-First commitfest.

Completely agreed.  Also, to be frank, these text search patches are
often in need of quite a LOT of work per line compared to some others.
For example, consider the percentage of this patch that is comments.
It's a pretty low percentage.  And it's going into surrounding code
that is also very low in comments.  Giving this a good review will
take somebody a lot of time, and bringing it up to PostgreSQL's normal
standards will probably take quite a lot of work.  I don't understand
why the community should agree to do that for anyone, and as you say,
it's not like PostgresPro is leading the pack in terms of review
contributions.  We're not going to get this release out on time if
everybody insists that every patch has to go in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [POC] FETCH limited by bytes.

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 5:57 PM, Corey Huinker  wrote:
>> postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
>> foreach(lc, server->options)
>
>
> Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on
> that with the tests suggested.

I don't see how.  There really is no declaration in there for a
variable called server.

>> ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro
>> 'foreach'
>> for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
>> ^
>
> Didn't modify this file on this or any other work of mine. Possible garbage
> from a git pull. Will investigate.

This is context information for the same error, not a separate problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-02 Thread Fujii Masao
On Wed, Feb 3, 2016 at 7:33 AM, Robert Haas  wrote:
> On Mon, Feb 1, 2016 at 9:28 AM, Fujii Masao  wrote:
>> So what about the following plan?
>>
>> [first version]
>> Add only synchronous_standby_num which specifies the number of standbys
>> that the master must wait for before marking sync replication as completed.
>> This version supports simple use cases like "I want to have two synchronous
>> standbys".
>>
>> [second version]
>> Add synchronous_replication_method: 'priority' and 'quorum'. This version
>> additionally supports simple quorum commit case like "I want to ensure
>> that WAL is replicated synchronously to at least two standbys from five
>> ones listed in s_s_names".
>>
>> Or
>>
>> Add something like quorum_replication_num and quorum_standby_names, i.e.,
>> the master must wait for at least q_r_num standbys from ones listed in
>> q_s_names before marking sync replication as completed. Also the master
>> must wait for sync replication according to s_s_num and s_s_names.
>> That is, this approach separates 'priority' and 'quorum' to each parameters.
>> This increases the number of GUC parameters, but ISTM less confusing, and
>> it supports a bit complicated case like "there is one local standby and three
>> remote standbys, then I want to ensure that WAL is replicated synchronously
>> to the local standby and at least two remote ones", e.g.,
>>
>>   s_s_num = 1, s_s_names = 'local'
>>   q_s_num = 2, q_s_names = 'remote1, remote2, remote3'
>>
>> [third version]
>> Add the hooks for more complicated sync replication cases.
>
> -1.  We're wrapping ourselves around the axle here and ending up with
> a design that will not let someone say "the local standby and at least
> one remote standby" without writing C code.  I understand nobody likes
> the mini-language I proposed and nobody likes a JSON configuration
> file either.  I also understand that either of those things would
> allow ridiculously complicated configurations that nobody will ever
> need in the real world.  But I think "one local and one remote" is a
> fairly common case and that you shouldn't need a PhD in
> PostgreSQLology to configure it.

So you disagree with only the third version that I proposed, i.e.,
adding some hooks for sync replication? If yes, and you're OK
with the first and second versions, ISTM that we have almost reached
consensus on the direction of the multiple sync replication feature.
The first version can cover the "one local and one remote sync standby" case,
and the second can cover the "one local and at least one from several remote
standbys" case. I'm thinking to focus on the first version now,
and then we can work on the second to support the quorum commit

Regards,

-- 
Fujii Masao




Re: [HACKERS] Integer overflow in timestamp_part()

2016-02-02 Thread Tom Lane
[ Please use a useful Subject: line in your posts. ]

Vitaly Burovoy  writes:
> I've just found a little bug: extracting "epoch" from the last 30
> years before Postgres' "+Infinity" leads to an integer overflow:
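
(For readers without the original report at hand, the complaint concerns a
query of roughly this shape, with a year in the last ~30 years below the
maximum timestamp; the boundary value is per the report, not re-verified here:)

  -- Near the top of the timestamp range, the int64 arithmetic behind
  -- extract(epoch ...) reportedly overflows.
  SELECT extract(epoch FROM timestamptz '294270-01-01 00:00:00+00');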

Hmm.  I do not like the proposed patch much: it looks like it's
throwing away precision too soon, although given that the result of
SetEpochTimestamp can be cast to float exactly, maybe it doesn't matter.

More importantly, I seriously doubt that this is the only issue
for timestamps very close to the INT64_MAX boundary.  An example is
that we're not rejecting values that would correspond to DT_NOBEGIN
or DT_NOEND:

regression=# set timezone = 'PST8PDT';
SET
regression=# select '294277-01-08 20:00:54.775806-08'::timestamptz;
           timestamptz           
---------------------------------
 294277-01-08 20:00:54.775806-08
(1 row)

regression=# select '294277-01-08 20:00:54.775807-08'::timestamptz;
 timestamptz 
-------------
 infinity
(1 row)

regression=# select '294277-01-08 20:00:54.775808-08'::timestamptz;
 timestamptz 
-------------
 -infinity
(1 row)

regression=# select '294277-01-08 20:00:54.775809-08'::timestamptz;
ERROR:  timestamp out of range

Worse yet, that last error is coming from timestamptz_out, not
timestamptz_in; we're accepting a value we cannot store properly.
The converted value has actually overflowed to be equal to
INT64_MIN+1, and then timestamptz_out barfs because it's before
Julian day 0.  Other operations would incorrectly interpret it
as a date in the very far past.  timestamptz_in doesn't throw an
error until several hours later than this; it looks like the
problem is that tm2timestamp() worries about overflow in initially
calculating the converted value, but not about overflow in the
dt2local() rotation, and in any case it doesn't worry about not
producing DT_NOEND.

I'm inclined to think that a good solution would be to create an
artificial restriction to not accept years beyond, say, 10 AD.
That would leave us with a lot of daylight to not have to worry
about corner-case overflows in timestamp arithmetic.  I'm not sure
though where we'd need to enforce such a restriction; certainly in
timestamp[tz]_in, but where else?

regards, tom lane




Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Noah Misch
On Tue, Feb 02, 2016 at 12:24:50PM +0100, Andres Freund wrote:
> On 2016-02-01 23:16:16 -0500, Noah Misch wrote:
> > On Tue, Feb 02, 2016 at 01:13:20AM +0100, Andres Freund wrote:
> > > I'm not sure what'd actually be a good upper limit. I'd be inclined to
> > > even go to as high as a week or so. A lot of our settings have
> > > upper/lower limits that aren't a good idea in general.
> > 
> > In general, I favor having limits reflect fundamental system limitations
> > rather than paternalism.  Therefore, I would allow INT_MAX (68 years).
> 
> I generally agree with that attitude - I'm disinclined to go just that
> high though. Going close to INT_MAX means having to care about overflow
> in trivial computations, in a scenario we're unlikely to ever
> test. Sure, we can use a debugger to adjust time or accelerate time
> progress, but that's all unrealistic if we're honest.  So maybe go with
> a year?

Okay.




Re: [HACKERS] Integer overflow in timestamp_part()

2016-02-02 Thread Jim Nasby

On 2/2/16 6:39 PM, Tom Lane wrote:

I'm inclined to think that a good solution would be to create an
artificial restriction to not accept years beyond, say, 10 AD.
That would leave us with a lot of daylight to not have to worry
about corner-case overflows in timestamp arithmetic.  I'm not sure
though where we'd need to enforce such a restriction; certainly in
timestamp[tz]_in, but where else?


Probably some of the casts (I'd think at least timestamp->timestamptz). 
Maybe timestamp[tz]_recv. Most of the time*pl* functions. :/

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Jim Nasby

On 2/2/16 7:25 PM, Curtis Ruck wrote:

I'm open to testing and evaluating to see if it meets our compliance
requirements, but I am nowhere close to being a C developer, or to having
C developers who could actually provide a meaningful review.  One issue
already pops up along this thread, concerning client_min_messages,
and how other patches in the pipeline for 9.6 would be required to
enable the auditing to meet compliance requirements.


There's other ways you can help besides reviewing. Providing real-world 
use cases helps. Even better is maintaining things on the wiki that 
assist with moving things forward (use cases, discussion/decision 
highlights, really anything that helps move the discussion).



It just seems, after reading the mailing list history, that there is a
lack of interest from people with commit authority, even though there is
decent interest in it from the community. Honestly, no one really
likes auditing, but it's one of those damned-if-you-do (in performance)
and damned-if-you-don't (in legal) things.


Yeah, no one that's volunteering time (as opposed to being paid to work 
on PG) is going to pick up something as unsexy and painful to deal with 
as auditing.



Additionally Robert, given your professional status, you are by no means
an unbiased contributor in this discussion.  Your stance on this matter
shows that you don't necessarily want the open source solution to
succeed in the commercial/compliance required space.  Instead of arguing


I'm sorry, but that's just ridiculous, and I completely agree with 
Robert's initial sentiment: there needs to be a damn good reason for the 
community to pick one specific implementation of something when there 
are competing solutions.



blankly against inclusion, can you at least provide actionable feedback
that, if met, would allow patches of this magnitude in?


It works just like any other patch does: the community has to come to a 
*consensus* that not only is the feature desired and well designed, but 
that the implementation is high quality. I haven't followed the auditing 
discussions closely, but it seems that there are still questions around 
how the feature should work.



I'm personally fine with fiscally rewarding organizations that assist my
customer in succeeding, but it's hard to convince my customer to fund
open source, even though they wouldn't be able to do 75% of what they do
without it.  Based on past experience this is the same problem most open
source organizations face, especially when they don't have the marketing
engine that the large commercial players have.


I really don't understand that, given what most of the alternative 
solutions cost. If they balk at putting money towards developing 
Postgres they really need to get a quote for running the same amount of 
MSSQL (let alone Oracle, which is even more expensive).


I do think the community could do a better job of at least encouraging 
companies to fund development. Unfortunately there's always going to be 
some amount of friction here though, because of the question of how to 
allocate funds to the different companies that are involved. Another 
problem is no commercial company can actually guarantee anything will 
make it into community Postgres, and it's very difficult to even 
estimate the amount of effort (read as: what to charge) for getting a 
feature committed.


Commercial development is certainly possible though. 2nd Quadrant was 
able to raise a good amount of money to fund the development of hot 
standby. IIRC that was before sites like kickstarter existed too, so it 
would probably be even easier to do today.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] WAL Re-Writes

2016-02-02 Thread Amit Kapila
On Mon, Feb 1, 2016 at 8:05 PM, Jim Nasby  wrote:

> On 1/31/16 3:26 PM, Jan Wieck wrote:
>
>> On 01/27/2016 08:30 AM, Amit Kapila wrote:
>>
>>> operation.  Now why OS couldn't find the corresponding block in
>>> memory is that, while closing the WAL file, we use
>>> POSIX_FADV_DONTNEED if wal_level is less than 'archive' which
>>> lead to this problem.  So with this experiment, the conclusion is that
>>> though we can avoid re-write of WAL data by doing exact writes, but
>>> it could lead to significant reduction in TPS.
>>>
>>
>> POSIX_FADV_DONTNEED isn't the only way how those blocks would vanish
>> from OS buffers. If I am not mistaken we recycle WAL segments in a round
>> robin fashion. In a properly configured system, where the reason for a
>> checkpoint is usually "time" rather than "xlog", a recycled WAL file
>> written to had been closed and not touched for about a complete
>> checkpoint_timeout or longer. You must have a really big amount of spare
>> RAM in the machine to still find those blocks in memory. Basically we
>> are talking about the active portion of your database, shared buffers,
>> the sum of all process local memory and the complete pg_xlog directory
>> content fitting into RAM.
>>
>

I think that could only be a problem if reads were happening at the write or
fsync call, but that is not the case here.  Further investigation on this
point reveals that the reads are not for the fsync operation; rather they
happen when we call posix_fadvise(,,POSIX_FADV_DONTNEED).
Although this behaviour (writing in non-OS-page-cache-size chunks can
lead to reads if followed by a call to posix_fadvise
(,,POSIX_FADV_DONTNEED)) is not very clearly documented, the
reason for it is that the fadvise() call maps the specified data range
(which in our case is the whole file) onto its list of pages and then
invalidates them, removing them from the OS cache; any misaligned
(w.r.t. OS page size) writes done while writing/fsyncing the file can
then cause additional reads, because not everything we wrote ends on an
OS page boundary.  This theory is based on the code of fadvise [1] and some
googling [2], which suggests that misaligned writes followed by
POSIX_FADV_DONTNEED can cause this kind of problem.  A colleague of
mine, Dilip Kumar, has verified it by writing a simple program
doing open/write/fsync/fadvise/close as well.


>
> But that's only going to matter when the segment is newly recycled. My
> impression from Amit's email is that the OS was repeatedly reading even in
> the same segment?
>
>
As explained above the reads are only happening during file close.


> Either way, I would think it wouldn't be hard to work around this by
> spewing out a bunch of zeros to the OS in advance of where we actually need
> to write, preventing the need for reading back from disk.
>
>
I think we can simply prohibit setting wal_chunk_size to a value other
than the OS page size or XLOG_BLCKSZ (whichever is smaller) if the
wal_level is less than archive.  That avoids the problem of extra
reads for misaligned writes, as we won't call fadvise().

We could even choose to always write on an OS page boundary
or in XLOG_BLCKSZ chunks (whichever is smaller), since in many cases
the OS page size is 4K, which can also save significant
re-writes.



> Amit, did you do performance testing with archiving enabled an a no-op
> archive_command?
>

No, but what kind of advantage are you expecting from such
tests?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

2016-02-02 Thread Noah Misch
On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:
> On 12/22/2015 03:49 PM, Noah Misch wrote:
> >On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:
> >>I have pushed it now.  Further testing, of course, is always welcome.
> >
> >While investigating stats.sql buildfarm failures, mostly on animals axolotl
> >and shearwater, I found that this patch (commit 187492b) inadvertently 
> >removed
> >the collector's ability to coalesce inquiries.  Every PGSTAT_MTYPE_INQUIRY
> >received now causes one stats file write.  Before, pgstat_recv_inquiry() did:
> >
> > if (msg->inquiry_time > last_statrequest)
> > last_statrequest = msg->inquiry_time;
> >
> >and pgstat_write_statsfile() did:
> >
> > globalStats.stats_timestamp = GetCurrentTimestamp();
> >... (work of writing a stats file) ...
> > last_statwrite = globalStats.stats_timestamp;
> > last_statrequest = last_statwrite;
> >
> >If the collector entered pgstat_write_statsfile() with more inquiries waiting
> >in its socket receive buffer, it would ignore them as being too old once it
> >finished the write and resumed message processing.  Commit 187492b converted
> >last_statrequest to a "last_statrequests" list that we wipe after each write.
> 
> So essentially we remove the list of requests, and thus on the next round we
> don't know the timestamp of the last request and write the file again
> unnecessarily. Do I get that right?

Essentially right.  Specifically, for each database, we must remember the
globalStats.stats_timestamp of the most recent write.  It could be okay to
forget the last request timestamp.  (I now doubt I picked the best lines to
quote, above.)

> What if we instead kept the list but marked the requests as 'invalid' so
> that we know the timestamp? In that case we'd be able to do pretty much
> exactly what the original code did (but at per-db granularity).

The most natural translation of the old code would be to add a write_time
field to struct DBWriteRequest.  One can infer "invalid" from write_time and
request_time.  There are other reasonable designs, though.

> We'd have to cleanup the list once in a while not to grow excessively large,
> but something like removing entries older than PGSTAT_STAT_INTERVAL should
> be enough.

Specifically, if you assume the socket delivers messages in the order sent,
you may as well discard entries having write_time at least
PGSTAT_STAT_INTERVAL older than the most recent cutoff_time seen in a
PgStat_MsgInquiry.  That delivery order assumption does not hold in general,
but I expect it's close enough for this purpose.

> >As for how to fix this, the most direct translation of the old logic is to
> >retain last_statrequests entries that could help coalesce inquiries.  I lean
> >toward that for an initial, back-patched fix.
> 
> That seems reasonable and I believe it's pretty much the idea I came up with
> above, right? Depending on how you define "entries that could help coalesce
> inquiries".

Yes.




Re: [HACKERS] Minor code improvements to create_foreignscan_plan/ExecInitForeignScan

2016-02-02 Thread Alvaro Herrera
Etsuro Fujita wrote:

> Done.  Attached is an updated version of the patch.

Pushed, thanks.

I kinda wonder why this struct member has a name that doesn't match the
naming convention in the rest of the struct, and also why it isn't
documented in the comment above the struct definition.  But that's not
this patch's fault.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-02-02 Thread David Steele
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Everything looks good to me.  Ready for a committer to have a look.

The new status of this patch is: Ready for Committer



Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 10:58 PM, Tom Lane  wrote:
> I've gotta go with the "paternalism" side of the argument here.  Suppose
> you configure your system to checkpoint once a year --- what is going to
> happen when the year is up?  Or when you try to shut it down?  You *will*
> regret such a setting.
>
> I don't think we should allow the checkpoint distances to be so large that
> checkpoints don't happen in the normal course of events.  I'd be okay with
> the max being a day, perhaps.

If smart people[1] want to set checkpoint_timeout to a value higher
than 1 day[2], then I think we should let them.

I think what will happen if you set checkpoint_timeout to 1 year is
that you will checkpoint solely based on WAL volume, which does not
seem like a manifestly unreasonable thing to want.  It's true that if
you set BOTH max_wal_size AND checkpoint_timeout to $WAYTOOBIG then
something bad might happen to you, but even such configurations are
actually not totally crazy: for example, you could ingest data into a
temporary PostgreSQL instance and then do logical replication from
there to another cluster for permanent storage.  You don't really need
recovery or shutdown to happen in the lifetime of the cluster, so no
harm, no foul.  Now, you could also set such configuration settings in
a situation where it will not work out well.  But that is true of most
configuration settings.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] like Andres
[2] see original post




Re: [HACKERS] Freeze avoidance of very large table.

2016-02-02 Thread Masahiko Sawada
On Tue, Feb 2, 2016 at 11:42 AM, Masahiko Sawada  wrote:
> On Tue, Feb 2, 2016 at 10:15 AM, Jim Nasby  wrote:
>> On 2/1/16 4:59 PM, Alvaro Herrera wrote:
>>>
>>> Masahiko Sawada wrote:
>>>
 Attached updated version patch.
 Please review it.
>>>
>>>
>>> In pg_upgrade, the "page convert" functionality is there to abstract
>>> rewrites of pages being copied; your patch is circumventing it and
>>> AFAICS it makes the interface more complicated for no good reason.  I
>>> think the real way to do that is to write your rewriteVisibilityMap as a
>>> pageConverter routine.  That should reduce some duplication there.
>>
>
> This means that users always have to set a pageConverter plug-in when upgrading?
> I was thinking that this conversion is required for all users who want
> to upgrade to 9.6, so we should support it in core, not as a plug-in.

I misunderstood. Sorry for the noise.
I agree with adding the conversion method as a pageConverter routine.

This patch doesn't actually change the page layout, but the pageConverter
routine checks only the page layout.
And we would have to provide a plugin named convertLayout_X_to_Y.

I think we have two options.

1. Change page layout (PG_PAGE_LAYOUT_VERSION) to 5. pg_upgrade detects
it and then converts only VM files.
2. Change pg_upgrade plugin mechanism so that it can handle other name
conversion plugins (e.g., convertLayout_vm_to_vfm)

I think #2 is better. Thought?

Regards,

--
Masahiko Sawada




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Oleg Bartunov
On Tue, Feb 2, 2016 at 10:18 AM, Andreas Joseph Krogh wrote:

> On Tuesday, 2 February 2016 at 04:22:57, Michael Paquier <michael.paqu...@gmail.com> wrote:
>
>
>
> On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov wrote:
>>
>> This patch was originally developed by Teodor Sigaev and Oleg Bartunov in
>> 2009, so all credit goes to them. Any feedback is welcome.
>
> This is not a small patch:
> 28 files changed, 2441 insertions(+), 380 deletions(-)
> And the last CF of 9.6 should not contain rather large patches.
> --
> Michael
>
>
>
> OTOH; It would be extremely nice to get this into 9.6.
>

We'll see how the community decides.
Anyway, it's already in our distribution.



>
>
> --
> *Andreas Joseph Krogh*
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Andreas Joseph Krogh
On Tuesday, 2 February 2016 at 09:20:06, Oleg Bartunov <obartu...@gmail.com> wrote:
> On Tue, Feb 2, 2016 at 10:18 AM, Andreas Joseph Krogh wrote:
>> On Tuesday, 2 February 2016 at 04:22:57, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>> On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov wrote:
>>>> This patch was originally developed by Teodor Sigaev and Oleg Bartunov in
>>>> 2009, so all credit goes to them. Any feedback is welcome.
>>>
>>> This is not a small patch:
>>> 28 files changed, 2441 insertions(+), 380 deletions(-)
>>> And the last CF of 9.6 should not contain rather large patches.
>>> --
>>> Michael
>>
>> OTOH, it would be extremely nice to get this into 9.6.
>
> We'll see how the community decides.
> Anyway, it's already in our distribution.

Which seems to indicate it has received a fair amount of testing and is quite
stable.
Hopefully it integrates into the 9.6 codebase without too much risk.
Thanks for contributing this.

--
Andreas Joseph Krogh




Re: [HACKERS] pg_lsn cast to/from int8

2016-02-02 Thread Magnus Hagander
On Tue, Jan 26, 2016 at 4:58 PM, Craig Ringer  wrote:

> On 26 January 2016 at 22:07, Magnus Hagander  wrote:
>
>
>> In this case, mostly legacy compatibility. Making an app that works with
>> versions that don't have pg_lsn have a nice path forward to the modern
>> world. Being able to cast from pg_lsn to int8 can also make it easier to
>> work with the values in the client application, though I don't need that
>> for this particular one.
>>
>>
> Wouldn't we need a uint8 type for that?
>
> I guess we could just show people negative LSNs if the high bit is set
> (that being rather unlikely) but still...
>


Yes, in theory. Though the likelihood of actually reaching that... It would
probably be OK to just throw an error if the high bit is actually set.
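
As an aside, a plain numeric byte offset can already be had today without the
proposed cast, e.g. (assuming 9.5-era function naming; later releases renamed
it pg_wal_lsn_diff):

  -- Diffing against LSN 0/0 yields the location as a numeric byte count,
  -- which sidesteps the int8/uint8 range question for client-side math.
  SELECT pg_xlog_location_diff('16/B374D848'::pg_lsn, '0/0'::pg_lsn) AS lsn_bytes;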

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-02 Thread Masahiko Sawada
On Mon, Feb 1, 2016 at 11:28 PM, Fujii Masao  wrote:
> On Mon, Feb 1, 2016 at 5:36 PM, Masahiko Sawada  wrote:
>> On Sun, Jan 31, 2016 at 8:58 PM, Michael Paquier
>>  wrote:
>>> On Sun, Jan 31, 2016 at 5:28 PM, Masahiko Sawada  
>>> wrote:
 On Sun, Jan 31, 2016 at 5:18 PM, Michael Paquier
  wrote:
> On Sun, Jan 31, 2016 at 5:08 PM, Masahiko Sawada  
> wrote:
>> On Sun, Jan 31, 2016 at 1:17 PM, Michael Paquier
>>  wrote:
>>> On Thu, Jan 28, 2016 at 10:10 PM, Masahiko Sawada wrote:
 By the discussions so far, I'm planning to have several replication
 methods such as 'quorum', 'complex' in the feature, and the each
 replication method specifies the syntax of s_s_names.
 It means that s_s_names could have the number of sync standbys like
 what current patch does.
>>>
>>> What if the application_name of a standby node has the format of an 
>>> integer?
>>
>> Even if the standby has an integer as application_name, we can set
>> s_s_names like '2,1,2,3'.
>> The leading '2' is always handled as the number of sync standbys when
>> s_r_method = 'priority'.
>
> Hm. I agree with Fujii-san here, having the number of sync standbys
> defined in a parameter that should have a list of names is a bit
> confusing. I'd rather have a separate GUC, which brings us back to one
> of the first patches that I came up with, and a couple of people,
> including Josh were not happy with that because this did not support
> real quorum. Perhaps the final answer would be really to get a set of
> hooks, and a contrib module making use of that.

 Yeah, I agree with having set of hooks, and postgres core has simple
 multi sync replication mechanism like you suggested at first version.
>>>
>>> If there are hooks, I don't think that we should really bother about
>>> having in core anything more complicated than what we have now. The
>>> trick will be to come up with a hook design modular enough to support
>>> the kind of configurations mentioned on this thread. Roughly perhaps a
>>> refactoring of the syncrep code so as it is possible to wait for
>>> multiple targets some of them being optional,, one modular way in
>>> pg_stat_get_wal_senders to represent the status of a node to user, and
>>> another hook to return to decide which are the nodes to wait for. Some
>>> of the nodes being waited for may be based on conditions for quorum
>>> support. That's a hard problem to do that in a flexible enough way.
>>
>> Hm, I think not-nested quorum and priority are not complicated, and we
>> should support at least both or either simple method in core of
>> postgres.
>> More complicated method like using json-style, or dedicated language
>> would be supported by external module.
>
> So what about the following plan?
>
> [first version]
> Add only synchronous_standby_num which specifies the number of standbys
> that the master must wait for before marking sync replication as completed.
> This version supports simple use cases like "I want to have two synchronous
> standbys".
>
> [second version]
> Add synchronous_replication_method: 'prioriry' and 'quorum'. This version
> additionally supports simple quorum commit case like "I want to ensure
> that WAL is replicated synchronously to at least two standbys from five
> ones listed in s_s_names".
>
> Or
>
> Add something like quorum_replication_num and quorum_standby_names, i.e.,
> the master must wait for at least q_r_num standbys from ones listed in
> q_s_names before marking sync replication as completed. Also the master
> must wait for sync replication according to s_s_num and s_s_num.
> That is, this approach separates 'priority' and 'quorum' to each parameters.
> This increases the number of GUC parameters, but ISTM less confusing, and
> it supports a bit complicated case like "there is one local standby and three
> remote standbys, then I want to ensure that WAL is replicated synchronously
> to the local standby and at least two remote one", e.g.,
>
>   s_s_num = 1, s_s_names = 'local'
>   q_s_num = 2, q_s_names = 'remote1, remote2, remote3'
>
> [third version]
> Add the hooks for more complicated sync replication cases.
>
> I'm thinking that the realistic target for 9.6 might be the first one.
>

Thank you for suggestion.

I agree with the first version, and attached is an updated patch, modified
so that it supports the simple multiple-sync-replication approach you
suggested.
(But test cases are not included yet.)
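
For illustration only, the first-version behaviour would presumably be
configured along these lines (the GUC name is the one proposed earlier in
this thread; nothing here is committed, so treat it as a sketch):

# hypothetical postgresql.conf excerpt for the "first version" above
synchronous_standby_num = 2
synchronous_standby_names = 'standby1, standby2, standby3'
# a commit then waits until at least two of the three listed standbys
# have acknowledged the WAL flush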

Regards,

--
Masahiko Sawada
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 7f85b88..9a2f7e7 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -29,10 +29,10 @@
  * single ordered queue of waiting backends, 

Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Pavel Stehule
Dne 2. 2. 2016 7:30 napsal uživatel "Catalin Iacob" :
>
> On Mon, Feb 1, 2016 at 5:37 PM, Pavel Stehule 
wrote:
> > Dne 29. 1. 2016 18:09 napsal uživatel "Catalin Iacob"
> > :
> >> Looking at the output above, I don't see who would rely on calling
> >> plpy.error with multiple arguments and getting the tuple so I'm
> >> actually in favor of just breaking backward compatibility. Note that
> >> passing multiple arguments isn't even documented. So I would just
> >> change debug, info, error and friends to do what raise_debug,
> >> raise_info, raise_error do. With a single argument behavior stays the
> >> same, with multiple arguments one gets more useful behavior (detail,
> >> hint) instead of the useless tuple. That's my preference but we can
> >> leave the patch with raise and leave the decision to the committer.
> >>
> >
> > if breaking compatibility, then raise* functions are useless, and
should be
> > removed.
>
> Indeed. I think it's better to change the existing functions and break
> compatibility instead of adding the raise_ functions. But the
> committer will decide if that's what should be done. Since you wrote
> the patch with raise_* I propose you keep it that way for now and let
> the committer decide. I wrote the doc patch based on raise_* as well.

If we decide to break compatibility, then we have to do it explicitly. This
discussion can continue with the committer, but sending a patch with
duplicately defined functions makes no sense to me. I am currently out of the
office, so I cannot clean it up; from 4.2 I should be able to work as usual.

>
> Attached is the doc patch (made on top of your patch). I'll wait for
> you to combine them and switch to raising Error and then hopefully
> this is ready for a committer to look at.


Re: [HACKERS] Relation extension scalability

2016-02-02 Thread Robert Haas
On Thu, Jan 28, 2016 at 6:23 AM, Dilip Kumar  wrote:
> [ new patch ]

This patch contains a useless hunk and also code not in PostgreSQL
style.  Get pgindent set up and it will do it correctly for you, or
look at the style of the surrounding code.

What I'm a bit murky about is *why* this should be any better than the
status quo.  I mean, the obvious explanation that pops to mind is that
extending the relation by two pages at a time relieves pressure on the
relation extension lock somehow.  One other possible explanation is
that calling RecordPageWithFreeSpace() allows multiple backends to get
access to that page at the same time, while otherwise each backend
would try to conduct a separate extension.  But in the first case,
what we ought to do is try to make the locking more efficient; and in
the second case, we might want to think about recording the first page
in the free space map too.  I don't think the problem here is that w

Here's a sketch of another approach to this problem.  Get rid of the
relation extension lock.  Instead, have an array of, say, 256 lwlocks.
Each one protects the extension of relations where hash(relfilenode) %
256 maps to that lock.  To extend a relation, grab the corresponding
lwlock, do the work, then release the lwlock.  You might occasionally
have a situation where two relations are both being extended very
quickly and happen to map to the same bucket, but that shouldn't be
much of a problem in practice, and the way we're doing it presently is
worse, not better, since two relation extension locks may very easily
map to the same lock manager partition.  The only problem with this is
that acquiring an LWLock holds off interrupts, and we don't want
interrupts to be locked out across a potentially lengthy I/O.  We
could partially fix that if we call RESUME_INTERRUPTS() after
acquiring the lock and HOLD_INTERRUPTS() just before releasing it, but
there's still the problem that you might block non-interruptibly while
somebody else has the lock.  I don't see an easy solution to that
problem right off-hand, but if something like this performs well we
can probably conjure up some solution to that problem.
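
To make that sketch concrete, the mapping could look roughly like this (purely
illustrative and untested; the array name and size are invented here, the usual
backend headers are assumed, and a real patch would still have to allocate the
array in shared memory and deal with the interrupt-holdoff issue noted above):

#define N_REL_EXTENSION_LOCKS 256           /* hypothetical */

static LWLockPadded *RelExtensionLockArray; /* allocated in shared memory */

static LWLock *
RelExtensionLock(RelFileNode rnode)
{
    uint32      h;

    /* hash the relfilenode and map it onto one of the fixed lwlocks */
    h = DatumGetUInt32(hash_any((const unsigned char *) &rnode,
                                sizeof(RelFileNode)));
    return &RelExtensionLockArray[h % N_REL_EXTENSION_LOCKS].lock;
}

/*
 * Extension would then be:
 *     LWLockAcquire(RelExtensionLock(rel->rd_node), LW_EXCLUSIVE);
 *     ... add block(s), RecordPageWithFreeSpace(), etc. ...
 *     LWLockRelease(RelExtensionLock(rel->rd_node));
 */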

I'm not saying that we need to do that exact thing - but I am saying
that I don't think we can proceed with an approach like this without
first understanding why it works and whether there's some other way
that might be better to address the underlying problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 9:04 AM, Alvaro Herrera  wrote:
> Robert Haas wrote:
>> Overall, I think this is on the right track, but it still needs some
>> work to make it cleaner.
>
> We've committed a large number of patches in this item this cycle.  I
> think it's fair to mark it as Committed.  Can somebody submit a new one
> to the next commitfest?

I plan to keep working with the patch authors on this patch during the
inter-CF period, but have no objection to it being considered as
Committed for CF purposes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Joshua D. Drake

On 02/02/2016 02:47 AM, José Luis Tallón wrote:

On 02/02/2016 02:05 AM, Curtis Ruck wrote:

[snip]

P.S., do you know what sucks, having a highly performant PostGIS
database that works great, and being told to move to Oracle or SQL
Server (because they have auditing).  Even though they charge extra
for Geospatial support (seriously?) or when they don't even have
geospatial support (10 years ago).  My customer would prefer to
re-engineer software designed around PostgreSQL and pay the overpriced
licenses, than not have auditing.  I agree that their cost analysis is
probably way off, even 10 years later, my only solution would be to
move to Oracle, SQL Server, a NoSQL solution, or pay EnterpriseDB for
their 2 year old version that doesn't have all the cool/modern jsonb
support.


PostgreSQL has auditing. It is available now, just not in core. Postgis 
isn't available in core either and it seems to do just fine.


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Dave Page
On Tue, Feb 2, 2016 at 1:05 AM, Curtis Ruck
 wrote:
> or pay
> EnterpriseDB for their 2 year old version that doesn't have all the
> cool/modern jsonb support.

Just for the record, anyone who pays for our "2 year old version that
doesn't have all the cool/modern jsonb support" is also entitled to
use either our 9.4 or 9.5 versions which do have JSONB support. Our
new versions are typically released within a couple of months of
PostgreSQL, and in the case of 9.5, the gap was more like 3 weeks.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] get current log file

2016-02-02 Thread Armor
Hello,


I see there is a feature about getting the current log file name on the
TODO list (for details please check
http://www.postgresql.org/message-id/pine.gso.4.64.0811101325260.9...@westnet.com).
On our side, we have already implemented this requirement in a ticket for a customer.
If the PG community still needs this feature, it would be a pleasure for
us to contribute it.


--
Jerry Yu

Re: [HACKERS] get current log file

2016-02-02 Thread Alvaro Herrera
Armor wrote:
> Hello,
> 
> 
> I see there is a feature about getting the current log file name on the
> TODO list (for details please check
> http://www.postgresql.org/message-id/pine.gso.4.64.0811101325260.9...@westnet.com).
> On our side, we have already implemented this requirement in a ticket for a customer.
> If the PG community still needs this feature, it would be a pleasure
> for us to contribute it.

Please propose a design and we'll discuss.  There's clearly need for
this feature.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2016-02-02 Thread Kyotaro HORIGUCHI
Hello, thank you, Fabien, Michael.

# Though I have made almost no activity in the last month...

At Tue, 26 Jan 2016 13:53:33 +0100 (CET), Fabien COELHO  
wrote in 
> 
> Hello Kyotaro-san,
> 
> > Thank you very much Michael but the CF app doesn't allow me to
> > regsiter new one. Filling the Description field with "pgbench -
> > allow backslash-continuations in custom scripts" and chose a
> > topic then "Find thread" shows nothing. Filling the search text
> > field on the "Attach thread" dialogue with the description or
> > giving the exact message-id gave me nothing to choose.
> 
> Strange.
> 
> You could try taking the old entry and selecting state "move to next
> CF"?

Hmm. The state of the old entry in CF2015-11 is already "Move*d*
to next CF" and it is not found in CF2016-01, as far as I saw.

At Tue, 26 Jan 2016 22:21:49 +0900, Michael Paquier  
wrote in 
> > Filling the Description field with "pgbench -
> > allow backslash-continuations in custom scripts" and chose a
> > topic then "Find thread" shows nothing. Filling the search text
> > field on the "Attach thread" dialogue with the description or
> > giving the exact message-id gave me nothing to choose.
> 
> Really? That's because the patch is marked as returned with feedback here:
> https://commitfest.postgresql.org/7/319/

Ah, now I have many candidates in the "Attach thread" dialog. That would
have been a temporary symptom of some kind of CF-season-wise
maintenance.

> > Maybe should I repost the patch so that the "Attach thread" can
> > find it as a "recent" email?
> 
> What if you just add it to next CF with a new entry? You are actually
> proposing an entirely new patch.

So, I finally could register an entry for CF2016-3.
Thank you all for the suggestion.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-02-02 Thread Alvaro Herrera
Masahiko Sawada wrote:

> I misunderstood. Sorry for noise.
> I agree with adding conversion method as a pageConverter routine.

\o/

> This patch doesn't change page layout actually, but pageConverter
> routine checks only the page layout.
> And we have to plugin named convertLayout_X_to_Y.
> 
> I think we have two options.
> 
> 1. Change page layout(PG_PAGE_LAYOUT_VERSION) to 5. pg_upgrade detects
> it and then converts only VM files.
> 2. Change pg_upgrade plugin mechanism so that it can handle other name
> conversion plugins (e.g., convertLayout_vm_to_vfm)
> 
> I think #2 is better. Thought?

My vote is for #2 as well.  Maybe we just didn't have forks when this
functionality was invented; maybe the author just didn't think hard
enough about what would be the right interface to do it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2016-02-02 Thread Alexander Korotkov
On Tue, Feb 2, 2016 at 12:43 AM, Andres Freund  wrote:

> On 2016-02-01 13:06:57 +0300, Alexander Korotkov wrote:
> > On Mon, Feb 1, 2016 at 11:34 AM, Alexander Korotkov <
> > a.korot...@postgrespro.ru> wrote:
> > >> Client   Base     Patch
> > >> 1        19744    19382
> > >> 8        125923   126395
> > >> 3231393151
> > >> 64       387339   496830
> > >> 128      306412   350610
> > >>
> > >> Shared Buffer= 512MB
> > >> max_connections=150
> > >> Scale Factor=300
> > >>
> > >> ./pgbench  -j$ -c$ -T300 -M prepared -S postgres
> > >>
> > >> Client   Base     Patch
> > >> 1        17169    16454
> > >> 8        108547   105559
> > >> 32       241619   262818
> > >> 64       206868   233606
> > >> 128      137084   217013
>
> So, there's a small regression on low client counts. That's worth
> addressing.
>

Interesting. I'll try to reproduce it.


> > Attached patch is rebased and have better comments.
> > Also, there is one comment which survive since original version by
> Andres.
> >
> > /* Add exponential backoff? Should seldomly be contended tho. */
> >
> >
> > Andres, did you mean we should twice the delay with each unsuccessful try
> > to lock?
>
> Spinning on a lock as fast as possible leads to rapid cacheline bouncing
> without anybody making progress. See s_lock() in s_lock.c.
>

I didn't notice that s_lock() behaves so. Thank you.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Fix KNN GiST ordering type

2016-02-02 Thread Alexander Korotkov
On Mon, Feb 1, 2016 at 7:31 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> KNN GiST detects which type it should return by returning type of ordering
> operator.
> But it appears that type of sk_func is detected after it was replaced with
> distance function. That is wrong: distance function should always return
> float8.
> I think it is just a typo.
>

I found this was introduced by this commit.
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=821b821a2421beaa58225ff000833df69fb962c5;hp=284bef297733e553c73f1c858e0ce1532f754d18

However, commit message doesn't say anything special about this change.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Simon Riggs
On 2 February 2016 at 05:54, Michael Paquier 
wrote:

> On Tue, Feb 2, 2016 at 1:16 PM, Noah Misch  wrote:
> > On Tue, Feb 02, 2016 at 01:13:20AM +0100, Andres Freund wrote:
> >> is there any reason for the rather arbitrary and low checkpoint_timeout
> >> limit?
> >
> > Not that I know, and it is inconvenient.
> >
> >> I'm not sure what'd actually be a good upper limit. I'd be inclined to
> >> even go to as high as a week or so. A lot of our settings have
> >> upper/lower limits that aren't a good idea in general.
> >
> > In general, I favor having limits reflect fundamental system limitations
> > rather than paternalism.  Therefore, I would allow INT_MAX (68 years).
>
> +1. This way users can play as they wish.
>

If people wish to turn off crash recovery, they can already set fsync=off.
I don't wish to see us support a setting that causes problems for people
that don't understand what checkpoints are and why everybody needs them.

The current code needs to act differently with regard to very low settings,
so that when only a small number of blocks remain we don't spend hours
writing them out. Allowing very large values would make that even more
strange.

I would put a limit of 100,000 seconds = 27 hours.

Some systems offer a recovery_time_objective setting that is used to
control how frequently checkpoints occur. That might be a more usable
approach.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread José Luis Tallón

On 02/02/2016 02:05 AM, Curtis Ruck wrote:

[snip]

P.S., do you know what sucks, having a highly performant PostGIS 
database that works great, and being told to move to Oracle or SQL 
Server (because they have auditing).  Even though they charge extra 
for Geospatial support (seriously?) or when they don't even have 
geospatial support (10 years ago).  My customer would prefer to 
re-engineer software designed around PostgreSQL and pay the overpriced 
licenses, than not have auditing.  I agree that their cost analysis is 
probably way off, even 10 years later, my only solution would be to 
move to Oracle, SQL Server, a NoSQL solution, or pay EnterpriseDB for 
their 2 year old version that doesn't have all the cool/modern jsonb 
support.


Huh?  PPAS 9.5.0.5 has already been out there since at least last week; before
that, PPAS 9.4.5.y or so was available ...

(Not affiliated with EDB, but precision is important)

I agree that auditing is a big selling point and frequently used... But 
it's got to be done "the Postgres way", and that takes time (and usually 
provides superior results).



Just my .02€


/ J.L.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Catalin Iacob
On Tue, Feb 2, 2016 at 3:48 PM, Pavel Stehule  wrote:
> If we decided to break compatibility, then we have to do explicitly. Thid
> discussion can continue with commiter, but send path with duplicitly defined
> functions has not sense for me. Currently I out of office, so I cannot to
> clean it. 4.2 I should to work usually

I think I didn't make myself clear so I'll try again.

There are 2 options:

1. keep info etc. unchanged and add raise_info etc.
2. change behaviour of info etc. and of course don't add raise_*

You already implemented 1. I said I think 2. is better and worth the
compatibility break in my opinion. But the committer decides not me.

Since 1. is already done, what I propose is: let's finish the other
aspects of the patch (incorporate my docs updates and details in Error
instead of SPIError) then I'll mark this ready for committer and
summarize the discussion. I will say there that option 1. was
implemented since it's backward compatible but that if the committer
thinks option 2. is better we can change the patch to implement option
2. If you do the work for 2 now, the committer might still say "I want
1" and then you need to do more work again to go back to 1. Easier to
just stay with 1 for now until we have committer input.
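
For reference, this is roughly what option 2 would look like from a PL/Python
function if it were adopted (illustrative only; the detail/hint keyword names
follow this thread's proposal and are not committed behaviour):

CREATE FUNCTION withdraw(amount numeric) RETURNS void AS $$
if amount < 0:
    # under option 2, plpy.error itself accepts detail/hint instead of
    # returning a tuple of its arguments
    plpy.error("negative amount not allowed",
               detail="amount was %s" % amount,
               hint="check the calling application")
$$ LANGUAGE plpythonu;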


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Magnus Hagander
On Tue, Feb 2, 2016 at 1:35 PM, Alvaro Herrera 
wrote:

> Magnus Hagander wrote:
> > On Tue, Feb 2, 2016 at 2:46 AM, Jim Nasby 
> wrote:
> >
> > > On 2/1/16 6:15 PM, Alvaro Herrera wrote:
> > >
> > >> Jim Nasby wrote:
> > >>
> > >>> It would be nice if the patch summary page (ie, [1]) had links to the
> > >>> relevant entry in that CF. The specific need I see is if you look up
> a
> > >>> patch
> > >>> in the current CF and it's been moved to the next CF you have to
> > >>> manually go
> > >>> to that CF and search for the patch.
> > >>>
> > >>
> > >> Agreed, I could use that.  In the "status" row, each commitfest entry
> > >> (the "2015-11" text) could be a link to that patch in that commitfest.
> > >>
> > >
> > > Yeah, what I was thinking.
> >
> > Just to be clear, you're looking for the ones that are for the
> non-current
> > one? Because you already have a link to the current commitfest in the
> > breadcrumbs at the top of the page. Or am I misunderstanding completely?
>
> I think you are, because yes I don't care to go to the current
> commitfest (I know how to do that) -- but I don't want to the toplevel
> page for the other commitfest either: what I want is the link to go to
> *that patch's* page in the other commitfest.  That's also what I
> think Jim wants.
>
>
Then i did indeed misunderstand.

So from https://commitfest.postgresql.org/9/353/, you'd want links to
/8/353/, /7/353/, /6/353/?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Magnus Hagander
On Tue, Feb 2, 2016 at 4:43 PM, Alvaro Herrera 
wrote:

> Magnus Hagander wrote:
>
> > So from https://commitfest.postgresql.org/9/353/, you'd want links to
> > /8/353/, /7/353/, /6/353/?
>
> Right.
>
>
I'm not entirely sure what I'd use that for myself, but that's trivial to
implement. Thus, done and published.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Freeze avoidance of very large table.

2016-02-02 Thread Masahiko Sawada
On Tue, Feb 2, 2016 at 10:05 PM, Alvaro Herrera
 wrote:
> This patch has gotten its fair share of feedback in this fest.  I moved
> it to the next commitfest.  Please do keep working on it and reviewers
> that have additional comments are welcome.

Thanks!

On Tue, Feb 2, 2016 at 8:59 PM, Kyotaro HORIGUCHI
 wrote:
> Since the destination version is fixed, the advantage of the
> plugin mechanism for pg_upgade would be capability to choose a
> plugin to load according to some characteristics of the source
> database. What do you think the trigger characteristics for
> convertLayoutVM_add_frozenbit.so or similars? If it is hard-coded
> like what transfer_single_new_db is doing for fsm and vm, I
> suppose the module is not necessary to be a plugin.

Sorry, I didn't get that.
Did you mean that we should use rewriteVisibilityMap as a plain function (not
dynamically loaded)?
The destination version is not fixed; it depends on the new cluster version.
I'm planning for convertLayout_add_frozenbit.so to be dynamically
loaded and used only when rewriting of the VM is required.
If the layout of the VM is changed again in the future, we could add
other libraries for that conversion.
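
To make the conversion itself concrete: an old-format visibility map byte holds
eight one-bit all-visible flags, and has to be expanded into two new-format
bytes holding an (all-visible, all-frozen) pair per heap page, with the frozen
bit initially clear. A minimal sketch of that bit expansion (illustrative only,
not the actual pg_upgrade plugin interface, and ignoring page headers):

/*
 * Hypothetical helper: expand one old-format VM byte (8 x 1-bit
 * all-visible flags) into 16 bits of new-format flags, setting the
 * all-visible bit where it was set before and leaving all-frozen clear.
 */
static uint16
expand_vm_byte(uint8 old_byte)
{
    uint16      new_bits = 0;
    int         i;

    for (i = 0; i < 8; i++)
    {
        if (old_byte & (1 << i))
            new_bits |= (uint16) 1 << (2 * i);  /* all-visible only */
    }
    return new_bits;
}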

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Joshua D. Drake

On 02/02/2016 08:13 AM, Michael Banck wrote:

On Tue, Feb 02, 2016 at 07:24:23AM -0800, Joshua D. Drake wrote:

PostgreSQL has auditing. It is available now, just not in core. Postgis
isn't available in core either and it seems to do just fine.


I don't really buy that argument. For one, PostGIS has a pretty narrow
functional use-case (spatial), while auditing is a horizontal use-case
that could be required for any kind of database usage.


The argument was made specifically for the user because they were using 
PostGIS.




Second, PostGIS had 10+ (?) years to build a reputation so that people
say "if I have to choose between PostGIS and buying Oracle Spatial, of
course I choose PostGIS", the pgaudit extension does not have that.


True enough but so what? At some point, someone has to use it. Just 
because it doesn't have 10 years of experience doesn't mean we should 
shove it into core. Those that need it, will use it. My customers (for 
example) use what I tell them to use.



Auditing is a pretty security/enterprisey-related thing that could do
with the "officially considered to be of the PostgreSQL project standard
and ready for production" rubber-stamp that tends to go along with most
end-user/admin-oriented stuff shipped in the tarball.


Which is exactly why I think .Org needs an official "Extensions" project 
which would completely eliminate these arguments. A project team 
explicitly for vetting extensions.




I am aware that
2nd Quadrant, Crunchy Data and EnterpriseDB (different codebase via
PPAS) all support their auditing extensions commercially, so that there
is certainly some form of credibility, but still.


Meh, commercial solutions aren't a consideration here. This is 
PostgreSQL not EDB or Crunchy.


Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 11:13 AM, Michael Banck
 wrote:
> On Tue, Feb 02, 2016 at 07:24:23AM -0800, Joshua D. Drake wrote:
>> PostgreSQL has auditing. It is available now, just not in core. Postgis
>> isn't available in core either and it seems to do just fine.
>
> I don't really buy that argument. For one, PostGIS has a pretty narrow
> functional use-case (spatial), while auditing is a horizontal use-case
> that could be required for any kind of database usage.

If you're saying you think the user base for pgaudit will be larger
than the user base for PostGIS, color me doubtful.

> Now, whether or not the currently submitted approach actually meets the
> above rubber-stamp requirements is a different story, but at least I
> think it would be quite useful to ship auditing capabilites in the
> distribution.

That is a sensible position.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Add generate_series(date, date) and generate_series(date, date, integer)

2016-02-02 Thread David Steele
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, failed

Everything looks good except for two minor issues:

1) There should be an informative comment for this struct:

+/* Corey BEGIN */
+typedef struct
+{
+   DateADT current;
+   DateADT stop;
+   int32   step;
+} generate_series_date_fctx;

2) There's an extra linefeed after the struct.  Needed?
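
(For context, a quick usage sketch of the function under review -- purely
illustrative output, assuming the integer step is a number of days:)

SELECT d FROM generate_series(DATE '2016-02-01', DATE '2016-02-10', 3) AS g(d);
-- 2016-02-01, 2016-02-04, 2016-02-07, 2016-02-10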

Regards,
-David

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Alvaro Herrera
Magnus Hagander wrote:

> So from https://commitfest.postgresql.org/9/353/, you'd want links to
> /8/353/, /7/353/, /6/353/?

Right.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-02-02 Thread Andres Freund
On 2016-02-02 10:12:38 -0500, Robert Haas wrote:
> Here's a sketch of another approach to this problem.  Get rid of the
> relation extension lock.  Instead, have an array of, say, 256 lwlocks.
> Each one protects the extension of relations where hash(relfilenode) %
> 256 maps to that lock.  To extend a relation, grab the corresponding
> lwlock, do the work, then release the lwlock.  You might occasionally
> have a situation where two relations are both being extended very
> quickly and happen to map to the same bucket, but that shouldn't be
> much of a problem in practice, and the way we're doing it presently is
> worse, not better, since two relation extension locks may very easily
> map to the same lock manager partition.

I guess you suspect that the performance problems come from the
heavyweight lock overhead? That's not what I *think* I've seen in
profiles, but it's hard to conclusively judge that.

I kinda doubt that really solves the problem, profiles aside,
though. The above wouldn't really get rid of the extension locks, it
just changes the implementation a bit. We'd still do victim buffer
search, and filesystem operations, while holding an exclusive
lock. Batching can solve some of that, but I think primarily we need
more granular locking, or get rid of locks entirely.


> The only problem with this is that acquiring an LWLock holds off
> interrupts, and we don't want interrupts to be locked out across a
> potentially lengthy I/O.  We could partially fix that if we call
> RESUME_INTERRUPTS() after acquiring the lock and HOLD_INTERRUPTS()
> just before releasing it, but there's still the problem that you might
> block non-interruptibly while somebody else has the lock.  I don't see
> an easy solution to that problem right off-hand, but if something like
> this performs well we can probably conjure up some solution to that
> problem.

Hm. I think to get rid of the HOLD_INTERRUPTS() we'd have to to record
what lock we were waiting on, and in which mode, before going into
PGSemaphoreLock(). Then LWLockReleaseAll() could "hand off" the wakeup
to the next waiter in the queue. Without that we'd sometimes end up with
absorbing a wakeup without then releasing the lock, causing everyone to
block on a released lock.

There's probably two major questions around that: Will it have a
performance impact, and will there be any impact on existing callers?

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 10:49 AM, Andres Freund  wrote:
> I'm doubtful that anything that does the victim buffer search while
> holding the extension lock will actually scale in a wide range of
> scenarios. The copy scenario here probably isn't too bad because the
> copy ring buffes are in use, and because there's no reads increasing the
> usagecount of recent buffers; thus a victim buffers are easily found.

I agree that's an avenue we should try to explore.  I haven't had any
time to think much about how it should be done, but it seems like it
ought to be possible somehow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Michael Banck
On Tue, Feb 02, 2016 at 07:24:23AM -0800, Joshua D. Drake wrote:
> PostgreSQL has auditing. It is available now, just not in core. Postgis
> isn't available in core either and it seems to do just fine.

I don't really buy that argument. For one, PostGIS has a pretty narrow
functional use-case (spatial), while auditing is a horizontal use-case
that could be required for any kind of database usage.

Second, PostGIS had 10+ (?) years to build a reputation so that people
say "if I have to choose between PostGIS and buying Oracle Spatial, of
course I choose PostGIS", the pgaudit extension does not have that.

Auditing is a pretty security/enterprisey-related thing that could do
with the "officially considered to be of the PostgreSQL project standard
and ready for production" rubber-stamp that tends to go along with most
end-user/admin-oriented stuff shipped in the tarball.  I am aware that
2nd Quadrant, Crunchy Data and EnterpriseDB (different codebase via
PPAS) all support their auditing extensions commercially, so that there
is certainly some form of credibility, but still.

Now, whether or not the currently submitted approach actually meets the
above rubber-stamp requirements is a different story, but at least I
think it would be quite useful to ship auditing capabilites in the
distribution.


Michael

-- 
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax:  +49 (2161) 4643-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-02 Thread Ashutosh Bapat
On Tue, Feb 2, 2016 at 5:18 AM, Robert Haas  wrote:

> On Mon, Feb 1, 2016 at 8:27 AM, Ashutosh Bapat
>  wrote:
> > Here are patches rebased on recent commit
> > cc592c48c58d9c1920f8e2063756dcbcce79e4dd. Renamed original
> deparseSelectSql
> > as deparseSelectSqlForBaseRel and added deparseSelectSqlForJoinRel to
> > construct SELECT and FROM clauses for base and join relations.
> >
> > pg_fdw_core_v5.patch GetUserMappingById changes
> > pg_fdw_join_v5.patch: postgres_fdw changes for join pushdown including
> > suggestions as described below
> > pg_join_pd_v5.patch: combined patch for ease of testing.
> >
> > The patches also have following changes along with the changes described
> in
> > my last mail.
> > 1. Revised the way targetlists are handled. For a bare base relation the
> > SELECT clause is deparsed from fpinfo->attrs_used but for a base relation
> > which is part of join relation, the expected targetlist is passed down to
> > deparseSelectSqlForBaseRel(). This change removed 75 odd lines in
> > build_tlist_to_deparse() which were very similar to
> > deparseTargetListFromAttrsUsed() in the previous patch.
>
> Nice!
>
> > 2. Refactored postgresGetForeignJoinPaths to be more readable moving the
> > code to assess safety of join pushdown into a separate function.
>
> That looks good.  But maybe call the function foreign_join_ok() or
> something like that?  is_foreign_join() isn't terrible but it sounds a
> little odd to me.
>

I used name is_foreign_join(), which assesses push-down safety of a join,
to have similar naming convention with is_foreign_expr(), which checks
push-down safety of an expression. But foreign_join_ok() is fine too. Used
that.


>
> The path-copying stuff in get_path_for_epq_recheck() looks a lot
> better now, but you neglected to add a comment explaining why you did
> it that way (e.g. "Make a shallow copy of the join path, because the
> planner might free the original structure after a future add_path().
> We don't need to copy the substructure, though; that won't get freed."
>

I alluded to that in the second sentence of comment
3259  * Since we will need to replace any foreign paths for join with their
alternate
3260  * paths, we need make a copy of the local path chosen. Also, that
helps in case
3261  * the planner chooses to throw away the local path.

But that wasn't as clear as your wording. Rewrote the paragraph using your
wording.
3259  * Since we will need to replace any foreign paths for join with their
alternate
3260  * paths, we need make a copy of the local path chosen. Make a shallow
copy of
3261  * the join path, because the planner might free the original
structure after a
3262  * future add_path(). We don't need to copy the substructure, though;
that won't
3263  * get freed.


>  I would forget about setting merge_path->materialize_inner = false;
> that doesn't seem essential.


Done.


> Also, I would arrange things so that if
> you hit an unrecognized path type (like a custom join, or a gather)
> you skip that particular path instead of erroring out.


Ok. Done.


> I think this
> whole function should be moved to core,


I have moved the function to foreign.c where most of the FDW APIs are
located and declared it in fdwapi.h. Since the function deals with the
paths, I thought of adding it to some path related file, but since it's a
helper function that an FDW can use, I thought foreign.c would be better. I
have also added documentation in fdwhandler.sgml. I have renamed the
function as GetPathForEPQRecheck() in order to be consistent with other FDW
APIs. In the description I have just mentioned copy of a local path. I am
not sure whether we should say "shallow copy".


> and I think the argument
> should be a RelOptInfo * rather than a List *.
>

Done.


>
> + * We can't know VERBOSE option is specified or not, so always add
> shcema
>
> We can't know "whether" VERBOSE...
> shcema -> schema
>
>
Done.


> + * the join relaiton is already considered, so that we won't waste
> time in
>
> Typo.
>
>
Done.


> + * judging safety of join pushdow and adding the same paths again if
> found
>
> Typo.
>
> Done.

Sorry for those typos.

Attaching patches with reply to your next mail.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Joshua D. Drake

On 02/02/2016 07:31 AM, curtis.r...@gmail.com wrote:

It's not available in RPM or other premade binary form in its current
instantiation; not a big deal for me, but it raises the barrier to entry.

If it was packaged into an RPM, what would be the process to get it added to 
postgresql's yum repositories?



Assuming it is not placed into core, we would need to work with the deb 
and rpm projects. Which I am sure we could (and would) do.


Sincerely,
JD


--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Andres Freund
On 2016-02-01 23:16:16 -0500, Noah Misch wrote:
> On Tue, Feb 02, 2016 at 01:13:20AM +0100, Andres Freund wrote:
> > I'm not sure what'd actually be a good upper limit. I'd be inclined to
> > even go to as high as a week or so. A lot of our settings have
> > upper/lower limits that aren't a good idea in general.
> 
> In general, I favor having limits reflect fundamental system limitations
> rather than paternalism.  Therefore, I would allow INT_MAX (68 years).

I generally agree with that attitude - I'm disinclined to go quite that
high though. Going close to INT_MAX means having to care about overflow
in trivial computations, in a scenario we're unlikely to ever
test. Sure, we can use a debugger to adjust time or accelerate time
progress, but that's all unrealistic if we're honest.  So maybe go with
a year?

> > I'm also wondering if it'd not make sense to raise the default timeout
> > to 15min or so. The upper ceiling for that really is recovery time, and
> > that has really shrunk rather drastically due to faster cpus and
> > architectural improvements in postgres (bgwriter, separate
> > checkpointer/bgwriter, restartpoints, ...).
> 
> Have those recovery improvements outpaced the increases in max recovery time
> from higher core counts generating more WAL per minute?

Mostly yes, imo. I think the biggest problem with max recovery time is
in workloads that are considerably bigger than shared buffers: there the
single-threaded, synchronously reading startup process (without the
benefit of FPWs filling up pages) has to compete, at the same time, with
a lot of other processes that together have higher IO throughput.  But
even that has considerably improved due to SSDs.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Andres Freund
On 2016-02-02 11:37:15 +0100, Simon Riggs wrote:
> If people wish to turn off crash recovery, they can already set fsync=off.
> I don't wish to see us support a setting that causes problems for people
> that don't understand what checkpoints are and why everybody needs them.

I don't think fsync=off and very long checkpoints are really
comparable. Many large modern machines, especially with directly
attached storage and/or large amounts of memory, take a *long* while to
boot. So any outage will be dealt with by a failover anyway.  But at the
same time, a database in the 10TB+ range can't easily be copied again.
Thus running with fsync=off isn't something that you'd want in those
scenarios - it'd prevent the previous master/other standbys from failing
back/catching up; the databases could be arbitrarily corrupted after
all.

Additionally a significant portion of the cost of checkpoints are full
page writes - you easily can get into the situation where you have
~20MB/sec normal WAL without FPWs, but with them 300MB/s. That rate is
rather expensive, regardless fsync=off.

> The current code needs to act differently with regard to very low settings,
> so when we are a small number of blocks remaining we don't spend hours
> performing them. Allowing very large values would make that even more
> strange.

Why is that a good thing? Every checkpoint triggers a new round of full
page writes. I don't see why you want to accelerate a checkpoint, just
because there are few writes remaining? Yes, the current code partially
behaves that way, but that's imo more an implementation artifact or even
a bug.

> Some systems offer a recovery_time_objective setting that is used to
> control how frequently checkpoints occur. That might be a more usable
> approach.

While desirable, I have no idea how to realistically calculate that :(. It's
also a lot bigger than just adjusting a pointlessly low GUC limit.


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2016-02-02 Thread Anastasia Lubennikova



29.01.2016 20:43, Thom Brown:

On 29 January 2016 at 16:50, Anastasia Lubennikova
  wrote:

29.01.2016 19:01, Thom Brown:

On 29 January 2016 at 15:47, Aleksander Alekseev
  wrote:

I tested this patch on x64 and ARM servers for a few hours today. The
only problem I could find is that INSERT works considerably slower after
applying a patch. Beside that everything looks fine - no crashes, tests
pass, memory doesn't seem to leak, etc.

Thank you for testing. I rechecked that, and insertions are really very very
very slow. It seems like a bug.


Okay, now for some badness.  I've restored a database containing 2
tables, one 318MB, another 24kB.  The 318MB table contains 5 million
rows with a sequential id column.  I get a problem if I try to delete
many rows from it:
# delete from contacts where id % 3 != 0 ;
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory

I didn't manage to reproduce this. Thom, could you describe exact steps
to reproduce this issue please?

Sure, I used my pg_rep_test tool to create a primary (pg_rep_test
-r0), which creates an instance with a custom config, which is as
follows:

shared_buffers = 8MB
max_connections = 7
wal_level = 'hot_standby'
cluster_name = 'primary'
max_wal_senders = 3
wal_keep_segments = 6

Then create a pgbench data set (I didn't originally use pgbench, but
you can get the same results with it):

createdb -p 5530 pgbench
pgbench -p 5530 -i -s 100 pgbench

And delete some stuff:

thom@swift:~/Development/test$ psql -p 5530 pgbench
Timing is on.
psql (9.6devel)
Type "help" for help.


   ➤ psql://thom@[local]:5530/pgbench

# DELETE FROM pgbench_accounts WHERE aid % 3 != 0;
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory
WARNING:  out of shared memory
...
WARNING:  out of shared memory
WARNING:  out of shared memory
DELETE 667
Time: 22218.804 ms

There were 358 lines of that warning message.  I don't get these
messages without the patch.

Thom

Thank you for this report.
I tried to reproduce it, but I couldn't. Debugging will be much easier now.

I hope I'll fix these issues within the next few days.

BTW, I found a dummy mistake, the previous patch contains some unrelated
changes. I fixed it in the new version (attached).

Thanks.  Well I've tested this latest patch, and the warnings are no
longer generated.  However, the index sizes show that the patch
doesn't seem to be doing its job, so I'm wondering if you removed too
much from it.


Huh, this patch seems to be enchanted) It works fine for me. Did you 
perform "make distclean"?

Anyway, I'll send a new version soon.
I'm just writing to say that I have not disappeared and I do remember about
the issue.
I have even almost fixed the insert speed problem. But I'm very, very busy
this week.

I'll send an updated patch next week as soon as possible.

Thank you for attention to this work.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-02 Thread Robert Haas
On Mon, Feb 1, 2016 at 3:47 AM, Alexander Korotkov
 wrote:
> OK. This one looks good for me too.

All right, I pushed both this and the other one as a single commit,
since they were so closely related and the second only one line.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-02-02 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 2 Feb 2016 20:25:23 +0900, Masahiko Sawada  
wrote in 

Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-02 Thread Alexander Korotkov
On Tue, Feb 2, 2016 at 2:54 PM, Robert Haas  wrote:

> On Mon, Feb 1, 2016 at 3:47 AM, Alexander Korotkov
>  wrote:
> > OK. This one looks good for me too.
>
> All right, I pushed both this and the other one as a single commit,
> since they were so closely related and the second only one line.
>

Great, thanks!
It seems that we have only extension tranches patch pending in this thread.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Andreas Joseph Krogh
On Tuesday 2 February 2016 at 12:04:21, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
Andreas Joseph Krogh wrote:
   
 > Which seems to indicate it has received a fair amount of testing and is 
quite
 > stable.
 > Hopefully it integrates into the 9.6 codebase without too much risk.

 Yes, yes, that's all very good, but we're nearing the closure of the 9.6
 development cycle and we only have one commitfest left.  If someone had
 lots of community brownie points because of doing lots of reviews of
 other people's patches, they might push their luck by posting this patch
 to the final commitfest.  But if that someone didn't, then it wouldn't
 be fair, and if I were the commitfest manager of that commitfest I would
 boot their patch to the 9.7-First commitfest.

 The current commitfest which I'm trying to close still has 24 patches in
 needs-review state and 11 patches ready-for-committer; the next one (not
 closed yet) has 40 patches that will need review.  That means a total of
 75 patches, and those should all be processed ahead of this one.  The
 effort needed to process each of those patches is not trivial, and I'm
 sorry I have to say this but I don't see PostgresPro contributing enough
 reviews, even though I pinged a number of people there, so putting one
 more patch on the rest of the community's shoulders doesn't seem fair to
 me.

 Everybody has their favorite patch that they want in the next release,
 but we only have so much manpower to review and integrate those patches.
 All review help is welcome.
 
I understand completely.
 
-- Andreas Joseph Krogh




Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?

2016-02-02 Thread Alvaro Herrera
Kouhei Kaigai wrote:
> > On 1/31/16 7:38 PM, Kouhei Kaigai wrote:

> > To answer your direct question, I'm no expert, but I haven't seen any
> > functions that do exactly what you want. You'd have to pull relevant
> > bits from ReadBuffer_*. Or maybe a better method would just be to call
> > BufTableLookup() without any locks and if you get a result > -1 just
> > call the relevant ReadBuffer function. Sometimes you'll end up calling
> > ReadBuffer even though the buffer isn't in shared buffers, but I would
> > think that would be a rare occurrence.
> >
> Thanks, indeed, extension can call BufTableLookup(). PrefetchBuffer()
> has a good example for this.
> 
> If it returned a valid buf_id, we have nothing difficult; just call
> ReadBuffer() to pin the buffer.

Isn't this what (or very similar to)
ReadBufferExtended(RBM_ZERO_AND_LOCK) is already doing?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Magnus Hagander
On Tue, Feb 2, 2016 at 2:46 AM, Jim Nasby  wrote:

> On 2/1/16 6:15 PM, Alvaro Herrera wrote:
>
>> Jim Nasby wrote:
>>
>>> It would be nice if the patch summary page (ie, [1]) had links to the
>>> relevant entry in that CF. The specific need I see is if you look up a
>>> patch
>>> in the current CF and it's been moved to the next CF you have to
>>> manually go
>>> to that CF and search for the patch.
>>>
>>
>> Agreed, I could use that.  In the "status" row, each commitfest entry
>> (the "2015-11" text) could be a link to that patch in that commitfest.
>>
>
> Yeah, what I was thinking.


Just to be clear, you're looking for the ones that are for the non-current
one? Because you already have a link to the current commitfest in the
breadcrumbs at the top of the page. Or am I misunderstanding completely?



> (You can actually construct the URL easily just by changing the
>> commitfest ID, which is the first number in the URL; for example 2016-01
>> is /8/).
>>
>
> *waits for someone to comment on how surrogate keys are bad*
>

But they're awesome when the developer is lazy! :) Patches welcome ;)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?

2016-02-02 Thread Kouhei Kaigai
> > > On 1/31/16 7:38 PM, Kouhei Kaigai wrote:
> 
> > > To answer your direct question, I'm no expert, but I haven't seen any
> > > functions that do exactly what you want. You'd have to pull relevant
> > > bits from ReadBuffer_*. Or maybe a better method would just be to call
> > > BufTableLookup() without any locks and if you get a result > -1 just
> > > call the relevant ReadBuffer function. Sometimes you'll end up calling
> > > ReadBuffer even though the buffer isn't in shared buffers, but I would
> > > think that would be a rare occurrence.
> > >
> > Thanks, indeed, extension can call BufTableLookup(). PrefetchBuffer()
> > has a good example for this.
> >
> > If it returned a valid buf_id, we have nothing difficult; just call
> > ReadBuffer() to pin the buffer.
> 
> Isn't this what (or very similar to)
> ReadBufferExtended(RBM_ZERO_AND_LOCK) is already doing?
>
This operation actually acquires a buffer page and fills it with zeros,
and a valid buffer page gets evicted if there is no free buffer.
I want to keep the contents of the shared buffers that are already loaded
in main memory. The P2P DMA and GPU preprocessing are intended to minimize
main-memory consumption by rows that will be filtered out by scan qualifiers.
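
As a rough sketch of the probe being discussed (modeled on what
PrefetchBuffer() does; untested, and of course the answer can become stale the
moment the partition lock is released, since nothing here pins the buffer):

#include "storage/buf_internals.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

/* Is this block currently present in shared_buffers? */
static bool
block_is_cached(Relation rel, ForkNumber forknum, BlockNumber blkno)
{
    BufferTag   tag;
    uint32      hash;
    LWLock     *partition_lock;
    int         buf_id;

    /* build the buffer tag and find its mapping partition */
    INIT_BUFFERTAG(tag, rel->rd_node, forknum, blkno);
    hash = BufTableHashCode(&tag);
    partition_lock = BufMappingPartitionLock(hash);

    /* probe the buffer mapping table under a shared lock */
    LWLockAcquire(partition_lock, LW_SHARED);
    buf_id = BufTableLookup(&tag, hash);
    LWLockRelease(partition_lock);

    return buf_id >= 0;
}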

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-02 Thread Amit Kapila
On Tue, Feb 2, 2016 at 5:24 PM, Robert Haas  wrote:
>
> On Mon, Feb 1, 2016 at 3:47 AM, Alexander Korotkov
>  wrote:
> > OK. This one looks good for me too.
>
> All right, I pushed both this and the other one as a single commit,
> since they were so closely related and the second only one line.
>

Thank you!

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Simon Riggs
On 2 February 2016 at 02:05, Curtis Ruck <
curtis.ruck+pgsql.hack...@gmail.com> wrote:


> Just because auditing isn't sexy sharding, parallel partitioning, creative
> indexing (BRIN), or hundreds of thousands of transactions a second, doesn't
> make it any less of a requirement to countless organizations that would
> like to use postgresql, but find the audit requirement a must have.
>
> So, in summary, what would it take to get the core PostgreSQL team to
> actually let auditing patches into the next version?
>

I appreciate your frustration, though I'd say you're making a few
conceptual leaps in what you've said. I can help with a few answers.

For example, 2ndQuadrant developed the original pgAudit extension and
currently provide commercial support for users. So whether this gets
included into core PostgreSQL or not, is not the gating factor on whether
commercial support is available for open source software.

Security is an important thing round here, which also means that we follow
a default-deny approach to new features. So it can take some time to
include new features in core. The process is the same whether its sexy or
not. I agree it can be frustrating at times though overall we maintain a
high throughput of new features into PostgreSQL.

The original version of PgAudit sat in the queue unreviewed for about 7
months, which was a huge factor in it not being accepted into 9.5. We are
very short of reviewers and detailed reviews are accepted from any source.
So yourself or a colleague could make a difference here and I encourage
people with specialist knowledge and passion to take part.

P.S., do you know what sucks, having a highly performant PostGIS database
> that works great, and being told to move to Oracle or SQL Server (because
> they have auditing).  Even though they charge extra for Geospatial support
> (seriously?) or when they don't even have geospatial support (10 years
> ago).  My customer would prefer to re-engineer software designed around
> PostgreSQL and pay the overpriced licenses, than not have auditing.  I
> agree that their cost analysis is probably way off, even 10 years later, my
> only solution would be to move to Oracle, SQL Server, a NoSQL solution, or
> pay EnterpriseDB for their 2 year old version that doesn't have all the
> cool/modern jsonb support.
>

I agree it sucks when other people make money and you don't. That limits
funds available to allocate people on tasks, even when we see them as
important. But there are many companies who would be willing to implement
solutions or extend open source code for you, allowing that problem to be
solved. We don't usually discuss that option here, since this is an
engineering list.

Since you've written the email here, I'd ask that you join our community
and use your knowledge and passion to make things happen.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Alvaro Herrera
Andreas Joseph Krogh wrote:
  
> Which seems to indicate it has received a fair amount of testing and is quite 
> stable.
> Hopefully it integrates into the 9.6 codebase without too much risk.

Yes, yes, that's all very good, but we're nearing the closure of the 9.6
development cycle and we only have one commitfest left.  If someone had
lots of community brownie points because of doing lots of reviews of
other people's patches, they might push their luck by posting this patch
to the final commitfest.  But if that someone didn't, then it wouldn't
be fair, and if I were the commitfest manager of that commitfest I would
boot their patch to the 9.7-First commitfest.

The current commitfest which I'm trying to close still has 24 patches in
needs-review state and 11 patches ready-for-committer; the next one (not
closed yet) has 40 patches that will need review.  That means a total of
75 patches, and those should all be processed ahead of this one.  The
effort needed to process each of those patches is not trivial, and I'm
sorry I have to say this but I don't see PostgresPro contributing enough
reviews, even though I pinged a number of people there, so putting one
more patch on the rest of the community's shoulders doesn't seem fair to
me.

Everybody has their favorite patch that they want in the next release,
but we only have so much manpower to review and integrate those patches.
All review help is welcome.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-02-02 Thread Masahiko Sawada
On Tue, Feb 2, 2016 at 7:22 PM, Alvaro Herrera  wrote:
> Masahiko Sawada wrote:
>
>> I misunderstood. Sorry for noise.
>> I agree with adding conversion method as a pageConverter routine.
>
> \o/
>
>> This patch doesn't change page layout actually, but pageConverter
>> routine checks only the page layout.
>> And we have to plugin named convertLayout_X_to_Y.
>>
>> I think we have two options.
>>
>> 1. Change page layout(PG_PAGE_LAYOUT_VERSION) to 5. pg_upgrade detects
>> it and then converts only VM files.
>> 2. Change pg_upgrade plugin mechanism so that it can handle other name
>> conversion plugins (e.g., convertLayout_vm_to_vfm)
>>
>> I think #2 is better. Thought?
>
> My vote is for #2 as well.  Maybe we just didn't have forks when this
> functionality was invented; maybe the author just didn't think hard
> enough about what would be the right interface to do it.

Thanks.

I'm planning to change as follows.
- The pageCnvCtx struct has a new function pointer, convertVMFile().
  If the layout of other relations such as FSM or CLOG changes in the future, we
could add convertFSMFile() and convertCLOGFile().
- Create a new library, convertLayoutVM_add_frozenbit.c, that has a
convertVMFile() function which converts only the visibility map.
  When rewriting of the VM is required, convertLayoutVM_add_frozenbit.so
is dynamically loaded.
  convertLayout_X_to_Y converts the other relation files.
  That is, converting the VM and converting other relations are done independently.
- The current plugin mechanism puts the conversion library (*.so) into
${bin}/plugins (i.e., a new plugin directory is required), but I'm
thinking of putting it into ${libdir}.

Please give me feedback.

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2016-02-02 Thread Thom Brown
On 2 February 2016 at 11:47, Anastasia Lubennikova wrote:
>
>
> 29.01.2016 20:43, Thom Brown:
>
>> On 29 January 2016 at 16:50, Anastasia Lubennikova
>>   wrote:
>>>
>>> 29.01.2016 19:01, Thom Brown:

 On 29 January 2016 at 15:47, Aleksander Alekseev
   wrote:
>
> I tested this patch on x64 and ARM servers for a few hours today. The
> only problem I could find is that INSERT works considerably slower
> after
> applying a patch. Beside that everything looks fine - no crashes, tests
> pass, memory doesn't seem to leak, etc.
>>>
>>> Thank you for testing. I rechecked that, and insertions are really very
>>> very
>>> very slow. It seems like a bug.
>>>
>> Okay, now for some badness.  I've restored a database containing 2
>> tables, one 318MB, another 24kB.  The 318MB table contains 5 million
>> rows with a sequential id column.  I get a problem if I try to delete
>> many rows from it:
>> # delete from contacts where id % 3 != 0 ;
>> WARNING:  out of shared memory
>> WARNING:  out of shared memory
>> WARNING:  out of shared memory
>
> I didn't manage to reproduce this. Thom, could you describe exact steps
> to reproduce this issue please?

 Sure, I used my pg_rep_test tool to create a primary (pg_rep_test
 -r0), which creates an instance with a custom config, which is as
 follows:

 shared_buffers = 8MB
 max_connections = 7
 wal_level = 'hot_standby'
 cluster_name = 'primary'
 max_wal_senders = 3
 wal_keep_segments = 6

 Then create a pgbench data set (I didn't originally use pgbench, but
 you can get the same results with it):

 createdb -p 5530 pgbench
 pgbench -p 5530 -i -s 100 pgbench

 And delete some stuff:

 thom@swift:~/Development/test$ psql -p 5530 pgbench
 Timing is on.
 psql (9.6devel)
 Type "help" for help.


➤ psql://thom@[local]:5530/pgbench

 # DELETE FROM pgbench_accounts WHERE aid % 3 != 0;
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 ...
 WARNING:  out of shared memory
 WARNING:  out of shared memory
 DELETE 667
 Time: 22218.804 ms

 There were 358 lines of that warning message.  I don't get these
 messages without the patch.

 Thom
>>>
>>> Thank you for this report.
>>> I tried to reproduce it, but I couldn't. Debug will be much easier now.
>>>
>>> I hope I'll fix these issues within the next few days.
>>>
>>> BTW, I found a dummy mistake, the previous patch contains some unrelated
>>> changes. I fixed it in the new version (attached).
>>
>> Thanks.  Well I've tested this latest patch, and the warnings are no
>> longer generated.  However, the index sizes show that the patch
>> doesn't seem to be doing its job, so I'm wondering if you removed too
>> much from it.
>
>
> Huh, this patch seems to be enchanted) It works fine for me. Did you perform
> "make distclean"?

Yes.  Just tried it again:

git clean -fd
git stash
make distclean
patch -p1 < ~/Downloads/btree_compression_2.0.patch
../dopg.sh (script I've always used to build with)
pg_ctl start
createdb pgbench
pgbench -i -s 100 pgbench

$ psql pgbench
Timing is on.
psql (9.6devel)
Type "help" for help.


 ➤ psql://thom@[local]:5488/pgbench

# \di+
List of relations
 Schema |         Name          | Type  | Owner |      Table       |  Size  | Description
--------+-----------------------+-------+-------+------------------+--------+-------------
 public | pgbench_accounts_pkey | index | thom  | pgbench_accounts | 214 MB |
 public | pgbench_branches_pkey | index | thom  | pgbench_branches | 24 kB  |
 public | pgbench_tellers_pkey  | index | thom  | pgbench_tellers  | 48 kB  |
(3 rows)

Previously, this would show an index size of 87MB for pgbench_accounts_pkey.

> Anyway, I'll send a new version soon.
> I just write here to say that I do not disappear and I do remember about the
> issue.
> I even almost fixed the insert speed problem. But I'm very very busy this
> week.
> I'll send an updated patch next week as soon as possible.

Thanks.

> Thank you for attention to this work.

Thanks for your awesome patches.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tsvector editing functions

2016-02-02 Thread Teodor Sigaev

Some notices:

1 tsin in documentation doesn't look like a good name. Changed to vector similar 
to other places.


2 I did some editing regarding freeing memory, forgotten names, etc.

3 It seems to me that tsvector_unnest() could be seriously optimized for
large tsvectors: with the current coding it detoasts/decompresses the tsvector
value on each call. It would be much better to do it once in the
multi_call_memory_ctx context at first-call init (see the sketch below).

4 It seems debatable to return an empty array for positions/weights if they are
absent:
=# select * from unnest('a:1 b'::tsvector);
 lexeme | positions | weights
--------+-----------+---------
 a      | {1}       | {D}
 b      | {}        | {}
I think, it's better to return NULL in this case

5 array_to_tsvector/tsvector_setweight_by_filter/tsvector_delete_arr/tsvector_filter
don't check or pay attention to NULL elements in input arrays.
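
To illustrate point 3, this is roughly the shape I mean -- a sketch only,
not the patch itself, and it returns just the lexeme instead of the full
three-column row to keep it short:

#include "postgres.h"
#include "funcapi.h"
#include "tsearch/ts_type.h"
#include "utils/builtins.h"

PG_FUNCTION_INFO_V1(tsvector_unnest_sketch);

Datum
tsvector_unnest_sketch(PG_FUNCTION_ARGS)
{
    FuncCallContext *funcctx;
    TSVector    tsin;

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        /* detoast/decompress the input once; it now survives across calls */
        tsin = PG_GETARG_TSVECTOR_COPY(0);
        funcctx->user_fctx = tsin;
        funcctx->max_calls = tsin->size;

        MemoryContextSwitchTo(oldcontext);
    }

    funcctx = SRF_PERCALL_SETUP();
    tsin = (TSVector) funcctx->user_fctx;

    if (funcctx->call_cntr < funcctx->max_calls)
    {
        WordEntry  *we = ARRPTR(tsin) + funcctx->call_cntr;
        text       *lexeme = cstring_to_text_with_len(STRPTR(tsin) + we->pos,
                                                      we->len);

        SRF_RETURN_NEXT(funcctx, PointerGetDatum(lexeme));
    }

    SRF_RETURN_DONE(funcctx);
}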






--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Relation extension scalability

2016-02-02 Thread Alvaro Herrera
Andres Freund wrote:

> Could you also measure how this behaves for [...]

While we're proposing benchmark cases -- I remember this being an issue
with toast tables getting very large values of xml which causes multiple
toast pages to be extended for each new value inserted.  If there are
multiple processes inserting these all the time, things get clogged.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tsvector editing functions

2016-02-02 Thread Teodor Sigaev

Some notices:

1 tsin in documentation doesn't look like a good name. Changed to vector similar
to other places.

2 I did some editing regarding freeing memory, forgotten names, etc.


Ooops, forgot to attach
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 139aa2b..9c294e3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9168,16 +9168,29 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
  
   setweight
  
- setweight(tsvector, "char")
+ setweight(vector tsvector, weight "char")
 
 tsvector
-assign weight to each element of tsvector
+assign weight to each element of vector
 setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')
 'cat':3A 'fat':2A,4A 'rat':5A


 
  
+  setweight
+  setweight by filter
+ 
+ setweight(vector tsvector, weight "char", lexemes "text"[])
+
+tsvector
+assign weight to elements of vector that are listed in lexemes array
+setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')
+'cat':3A 'fat':2,4 'rat':5A
+   
+   
+
+ 
   strip
  
  strip(tsvector)
@@ -9190,6 +9203,84 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple

 
  
+  delete
+  delete lexeme
+ 
+ delete(vector tsvector, lexeme text)
+
+tsvector
+remove given lexeme from vector
+delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')
+'cat':3 'rat':5A
+   
+   
+
+ 
+  delete
+  delete lexemes array
+ 
+ delete(vector tsvector, lexemes text[])
+
+tsvector
+remove any occurrence of lexemes in lexemes array from vector
+delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])
+'cat':3
+   
+   
+
+ 
+  unnest
+ 
+ unnest(tsvector)
+
+setof anyelement
+expand a tsvector to a set of rows. Each row has the following columns: lexeme, positions, weights.
+unnest('fat:2,4 cat:3 rat:5A'::tsvector)
+cat{3}{A}
+fat{2,4}  {D,D}
+rat{5}{A}
+(3 rows)
+   
+   
+
+ 
+  to_array
+ 
+ to_array(tsvector)
+
+text[]
+convert tsvector to array of lexemes
+to_array('fat:2,4 cat:3 rat:5A'::tsvector)
+{cat,fat,rat}
+   
+   
+
+ 
+  to_tsvector
+  array to tsvector
+ 
+ to_tsvector(text[])
+
+tsvector
+convert array of lexemes to tsvector
+to_tsvector('{fat,cat,rat}'::text[])
+'fat' 'cat' 'rat'
+   
+   
+
+ 
+  filter
+ 
+ filter(vector tsvector, weights "char"[])
+
+tsvector
+Select only elements with given weights from vector
+filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}')
+'cat':3B 'rat':5A
+   
+   
+
+ 
   to_tsquery
  
  to_tsquery( config regconfig ,  query text)
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index d66b4d5..32033fa 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1326,6 +1326,10 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
 

 
+   
+Full list of tsvector-related functions available in .
+   
+
   
 
   
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index a3f1c361..e7ea270 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -14,6 +14,7 @@
 
 #include "postgres.h"
 
+#include "access/htup_details.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
@@ -65,6 +66,7 @@ typedef struct
 #define STATHDRSIZE (offsetof(TSVectorStat, data))
 
 static Datum tsvector_update_trigger(PG_FUNCTION_ARGS, bool config_column);
+static int tsvector_bsearch(TSVector tsin, char *lexin, int lexin_len);
 
 /*
  * Order: haspos, len, word, for all positions (pos, weight)
@@ -251,6 +253,81 @@ tsvector_setweight(PG_FUNCTION_ARGS)
 	PG_RETURN_POINTER(out);
 }
 
+/*
+ * setweight(tsin tsvector, char_weight "char", lexemes "text"[])
+ *
+ * Assign weight w to elements of tsin that are listed in lexemes.
+ */
+Datum
+tsvector_setweight_by_filter(PG_FUNCTION_ARGS)
+{
+	TSVector	tsin = PG_GETARG_TSVECTOR(0);
+	char		char_weight = PG_GETARG_CHAR(1);
+	ArrayType  *lexemes = PG_GETARG_ARRAYTYPE_P(2);
+
+	TSVector	tsout;
+	int			i,
+j,
+nlexemes,
+

Re: [HACKERS] Relation extension scalability

2016-02-02 Thread Andres Freund
On 2016-01-28 16:53:08 +0530, Dilip Kumar wrote:
> test_script:
> 
> ./psql -d postgres -c "truncate table data"
> ./psql -d postgres -c "checkpoint"
> ./pgbench -f copy_script -T 120 -c$ -j$ postgres
> 
> Shared Buffer48GB
> Table:Unlogged Table
> ench -c$ -j$ -f -M Prepared postgres
> 
> ClientsBasepatch
> 1178   180
> 2337   338
> 4265   601
> 8167   805

Could you also measure how this behaves for an INSERT instead of a COPY
workload? Both throughput and latency. It's quite possible that this
causes latency hikes, because suddenly backends will have to wait for
one another to extend by 50 pages. You'll probably have to use -P 1 or
full statement logging to judge that.  I think just having a number of
connections inserting relatively wide rows into one table should do the
trick.

I'm doubtful that anything that does the victim buffer search while
holding the extension lock will actually scale in a wide range of
scenarios. The copy scenario here probably isn't too bad because the
copy ring buffers are in use, and because there are no reads increasing the
usage count of recent buffers; thus victim buffers are easily found.

Thanks,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Jim Nasby

On 2/2/16 6:35 AM, Alvaro Herrera wrote:

what I want is the link to go to
*that patch's*  page in the other commitfest.  That's also what I
think Jim wants.


+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - allow backslash-continuations in custom scripts

2016-02-02 Thread Robert Haas
On Thu, Jan 7, 2016 at 3:36 AM, Kyotaro HORIGUCHI wrote:
> - 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch
>
>   This diff looks a bit large but most of them is cut'n-paste
>   work and the substantial change is rather small.
>
>   This refactors psqlscan.l into two .l files. The additional
>   psqlscan_slash.l is a bit tricky in the point that recreating
>   scan_state on transition between psqlscan.l.

I've looked at this patch a few times now but find it rather hard to
verify.  I am wondering if you would be willing to separate 0001 into
subpatches.  For example, maybe there could be one or two patches that
ONLY move code around and then one or more patches that make the
changes to that code.  Right now, for example, psql_scan_setup() is
getting three additional arguments, but it's also being moved to a
different file.  Perhaps those two things could be done one at a time.

I also think this patch could really benefit from a detailed set of
submission notes that specifically lay out why each change was made
and why.  For instance, I see that psqlscan.l used yyless() while
psqlscanbody.l uses a new my_yyless() you've defined.  There is
probably a great reason for that and I'm sure if I stare at this for
long enough I can figure out what that reason is, but it would be
better if you had a list of bullet points explaining what was changed
and why.

I would really like to see this patch committed; my problem is that I
don't have enough braincells to be sure that it's correct in the
present form.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 11:21 AM, Ashutosh Bapat wrote:
>> Why does deparseSelectStmtForRel change the order of the existing
>> arguments?  I have no issue with adding new arguments as required, but
>> rearranging the existing argument order doesn't serve any useful
>> purpose that is immediately apparent.
>
> deparseSelectStmtForRel has two sets of arguments, input and output. They
> are separated in the declaration all inputs come first, followed by all
> outputs. The inputs were ordered according to their appearance in SELECT
> statement, so I added tlist before remote_conds. I should have added
> relations, which is an output argument, at the end, but I accidentally added
> it between existing output arguments. Anyway, I will go ahead and just add
> the new arguments after the existing ones.

No, that's not what I'm asking for, nor do I think it's right.  What
I'm complaining about is that originally params_list was after
retrieved_attrs, but in v5 it's before retrieved_attrs.  I'm fine with
inserting tlist after rel, or in general inserting new arguments in
the sequence.  But you reversed the relative ordering of params_list
and retrieved_attrs.

> I was thinking on the similar lines except rN aliases. I think there will be
> problem for queries like
> postgres=# explain verbose select * from lt left join (select bar.a, foo.b
> from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on
> (lt.b = q.b);
>QUERY PLAN
> 
>  Hash Right Join  (cost=318.03..872.45 rows=43 width=16)
>Output: lt.a, lt.b, bar.a, foo.b
>Hash Cond: (foo.b = lt.b)
>->  Merge Join  (cost=317.01..839.07 rows=8513 width=8)
>  Output: bar.a, foo.b
>  Merge Cond: (bar.a = foo.a)
>  Join Filter: ((bar.b + foo.b) < 10)
>  ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>Output: bar.a, bar.b
>Sort Key: bar.a
>->  Seq Scan on public.bar  (cost=0.00..32.60 rows=2260
> width=8)
>  Output: bar.a, bar.b
>  ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>Output: foo.b, foo.a
>Sort Key: foo.a
>->  Seq Scan on public.foo  (cost=0.00..32.60 rows=2260
> width=8)
>  Output: foo.b, foo.a
>->  Hash  (cost=1.01..1.01 rows=1 width=8)
>  Output: lt.a, lt.b
>  ->  Seq Scan on public.lt  (cost=0.00..1.01 rows=1 width=8)
>Output: lt.a, lt.b
> (21 rows)
>
> The subquery q is pulled up, so there won't be trace of q in the join tree
> except may be a useless RTE for the subquery. There will be RelOptInfo
> representing join between lt, bar and foo and a RelOptInfo for join between
> bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before
> joining (bar, foo) with lt and should go with bar left join foo. But the
> syntax doesn't support something like "bar left join foo on (bar.a = foo.a)
> where bar.b + foo.b". So we will have to construct a SELECT statement for
> this join and add to the FROM clause with a subquery alias and then refer
> the columns of foo and bar with that subquery alias.

Hmm, does it work if we put bar.b + foo.b < 10 in the ON clause for
the join between lt and foo/bar? I think so...

> Further during the process of qual placement, quals that can be evaluated at
> the level of given relation in the join tree are attached to that relation
> if they can be pushed down. Thus if we see a qual attached to a given
> relation, AFAIU, we can not say whether it needs to be evaluated there
> (similar to above query) or planner pushed it down for optimization, and
> thus for every join relation with quals we will need to build subqueries
> with aliases.

I don't think that's true.  I theorize that every qual can either go
into the top level WHERE clause or the ON clause of some join.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2016-02-02 Thread Robert Haas
On Mon, Feb 1, 2016 at 8:40 AM, Alexander Korotkov  wrote:
> I wonder if we can use 4-byte wait_event_info more efficiently.
> LWLock number in the tranche would be also useful information to expose.
> Using lwlock number user can determine if there is high concurrency for
> single lwlock in tranche or it is spread over multiple lwlocks.
> I think it would be enough to have 6 bits for event class id and 10 bit for
> event id. So, it would be maximum 64 event classes and maximum 1024 events
> per class. These limits seem to be fair enough for me.
> And then we save 16 bits for lock number. It's certainly not enough for some
> tranches. For instance, number of buffers could be easily more than 2^16.
> However, we could expose at least lower 16 bits. It would be at least
> something. Using this information user at least can make a conclusion like
> "It MIGHT be a high concurrency for single buffer content. Other way it is
> coincidence that a lot of different buffers have the same 16 lower bits.".
>
> Any thoughts?

Meh.  I think that's trying to be too clever.  That seems hard to
document, hard to explain, and likely incomprehensible to users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: index-only scans with partial indexes

2016-02-02 Thread Konstantin Knizhnik
I have applied this patch to our working branch, and during several
weeks we ran various tests and benchmarks.

We have not noticed any problems or performance degradation.
And for some queries this patch causes a very significant increase in
performance - ten times:


With this patch:

postgres=# explain analyze select count(*) from t where k1<100 and 
pk < 1454434742881892;

   QUERY PLAN

 Aggregate  (cost=29.65..29.66 rows=1 width=0) (actual 
time=0.108..0.108 rows=1 loops=1)
   ->  Index Only Scan using idx1 on t  (cost=0.43..27.49 rows=861 
width=0) (actual time=0.012..0.071 rows=963 loops=1)

 Index Cond: (k1 < 100)
 Heap Fetches: 0
 Planning time: 0.100 ms
 Execution time: 0.121 ms
(6 rows)


Without patch:

postgres=# explain analyze select count(*) from t where k1<100 and 
pk < 1454434742881892;

   QUERY PLAN

 Aggregate  (cost=2951.55..2951.56 rows=1 width=0) (actual 
time=1.070..1.070 rows=1 loops=1)
   ->  Bitmap Heap Scan on t  (cost=19.10..2949.40 rows=861 width=0) 
(actual time=0.161..0.997 rows=963 loops=1)
 Recheck Cond: ((k1 < 100) AND (pk < 
'1454434742881892'::bigint))

 Heap Blocks: exact=954
 ->  Bitmap Index Scan on idx1  (cost=0.00..18.88 rows=861 
width=0) (actual time=0.083..0.083 rows=963 loops=1)

   Index Cond: (k1 < 100)
 Planning time: 0.099 ms
 Execution time: 1.089 ms
(8 rows)




On 01.02.2016 01:11, Alvaro Herrera wrote:

Konstantin Knizhnik wrote:

I am very interested in this patch because it allows to use partial indexes to 
... speed up inserts.
I have implemented "ALTER INDEX ... WHERE ..." construction which allows to 
change predicate of partial index without necessity to fully rebuild it.
So it is not necessary to insert new records in index immediately (if new 
records do not match partial index conditions).
It can be done later in background (or at night). My experiments show that it 
allows to increase insert speed five times (for either partial indexes).
At the same time we do not loose RDBMS requirement that result of query should 
not depend on presence of indexes. And it is applicable to all indexes: B-Tree, 
GIN, GIST,...

But such optimization makes sense only of partial indexes can be used without 
extra overhead, first of all for index-only scans.
And it is impossible without this patch.

That sounds interesting.  So please review this patch and let us know
whether you like it, or whether you have any better ideas for any
particular hunk, or whether you think it should be rewritten from
scratch, or just state that it is perfect.  If you think it's useful,
then it's a good idea to vouch for it to be integrated; and one way of
doing that is making sure it meets the quality standards etc.  If you
don't say anything about the patch, we may never integrate it because we
might have doubts about whether it's worthy.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add links to commit fests to patch summary page

2016-02-02 Thread Alvaro Herrera
Magnus Hagander wrote:
> On Tue, Feb 2, 2016 at 4:43 PM, Alvaro Herrera 
> wrote:
> 
> > Magnus Hagander wrote:
> >
> > > So from https://commitfest.postgresql.org/9/353/, you'd want links to
> > > /8/353/, /7/353/, /6/353/?
> >
> > Right.

> I'm not entirely sure what I'd use that for myself, but that's trivial to
> implement. Thus, done and published.

Works nicely for me, many thanks there.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Curtis Ruck
Robert,

This isn't wrong.  I don't see anyone else trying to submit anything in
reference to auditing.  Just because there have been multiple
implementations in the past, based on commit histories, there have only
been a few that tried getting into core or contrib (that I can find in
mailing list history).  Currently, in 9.6, there is one version that is
trying to make it in.  If Crunchy, or 2nd Quadrant, tried to get their
versions incorporated we would have a disagreement in implementation, but
either they are lying in wait, or they passively concur, by not actively
disagreeing.

I think if there was a valid commit path laid out for getting auditing into
core, or into contrib at least, most users would probably find that
sufficient.  But it seems that every time someone tries to incorporate
auditing, there are countless and varied reasons to deny its inclusion.

David Steele's version of auditing builds on most of the prior pgaudit code
and incorporates a large amount of the feedback from 9.5's attempt.

I'm opening to testing and evaluating to see if it meets our compliance
requirements, but I am no where close to being a C developer, or having C
developers that could actually provide a meaningful review.  One issue
along this thread already pops up, concerning the client_min_messages, and
how other patches in the pipeline for 9.6 would be required to enable the
auditing to meet compliance requirements.

It just seems after reading the mailing list history, that there is a lack
of interest by people with commit authority, even though there is a decent
interest in it from the community, and honestly, no one really likes
auditing, but its one of those damned if you do (in performance) and damned
if you don't (in legal) things.

Additionally Robert, given your professional status, you are by no means an
unbiased contributor in this discussion.  Your stance on this matter shows
that you don't necessarily want the open source solution to succeed in the
commercial/compliance required space.  Instead of arguing blankly against
inclusion can you at least provide actionable based feedback that if met
would allow patches of this magnitude in?

I'm personally fine with fiscally rewarding organizations that assist my
customer in succeeding, but its hard to convince my customer to fund open
source, even though they wouldn't be able to do 75% of what they do without
it.  Based on past experience this is the same most open source
organizations face, especially when they don't have the marketing engine
that the large commercial players have.

Curtis

…

On Tue, Feb 2, 2016 at 5:23 PM Robert Haas  wrote:

> On Tue, Feb 2, 2016 at 5:16 PM, David Steele  wrote:
> > This sort of confusion is one of the main reasons I pursued inclusion in
> > core.
>
> But that's exactly wrong.  When there is not agreement on one code
> base over another, that's the time it is most important not to pick
> one of them arbitrarily privilege it over the others.  The *only* time
> it's appropriate to move something that could just as well as an
> extension into core is when (1) we think it's highly likely that users
> will want that particular thing rather than some other thing and (2)
> we think it's worth the burden of maintaining that code forever.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] PostgreSQL Auditing

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 8:25 PM, Curtis Ruck wrote:
> Additionally Robert, given your professional status, you are by no means an
> unbiased contributor in this discussion.  Your stance on this matter shows
> that you don't necessarily want the open source solution to succeed in the
> commercial/compliance required space.  Instead of arguing blankly against
> inclusion can you at least provide actionable based feedback that if met
> would allow patches of this magnitude in?

I feel the need to respond to this.  I don't appreciate being called a
liar.  I do not block patches because having them included in
PostgreSQL would be to the detriment of my employer.  Ever.  That
would be dishonest and a betrayal of the community's trust.  Full
stop.  I have a track record of committing multiple large patches that
were developed by people working for EnterpriseDB's competitors (like
logical decoding) or that competed with proprietary features
EnterpriseDB already had (like foreign tables).  I spend countless
hours working on countless patches written by people who do not work
for, and have no commercial relationship with, my employer: and who
are sometimes working for a competitor.  I have worked hard to ensure
that EnterpriseDB makes major contributions to the PostgreSQL
community, such as parallel query.

Even if all of the above were not true, EnterpriseDB does not
currently have a feature that competes with pgaudit, and has no
current plans to try to develop one.  EDBAS does have an auditing
feature, but that feature is radically different from what pgaudit
does; arguing that I am trying to block pgaudit from going into core
because that feature exists is like arguing that I don't want
PostgreSQL to get a new frying pan because EnterpriseDB has a toy
boat.  Furthermore, to the extent that EnterpriseDB does have an
interest in having a feature like pgaudit, it would be to my advantage
for that feature to go *into* core.  After all, everything in
PostgreSQL is in EDBAS, but things on PGXN generally aren't.  In
short, your accusations are both  false and illogical.

I'm going to go ahead and make a suggestion: instead of showing up on
this mailing list and accusing the people who spend their time and
energy here trying to make PostgreSQL a better of being pigheaded
liars, I think that you should try to really understand how this
community works, how it makes decisions, what it does well, and what
it does poorly.  Then, I think you should argue for your positions in
a respectful way, carefully avoiding accusations of bad faith even
(and perhaps especially) in cases where you believe it may be present.
You will find that almost everyone here behaves in that way, and that
is what enables us to get along as well as we do and create a great
piece of software together.  Every single person who has responded to
your emails - and there have been a bunch - has done so with courtesy
and integrity, and yet you seem convinced (without a shred of
evidence, at least that you've presented) that anyone who doesn't
think pgaudit should go into core is either an idiot or part of some
sort of cabal.  Yet, if that were really true, there would be little
point in arguing, because the cabalists won't listen to you anyway,
and the idiots will make stupid decisions no matter what.  Perhaps you
should try starting from a different premise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2016-02-02 Thread Amit Kapila
On Tue, Feb 2, 2016 at 10:09 PM, Robert Haas  wrote:
>
> On Mon, Feb 1, 2016 at 8:40 AM, Alexander Korotkov wrote:
> > I wonder if we can use 4-byte wait_event_info more efficiently.
> > LWLock number in the tranche would be also useful information to expose.
> > Using lwlock number user can determine if there is high concurrency for
> > single lwlock in tranche or it is spread over multiple lwlocks.
> > I think it would be enough to have 6 bits for event class id and 10 bit for
> > event id. So, it would be maximum 64 event classes and maximum 1024 events
> > per class. These limits seem to be fair enough for me.
> > And then we save 16 bits for lock number. It's certainly not enough for some
> > tranches. For instance, number of buffers could be easily more than 2^16.
> > However, we could expose at least lower 16 bits. It would be at least
> > something. Using this information user at least can make a conclusion like
> > "It MIGHT be a high concurrency for single buffer content. Other way it is
> > coincidence that a lot of different buffers have the same 16 lower bits.".
> >
> > Any thoughts?
>
> Meh.  I think that's trying to be too clever.  That seems hard to
> document, hard to explain, and likely incomprehensible to users.
>

So, let's leave adding any additional column, but Alexander has brought up
a good point about storing the wait_type and actual wait_event
information into four bytes.  Currently I have stored wait_type (aka
classId) in the first byte and then two bytes for wait_event (eventId); the
remaining one byte can be used in future if required.  However, Alexander is
proposing to combine both of these (classId and eventId) into two bytes, which
sounds reasonable to me apart from the fact that it might add an operation or
two extra in this path.  Do you or anyone else have any preference over this point?
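
Just to make the current layout concrete, something along these lines is
what I mean (illustration only, not the actual patch code, and whether
"first byte" means the high or the low byte is an implementation detail):

/* one byte of class, two bytes of event id, one spare byte */
#define WAIT_EVENT_CLASS_SHIFT  24
#define WAIT_EVENT_ID_SHIFT     8

static inline uint32
make_wait_event_info(uint8 classId, uint16 eventId)
{
    return ((uint32) classId << WAIT_EVENT_CLASS_SHIFT) |
        ((uint32) eventId << WAIT_EVENT_ID_SHIFT);
}

static inline uint8
get_wait_event_class(uint32 wait_event_info)
{
    return (uint8) (wait_event_info >> WAIT_EVENT_CLASS_SHIFT);
}

static inline uint16
get_wait_event_id(uint32 wait_event_info)
{
    return (uint16) (wait_event_info >> WAIT_EVENT_ID_SHIFT);
}

Alexander's variant would instead use 6 bits for the class, 10 bits for the
event id, and keep the low 16 bits for a lock number.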


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 10:27 PM, Amit Kapila  wrote:
> So, let's leave adding any additional column, but Alexander has brought up
> a good point about storing the wait_type and actual wait_event
> information into four bytes.  Currently I have stored wait_type (aka
> classId) in the first byte and then two bytes for wait_event (eventId); the
> remaining one byte can be used in future if required.  However, Alexander is
> proposing to combine both of these (classId and eventId) into two bytes, which
> sounds reasonable to me apart from the fact that it might add an operation or
> two extra in this path.  Do you or anyone else have any preference over this point?

I wouldn't bother tinkering with it at this point.  The value isn't
going to be recorded on disk anywhere, so it will be easy to change
the way it's computed in the future if we ever need to do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 8:09 PM, Noah Misch  wrote:
> On Tue, Feb 02, 2016 at 12:24:50PM +0100, Andres Freund wrote:
>> On 2016-02-01 23:16:16 -0500, Noah Misch wrote:
>> > On Tue, Feb 02, 2016 at 01:13:20AM +0100, Andres Freund wrote:
>> > > I'm not sure what'd actually be a good upper limit. I'd be inclined to
>> > > even go to as high as a week or so. A lot of our settings have
>> > > upper/lower limits that aren't a good idea in general.
>> >
>> > In general, I favor having limits reflect fundamental system limitations
>> > rather than paternalism.  Therefore, I would allow INT_MAX (68 years).
>>
>> I generally agree with that attitude - I'm disinclined to go just that
>> high though. Going close to INT_MAX means having to care about overflow
>> in trivial computations, in a scenario we're unlikely to ever
>> test. Sure, we can use a debugger to adjust time or accelerate time
>> progress, but that's all unrealistic if we're honest.  So maybe go with
>> a year?
>
> Okay.

Sounds good to me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pglogical - logical replication contrib module

2016-02-02 Thread Steve Singer

On 01/26/2016 10:43 AM, Craig Ringer wrote:
On 23 January 2016 at 11:17, Steve Singer wrote:





** Schema changes involving rewriting big tables

Sometimes you have a DDL change on a large table that will involve
a table rewrite and the best way of deploying the change is to
make the DDL change
on a replicate then once it is finished promote the replica to the
origin in some controlled fashion.  This avoids having to lock the
table on the origin for hours.

pglogical seems to allow minor schema changes on the replica such
as changing a type but it doesn't seem to allow a DO INSTEAD
trigger on the replica.  I don't think pglogical currently meets
this use case particularly well


I'm not sure I fully understand that one.


Say you have a table A with column b

and in the next version of your application you want to create a second
table B that has column b


create table B (b text);
insert into B select b from a;
alter table a drop column b;

but you want to do this on a replica because it is a very big table and 
you want to minimize downtime.


You could have a trigger on the replica that performed updates on B.b 
instead of A except triggers don't seem to get run on the replica.




 Craig Ringer http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Steve



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] FETCH limited by bytes.

2016-02-02 Thread Corey Huinker
>
>
> I don't see how.  There really is no declaration in there for a
> variable called server.
>

Absolutely correct. My only guess is that it was from failing to make clean
after a checkout/re-checkout. A good reason to have even boring regression
tests.

Regression tests added.
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2390e61..e28cf77 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3951,3 +3951,63 @@ QUERY:  CREATE FOREIGN TABLE t5 (
 OPTIONS (schema_name 'import_source', table_name 't5');
 CONTEXT:  importing foreign table "t5"
 ROLLBACK;
+BEGIN;
+CREATE SERVER fetch101 FOREIGN DATA WRAPPER postgres_fdw OPTIONS( fetch_size '101' );
+SELECT count(*)
+FROM pg_foreign_server
+WHERE srvname = 'fetch101'
+AND srvoptions @> array['fetch_size=101'];
+ count 
+---
+ 1
+(1 row)
+
+ALTER SERVER fetch101 OPTIONS( SET fetch_size '202' );
+SELECT count(*)
+FROM pg_foreign_server
+WHERE srvname = 'fetch101'
+AND srvoptions @> array['fetch_size=101'];
+ count 
+---
+ 0
+(1 row)
+
+SELECT count(*)
+FROM pg_foreign_server
+WHERE srvname = 'fetch101'
+AND srvoptions @> array['fetch_size=202'];
+ count 
+---
+ 1
+(1 row)
+
+CREATE FOREIGN TABLE table3 ( x int ) SERVER fetch101 OPTIONS ( fetch_size '3' );
+SELECT COUNT(*)
+FROM pg_foreign_table
+WHERE ftrelid = 'table3'::regclass
+AND ftoptions @> array['fetch_size=3'];
+ count 
+---
+ 1
+(1 row)
+
+ALTER FOREIGN TABLE table3 OPTIONS ( SET fetch_size '6');
+SELECT COUNT(*)
+FROM pg_foreign_table
+WHERE ftrelid = 'table3'::regclass
+AND ftoptions @> array['fetch_size=3'];
+ count 
+---
+ 0
+(1 row)
+
+SELECT COUNT(*)
+FROM pg_foreign_table
+WHERE ftrelid = 'table3'::regclass
+AND ftoptions @> array['fetch_size=6'];
+ count 
+---
+ 1
+(1 row)
+
+ROLLBACK;
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 86a5789..f89de2f 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -131,6 +131,17 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 			/* check list syntax, warn about uninstalled extensions */
 			(void) ExtractExtensionList(defGetString(def), true);
 		}
+		else if (strcmp(def->defname, "fetch_size") == 0)
+		{
+			int		fetch_size;
+
+			fetch_size = strtol(defGetString(def), NULL,10);
+			if (fetch_size <= 0)
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+		 errmsg("%s requires a non-negative integer value",
+def->defname)));
+		}
 	}
 
 	PG_RETURN_VOID();
@@ -162,6 +173,9 @@ InitPgFdwOptions(void)
 		/* updatable is available on both server and table */
 		{"updatable", ForeignServerRelationId, false},
 		{"updatable", ForeignTableRelationId, false},
+		/* fetch_size is available on both server and table */
+		{"fetch_size", ForeignServerRelationId, false},
+		{"fetch_size", ForeignTableRelationId, false},
 		{NULL, InvalidOid, false}
 	};
 
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 2ab85f6..8f72ff7 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -68,7 +68,9 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* Integer representing the desired fetch_size */
+	FdwScanPrivateFetchSize
 };
 
 /*
@@ -126,6 +128,8 @@ typedef struct PgFdwScanState
 	/* working memory contexts */
 	MemoryContext batch_cxt;	/* context holding current batch of tuples */
 	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+
+	int			fetch_size;		/* number of tuples per fetch */
 } PgFdwScanState;
 
 /*
@@ -380,6 +384,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	fpinfo->fdw_startup_cost = DEFAULT_FDW_STARTUP_COST;
 	fpinfo->fdw_tuple_cost = DEFAULT_FDW_TUPLE_COST;
 	fpinfo->shippable_extensions = NIL;
+	fpinfo->fetch_size = 100;
 
 	foreach(lc, fpinfo->server->options)
 	{
@@ -394,16 +399,17 @@ postgresGetForeignRelSize(PlannerInfo *root,
 		else if (strcmp(def->defname, "extensions") == 0)
 			fpinfo->shippable_extensions =
 ExtractExtensionList(defGetString(def), false);
+		else if (strcmp(def->defname, "fetch_size") == 0)
+			fpinfo->fetch_size = strtol(defGetString(def), NULL,10);
 	}
 	foreach(lc, fpinfo->table->options)
 	{
 		DefElem*def = (DefElem *) lfirst(lc);
 
 		if (strcmp(def->defname, "use_remote_estimate") == 0)
-		{
 			fpinfo->use_remote_estimate = defGetBoolean(def);
-			break;/* only need the one value */
-		}
+		else if (strcmp(def->defname, "fetch_size") == 0)
+			fpinfo->fetch_size = strtol(defGetString(def), NULL,10);
 	}
 
 	/*
@@ -1012,6 +1018,9 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 */
 	fdw_private = 

Re: [HACKERS] Minor code improvements to create_foreignscan_plan/ExecInitForeignScan

2016-02-02 Thread Etsuro Fujita

On 2016/02/03 3:31, Alvaro Herrera wrote:

Etsuro Fujita wrote:


Done.  Attached is an updated version of the patch.


Pushed, thanks.


Thank you!


I kinda wonder why this struct member has a name that doesn't match the
naming convention in the rest of the struct, and also why isn't it
documented in the comment above the struct definition.  But that's not
this patch's fault.


I think so, too.

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Incorrect formula for SysV IPC parameters

2016-02-02 Thread Kyotaro HORIGUCHI
Hello, I found that the formulas to calculate SEMMNI and SEMMNS
are incorrect in 9.2 and later.

http://www.postgresql.org/docs/9.5/static/kernel-resources.html

All of them say that the same thing as following,

| SEMMNI  Maximum number of semaphore identifiers (i.e., sets)
| 
|   at least ceil((max_connections + autovacuum_max_workers + 4) / 16)
| 
| SEMMNSMaximum number of semaphores system-wide
| 
|  ceil((max_connections + autovacuum_max_workers + 4) / 16) * 17
|plus room for other applications

But actually the number of semaphores PostgreSQL needs is
calculated as follows in 9.4 and later.

  numSemas = MaxConnections + NUM_AUXILIARY_PROCS(=4)
  MaxConnections = max_connections + autovacuum_max_workers + 1 +
   max_worker_processes

So, the formula for SEMMNI should be

ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)

and SEMMNS should have the same fix.
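
For example, with the default settings (max_connections = 100,
autovacuum_max_workers = 3, max_worker_processes = 8) this gives
100 + 3 + 8 + 5 = 116 semaphores, so SEMMNI should be at least
ceil(116 / 16) = 8 and SEMMNS at least 8 * 17 = 136, plus room for
other applications.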


In 9.3 and 9.2, the documentation says the same thing but
actually it is calculated as follows,

  numSemas = MaxConnections + NUM_AUXILIARY_PROCS(=4)
  MaxConnections = max_connections + autovacuum_max_workers + 1 +
   GetNumShmemAttachedBgworkers()

Omitting GetNumShmemAttachedBgworkers, the actual formula is

ceil((max_connections + autovacuum_max_workers + 5) / 16)


In 9.1, NUM_AUXILIARY_PROCS is 3 so the documentations is correct.


I attached two patches for 9.2-9.3 and 9.4-9.6dev
respectively. The patch command complains a bit when applying it on
9.2.

Platforms that don't have a TAS operation need an additional
1024 + 64 semaphores, but I understand that as out of scope for
the documentation.

One concern is that 'at least' and 'plus room for other
applications' are mixed in Table 17-1, especially for SEMMNI
and SEMMNS. It seems to me that they should use the same
wording, but it is not an actual problem.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

>From 6da5ad413dff4724fee75f1ba09013b6033f76ca Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 3 Feb 2016 11:35:43 +0900
Subject: [PATCH] Fix the formula to calculate SEMMNI and SEMMNS in
 documentation

---
 doc/src/sgml/runtime.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 0db3807..45579da 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -645,13 +645,13 @@ psql: could not connect to server: No such file or directory

 SEMMNI
 Maximum number of semaphore identifiers (i.e., sets)
-at least ceil((max_connections + autovacuum_max_workers + 4) / 16)
+at least ceil((max_connections + autovacuum_max_workers + 5) / 16)

 

 SEMMNS
 Maximum number of semaphores system-wide
-ceil((max_connections + autovacuum_max_workers + 4) / 16) * 17 plus room for other applications
+ceil((max_connections + autovacuum_max_workers + 5) / 16) * 17 plus room for other applications

 

-- 
1.8.3.1

>From aaf51a50e942edfa35af20a8a1d8b8719155664b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 3 Feb 2016 11:19:58 +0900
Subject: [PATCH] Fix the formula to calculate SEMMNI and SEMMNS in
 documentation

---
 doc/src/sgml/runtime.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 650d455..4b2e403 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -645,13 +645,13 @@ psql: could not connect to server: No such file or directory

 SEMMNI
 Maximum number of semaphore identifiers (i.e., sets)
-at least ceil((max_connections + autovacuum_max_workers + 4) / 16)
+at least ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16)

 

 SEMMNS
 Maximum number of semaphores system-wide
-ceil((max_connections + autovacuum_max_workers + 4) / 16) * 17 plus room for other applications
+ceil((max_connections + autovacuum_max_workers + max_worker_processes + 5) / 16) * 17 plus room for other applications

 

-- 
1.8.3.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raising the checkpoint_timeout limit

2016-02-02 Thread Tom Lane
Noah Misch  writes:
> On Tue, Feb 02, 2016 at 12:24:50PM +0100, Andres Freund wrote:
>> On 2016-02-01 23:16:16 -0500, Noah Misch wrote:
>>> In general, I favor having limits reflect fundamental system limitations
>>> rather than paternalism.  Therefore, I would allow INT_MAX (68 years).

>> I generally agree with that attitude - I'm disinclined to go just that
>> high though. Going close to INT_MAX means having to care about overflow
>> in trivial computations, in a scenario we're unlikely to ever
>> test. Sure, we can use a debugger to adjust time or accelerate time
>> progress, but that's all unrealistic if we're honest.  So maybe go with
>> a year?

> Okay.

I've gotta go with the "paternalism" side of the argument here.  Suppose
you configure your system to checkpoint once a year --- what is going to
happen when the year is up?  Or when you try to shut it down?  You *will*
regret such a setting.

I don't think we should allow the checkpoint distances to be so large that
checkpoints don't happen in the normal course of events.  I'd be okay with
the max being a day, perhaps.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-02-02 Thread Robert Haas
On Mon, Feb 1, 2016 at 7:24 PM, David Steele  wrote:
> On 2/1/16 5:25 PM, Alvaro Herrera wrote:
>> David Steele wrote:
>
>>> 2) There would be two different ways to suppress client messages but I was
>>> hoping to only have one.
>>
>> I think they are two different things actually.
>
> Fair enough - that was my initial reaction as well but then I thought
> the other way would be better.
>
>> I'm closing this as returned with feedback.
>
> I have attached a patch that adds an ereport() macro to suppress client
> output for a single report call (applies cleanly on 1d0c3b3).  I'll also
> move it to the next CF.

I don't see any reason not to accept this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2016-02-02 Thread Jim Nasby

On 2/2/16 4:52 PM, Alvaro Herrera wrote:

Robert Haas wrote:


The eventual committer is likely to be much happier with this patch if
you guys have achieved consensus among yourselves on the best
approach.

(Disclaimer: The eventual committer won't be me.  I'm not a Python
guy.  But we try to proceed by consensus rather than committer-dictat
around here, when we can.  Obviously the committer has the final say
at some level, but it's better if that power doesn't need to be
exercised too often.)


Actually I imagine that if there's no agreement between author and first
reviewer, there might not *be* a committer in the first place.  Perhaps
try to get someone else to think about it and make a decision.  It is
possible that some other committer is able to decide by themselves but I
wouldn't count on it.


+1.

FWIW, I'd think it's better to not break backwards compatibility, but 
I'm also far from a python expert. It might well be worth adding a 
plpython GUC to control the behavior so that there's a migration path 
forward, or maybe do something like the 'import __future__' that python 
is doing to ease migration to python 3.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Integer overflow in timestamp_part()

2016-02-02 Thread Vitaly Burovoy
On 2/2/16, Tom Lane  wrote:
> [ Please use a useful Subject: line in your posts. ]

I'm so sorry, it was the first time I had forgotten to look at the
"Subject" field before I pressed the "Send" button.

> Vitaly Burovoy  writes:
>> I've just found a little bug: extracting "epoch" from the last 30
>> years before Postgres' "+Infinity" leads to an integer overflow:
>
> Hmm.  I do not like the proposed patch much: it looks like it's
> throwing away precision too soon, although given that the result of
> SetEpochTimestamp can be cast to float exactly, maybe it doesn't matter.
>
> More importantly, I seriously doubt that this is the only issue
> for timestamps very close to the INT64_MAX boundary.  An example is
> that we're not rejecting values that would correspond to DT_NOBEGIN
> or DT_NOEND:
>
> regression=# set timezone = 'PST8PDT';
> SET
> regression=# select '294277-01-08 20:00:54.775806-08'::timestamptz;
>timestamptz
> -
>  294277-01-08 20:00:54.775806-08
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775807-08'::timestamptz;
>  timestamptz
> -
>  infinity
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775808-08'::timestamptz;
>  timestamptz
> -
>  -infinity
> (1 row)
>
> regression=# select '294277-01-08 20:00:54.775809-08'::timestamptz;
> ERROR:  timestamp out of range
>
> Worse yet, that last error is coming from timestamptz_out, not
> timestamptz_in; we're accepting a value we cannot store properly.
> The converted value has actually overflowed to be equal to
> INT64_MIN+1, and then timestamptz_out barfs because it's before
> Julian day 0.  Other operations would incorrectly interpret it
> as a date in the very far past.  timestamptz_in doesn't throw an
> error until several hours later than this; it looks like the
> problem is that tm2timestamp() worries about overflow in initially
> calculating the converted value, but not about overflow in the
> dt2local() rotation, and in any case it doesn't worry about not
> producing DT_NOEND.

It is clear why this happens, and I was already planning to insert checks
there, as discussed in the thread[1].

> I'm inclined to think that a good solution would be to create an
> artificial restriction to not accept years beyond, say, 10 AD.

Well... We can limit it to the boundaries described on the
documentation page[2]: [4713 BC, 294276 AD].
That lets us be sure we will not break timestamps that are stored (for any
reason) right at the documented limits, i.e. values meant as "effectively
infinity" but not written as 'infinity'::timestamptz.

The current boundaries for timestamp[tz] are [4714-11-24+00 BC,
294277-01-09 04:00:54.775806+00]. The month of slack at the lower boundary
and the nine days at the upper one should be enough to represent a value in
int64 before applying the time zone (+-15 hours) and to check it against
the boundaries without an overflow.
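
A sketch of what that would change at the edges (assuming the limit is drawn
at the documented year boundaries rather than at the raw int64 ones):

postgres=# select '294276-12-31 23:59:59.999999+00'::timestamptz;
-- still accepted
postgres=# select '294277-01-09 04:00:54.775806+00'::timestamptz;
-- today's true maximum, rejected under the proposal

and similarly '4714-11-24 00:00:00+00 BC', today's true minimum, would be
rejected in favour of values from 4713 BC onwards.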

> That would leave us with a lot of daylight to not have to worry
> about corner-case overflows in timestamp arithmetic.

Great! That was going to be my next question, because I had despaired of
finding a good corner case for an internal version of the "to_timestamp"
function (a WIP patch I have not published yet) that accepts
+-Infinity::float8 as input, to be symmetric with the current
"extract('epoch' ..." behavior[3].

The exact epoch value for the currently allowed maximum ("294277-01-09
04:00:54.775806+00") should be 9224318721654.775806, but it cannot be
represented as float8. My experiments show a "big" gap of roughly two
milliseconds between adjacent representable values (~0.002 s, not 0.01 s):

 src  |representation
--+
 9224318721654.774414 | 9224318721654.7734375
 9224318721654.774415 | 9224318721654.775390625
 9224318721654.776367 | 9224318721654.775390625
 9224318721654.776368 | 9224318721654.77734375
 9224318721654.778320 | 9224318721654.77734375
 9224318721654.778321 | 9224318721654.779296875

So if the upper limit could be set exactly at a year boundary, it would
save a lot of headaches.
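
The snapping is easy to see in psql (the exact display depends on
extra_float_digits; the point is only that ...775806 has no exact double
representation):

postgres=# set extra_float_digits = 3;
postgres=# select 9224318721654.775806::float8
                = 9224318721654.775390625::float8;
-- true: both snap to the same representable double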

> I'm not sure
> though where we'd need to enforce such a restriction; certainly in
> timestamp[tz]_in, but where else?
>
>   regards, tom lane

What should we do with dates, whose range is [4713 BC, 5874897 AD]? Limit
them to the timestamp boundaries, or leave them as is and forbid a
conversion when they do not fit into a timestamp?

There is also trouble with intervals. Currently the following round trip is
impossible (even ignoring the overflow on extraction):

postgres=# select extract (epoch from '294277-01-09
04:00:54.775806+00'::timestamptz);
date_part
--
 9224318721654.78
(1 row)

postgres=# select to_timestamp(9224318721654.775390625);
-- the argument is the exact representable value; ...654.78 itself is
-- rounded up past the limit.  The expected result would be:
  to_timestamp
-
 294277-01-09 04:00:54.77539+00
(1 row)

... but what you currently get is the error:

postgres=# select to_timestamp(9224318721654.775390625);
ERROR:  interval out of range
CONTEXT:  SQL function "to_timestamp" statement 
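
The CONTEXT line gives the reason away: at present to_timestamp(float8) is
an SQL-language function, roughly equivalent to

postgres=# select 'epoch'::timestamptz
                  + 9224318721654.775390625::float8 * interval '1 second';
ERROR:  interval out of range

so the intermediate interval overflows its int64 microseconds field before
the addition to 'epoch' ever happens.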
