[HACKERS] pgbench more operators & functions

2016-04-02 Thread Fabien COELHO


Hello,

Here is a simple patch which adds a bunch of operators (bitwise: & | ^ ~,
comparisons: =/== <>/!= < <= > >=, logical: and/&& or/|| xor/^^ not/!) and
functions (exp ln if) to pgbench.  I've tried to be compatible with pg's SQL
where appropriate.
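
A note on how the grammar below implements the unary operators: ~x is
rewritten to -1 ^ x, and a > b is handled by swapping it to b < a.  A quick
standalone check of the bitwise identity behind the first rewrite, written
in plain C rather than pgbench code:

#include <assert.h>

int
main(void)
{
	long	x = 5432;

	/* bitwise NOT is equivalent to XOR with an all-ones word */
	assert(~x == (-1L ^ x));
	return 0;
}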


Also attached is a simple test script.

Some kind of continuations in \ commands would be a good thing.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 06cd5db..def6761 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -821,9 +821,17 @@ pgbench  options  dbname
   The expression may contain integer constants such as 5432,
   double constants such as 3.14159,
   references to variables :variablename,
-  unary operators (+, -) and binary operators
-  (+, -, *, /,
-  %) with their usual precedence and associativity,
+  arithmetic operators (unary: +, -; binary:
+  +, -, *, /, %),
+  bitwise operators (unary: ~; binary: &,
+  |, #/^, <<,
+  >>),
+  comparisons (=/==, <>/!=,
+  <, <=, >, >=),
+  logical operators (unary: not/!;
+  binary: and/&&,
+  or/||, xor/^^),
+  with their usual precedence and associativity,
   function calls, and
   parentheses.
  
@@ -954,6 +962,20 @@ pgbench  options  dbname
5432.0
   
   
+   exp(x)
+   double
+   exponential
+   exp(1.0)
+   2.718281828459045
+  
+  
+   if(c,e1,e2)
+   same as e*
+   if c is not zero then e1 else e2
+   if(0,1.0,2.0)
+   2.0
+  
+  
int(x)
integer
cast to int
@@ -961,6 +983,13 @@ pgbench  options  dbname
9
   
   
+   ln(x)
+   double
+   natural logarithm
+   ln(2.718281828459045)
+   1.0
+  
+  
max(i [, ... ] )
integer
maximum value
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 64c29dc..f9047e1 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -52,8 +52,22 @@ static PgBenchExpr *make_func(yyscan_t yyscanner, int fnumber, PgBenchExprList *
 %type  VARIABLE FUNCTION
 
 %token INTEGER_CONST DOUBLE_CONST VARIABLE FUNCTION
+%token AND_OP OR_OP XOR_OP NE_OP LE_OP GE_OP LS_OP RS_OP
 
 /* Precedence: lowest to highest */
+
+/* logical */
+%left	XOR_OP
+%left	OR_OP
+%left	AND_OP
+%right  UNOT
+/* comparison */
+%left	'=' NE_OP
+%left	'<' '>' LE_OP GE_OP
+/* bitwise */
+%left	'^' '|' '&' LS_OP RS_OP
+%right	UINV
+/* arithmetic */
 %left	'+' '-'
 %left	'*' '/' '%'
 %right	UMINUS
@@ -71,11 +85,29 @@ expr: '(' expr ')'			{ $$ = $2; }
 	| '+' expr %prec UMINUS	{ $$ = $2; }
 	| '-' expr %prec UMINUS	{ $$ = make_op(yyscanner, "-",
 		   make_integer_constant(0), $2); }
+	| '~' expr %prec UINV	{ $$ = make_op(yyscanner, "^",
+		   make_integer_constant(-1), $2); }
+	| '!' expr %prec UNOT	{ $$ = make_op(yyscanner, "^^",
+		   make_integer_constant(1), $2); }
 	| expr '+' expr			{ $$ = make_op(yyscanner, "+", $1, $3); }
 	| expr '-' expr			{ $$ = make_op(yyscanner, "-", $1, $3); }
 	| expr '*' expr			{ $$ = make_op(yyscanner, "*", $1, $3); }
 	| expr '/' expr			{ $$ = make_op(yyscanner, "/", $1, $3); }
 	| expr '%' expr			{ $$ = make_op(yyscanner, "%", $1, $3); }
+	| expr '<' expr			{ $$ = make_op(yyscanner, "<", $1, $3); }
+	| expr LE_OP expr		{ $$ = make_op(yyscanner, "<=", $1, $3); }
+	| expr '>' expr			{ $$ = make_op(yyscanner, "<", $3, $1); }
+	| expr GE_OP expr		{ $$ = make_op(yyscanner, "<=", $3, $1); }
+	| expr '=' expr			{ $$ = make_op(yyscanner, "=", $1, $3); }
+	| expr NE_OP expr		{ $$ = make_op(yyscanner, "<>", $1, $3); }
+	| expr '&' expr			{ $$ = make_op(yyscanner, "&", $1, $3); }
+	| expr '|' expr			{ $$ = make_op(yyscanner, "|", $1, $3); }
+	| expr '^' expr			{ $$ = make_op(yyscanner, "^", $1, $3); }
+	| expr LS_OP expr		{ $$ = make_op(yyscanner, "<<", $1, $3); }
+	| expr RS_OP expr		{ $$ = make_op(yyscanner, ">>", $1, $3); }
+	| expr AND_OP expr		{ $$ = make_op(yyscanner, "&&", $1, $3); }
+	| expr OR_OP expr		{ $$ = make_op(yyscanner, "||", $1, $3); }
+	| expr XOR_OP expr		{ $$ = make_op(yyscanner, "^^", $1, $3); }
 	| INTEGER_CONST			{ $$ = make_integer_constant($1); }
 	| DOUBLE_CONST			{ $$ = make_double_constant($1); }
 	| VARIABLE { $$ = make_variable($1); }
@@ -177,6 +209,12 @@ static const struct
 		"sqrt", 1, PGBENCH_SQRT
 	},
 	{
+		"ln", 1, PGBENCH_LN
+	},
+	{
+		"exp", 1, PGBENCH_EXP
+	},
+	{
 		"int", 1, PGBENCH_INT
 	},
 	{
@@ -191,6 +229,45 @@ static const struct
 	{
 		"random_exponential", 3, PGBENCH_RANDOM_EXPONENTIAL
 	},
+	{
+		"&&", 2, PGBENCH_AND
+	},
+	{
+		"||", 2, PGBENCH_OR
+	},
+	{
+		"^^", 2, PGBENCH_XOR
+	},
+	{
+		"&", 2, PGBENCH_BITAND
+	},
+	{
+		"|", 2, PGBENCH_BITOR
+	},
+	{
+		"^", 2, PGBENCH_BITXOR
+	},
+	{
+		"<<", 2, PGBENCH_LSHIFT
+	},
+	{
+		">>", 2, PGBENCH_RSHIFT
+	},
+	{
+		"=", 2, PGBENCH_EQ
+	},
+	{
+		"<>", 2, PGBENCH_NE
+	},
+	{
+		"<=", 2, PGBENCH_LE
+	},
+	{
+		"<", 

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
Alex Shulgin  writes:
> On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane  wrote:
>> Well, we have to do *something* with the last (possibly only) value.
>> Neither "include always" nor "omit always" seem sane to me.  What other
>> decision rule do you want there?

> Well, what implies that the last value is somehow special?  I would think
> we should just do with it whatever we do with the rest of the candidate
> MCVs.

Sure, but both of the proposed decision rules break down when there are no
values after the one under consideration.  We need to do something sane
there.

> For "the only value" case: we cannot build a histogram out of a single
> value, so omitting it from MCVs is not a good strategy, ISTM.
> From my point of view that amounts to "include always".

If there is only one value, it will have 100% of the samples, so it would
get included under just about any decision rule (other than "more than
100% of this value plus following values").  I don't think making sure
this case works is sufficient to get us to a reasonable rule --- it's
a necessary case, but not a sufficient case.

regards, tom lane




Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 7:18 AM, Tom Lane  wrote:

> Alex Shulgin  writes:
> > On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin 
> wrote:
> >> I'm not sure yet about the 1% rule for the last value, but would also
> love
> >> to see if we can avoid the arbitrary limit here.  What happens with a
> last
> >> value which is less than 1% popular in the current code anyway?
>
> > Now that I think about it, I don't really believe this arbitrary
> heuristic
> > is any good either, sorry.
>
> Yeah, it was just a placeholder to produce a working patch.
>
> Maybe we could base this cutoff on the stats target for the column?
> That is, "1%" would be the right number if stats target is 100,
> otherwise scale appropriately.
>
> > What was your motivation to introduce some limit at the bottom anyway?
>
> Well, we have to do *something* with the last (possibly only) value.
> Neither "include always" nor "omit always" seem sane to me.  What other
> decision rule do you want there?
>

Well, what implies that the last value is somehow special?  I would think
we should just do with it whatever we do with the rest of the candidate
MCVs.

For "the only value" case: we cannot build a histogram out of a single
value, so omitting it from MCVs is not a good strategy, ISTM.

From my point of view that amounts to "include always".  What problems do
you see with this approach exactly?

--
Alex


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
Alex Shulgin  writes:
> On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin  wrote:
>> I'm not sure yet about the 1% rule for the last value, but would also love
>> to see if we can avoid the arbitrary limit here.  What happens with a last
>> value which is less than 1% popular in the current code anyway?

> Now that I think about it, I don't really believe this arbitrary heuristic
> is any good either, sorry.

Yeah, it was just a placeholder to produce a working patch.

Maybe we could base this cutoff on the stats target for the column?
That is, "1%" would be the right number if stats target is 100,
otherwise scale appropriately.

> What was your motivation to introduce some limit at the bottom anyway?

Well, we have to do *something* with the last (possibly only) value.
Neither "include always" nor "omit always" seem sane to me.  What other
decision rule do you want there?

regards, tom lane




Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-04-02 Thread Tom Lane
Alex Shulgin  writes:
> What about regression tests?  My assumption was that we won't be able to
> add them with the usual expected file approach, but that we also don't need
> it that hard.  Everyone's in favor?

It'd be nice to have a regression test, but I concur that the LOCATION
information is too unstable for that to be practical.

We could imagine testing some variant behavior that omits printing
LOCATION, but that doesn't really seem worth the trouble; I think
we'd be inventing the variant behavior only for testing purposes.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Refer to a TOKEN_USER payload as a "token user, " not as a "user

2016-04-02 Thread Tom Lane
Stephen Frost  writes:
> * Noah Misch (n...@leadboat.com) wrote:
>> The wording in GetTokenUser() and AddUserToTokenDacl() seems fine; let's
>> standardize on that.  Also, every GetTokenUser() failure has been yielding
>> two messages, the second contributing no further detail.  I'll reduce that
>> to the usual one message per failure.

> This approach works for me.

OK by me, too.

regards, tom lane




Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sun, Apr 3, 2016 at 3:43 AM, Alex Shulgin  wrote:

>
> I'm not sure yet about the 1% rule for the last value, but would also love
> to see if we can avoid the arbitrary limit here.  What happens with a last
> value which is less than 1% popular in the current code anyway?
>

Tom,

Now that I think about it, I don't really believe this arbitrary heuristic
is any good either, sorry.  What if you have a value that is just a bit
under 1% popular, but is being used in 50% of your queries in WHERE clause
with equality comparison?  Without this value in the MCV list the planner
will likely use SeqScan instead of an IndexScan that might be more
appropriate here.  I think we are much better off if we don't touch this
aspect of the current code.

What was your motivation to introduce some limit at the bottom anyway?  If
that was to prevent accidental division by zero, then an explicit check on
denominator not being 0 seems to me like a better safeguard than this.

Regards.
--
Alex


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-04-02 Thread Alex Shulgin
On Sat, Apr 2, 2016 at 11:41 PM, Tom Lane  wrote:

> "Shulgin, Oleksandr"  writes:
> > On Mon, Mar 14, 2016 at 7:55 PM, Tom Lane  wrote:
> >> Yeah, I don't much like that either.  But I don't think we can avoid
> >> some refactoring there; as designed, conversion of an error message into
> >> user-visible form is too tightly tied to receipt of the message.
>
> > True.  Attached is a v2 which addresses all of the points raised earlier
> I
> > believe.
>
> I took a closer look at what's going on here and realized that actually
> it's not that hard to decouple the message-building routine from the
> PGconn state, because mostly it works with fields it's extracting out
> of the PGresult anyway.  The only piece of information that's lacking
> is conn->last_query.  I propose therefore that instead of doing it like
> this, we copy last_query into error PGresults.  This is strictly less
> added storage requirement than storing the whole verbose message would be,
> and it saves time compared to the v2 patch in the typical case where
> the application never does ask for an alternately-formatted error message.
> Plus we can actually support requests for any variant format, not only
> VERBOSE.
>
> Attached is a libpq-portion-only version of a patch doing it this way.
> I've not yet looked at the psql part of your patch.
>
> Comments?
>

Ah, neat, that's even better. :-)

What about regression tests?  My assumption was that we won't be able to
add them with the usual expected file approach, but that we also don't need
it that hard.  Everyone's in favor?

--
Alex


Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2016-04-02 Thread Dilip Kumar
On Fri, Apr 1, 2016 at 2:09 PM, Andres Freund  wrote:

> One interesting thing to do would be to use -P1 during the test and see
> how much the performance varies over time.
>

I have run with the -P option: I ran for 1200 seconds with -P set to 30
seconds, and what I observed is that when throughput is low it stays low
throughout the run, and when it is high, it is high for the complete run.

Non-default parameters:

shared_buffers 8GB
max_connections 150

./pgbench -c $threads -j $threads -T 1200 -M prepared -S -P 30 postgres

Test results are attached in result.tar file.

File details:
--
   1. head_run1.txt           --> 20 mins run on head, reading 1
   2. head_run2.txt           --> 20 mins run on head, reading 2
   3. head_patch_run1.txt     --> 20 mins run on head + patch, reading 1
   4. head_patch_run2.txt     --> 20 mins run on head + patch, reading 2
   5. head_pinunpin.txt       --> 20 mins run on head + pinunpin-cas-8.patch
   6. head_pinunpin_patch.txt --> 20 mins run on head + pinunpin-cas-8.patch + patch, reading

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


result.tar
Description: Unix tar archive



Re: [HACKERS] PostgreSQL 9.6 Feature Freeze - April 8, 2016

2016-04-02 Thread Noah Misch
On Tue, Mar 15, 2016 at 11:35:31AM -0400, Robert Haas wrote:
> At the recent PostgreSQL developer meeting in Brussels, a consensus
> was reached that an early beta, leading to an on-time release, would
> be very desirable.  In particular, it was suggested that we should
> attempt to release PostgreSQL 9.6beta1 in May.  The release management
> team has determined that this will require a feature freeze in early
> April.  Accordingly, the release management has decided that all
> feature patches destined for PostgreSQL 9.6 must be committed no later
> than April 8, 2016.  Any patch not committed prior to 2016-04-09
> 00:00:00 GMT may not be committed to PostgreSQL 9.6 unless (a) it is a
> bug fix, (b) it represents essential cleanup of a previously-committed
> patch, or (c) the release management team has approved an extension to
> the deadline for that particular patch.
> 
> As of the time when feature freeze goes into effect, all patches
> remaining in the CommitFest will be either moved to the next
> CommitFest, if they are as of that time in a state of Needs Review or
> Ready for Committer; or marked as Returned with Feedback, if they are
> in a state of Waiting on Author.  The release management team
> encourages patch authors and reviewers, the CommitFest manager (David
> Steele), and all other community members to keep the status of each
> patch in the CommitFest accurate throughout the CommitFest, and
> particularly as the feature freeze deadline approaches.
> 
> The release management team anticipates approving exceptions to the
> feature freeze deadline only if the following criteria are met: (1)
> the extension is requested by a committer who actually intends to
> commit the patch before the extension lapses; (2) the release
> management team believes that the patch has sufficient consensus to
> proceed with it; (3) the release management team believes that
> accepting the patch will not destabilize the tree or otherwise
> compromise the PostgreSQL community's ability to get a beta out the
> door on schedule; and (4) the proposed extension does not extend
> beyond April 15, 2016.

Reminder: the PostgreSQL 9.6 feature freeze takes effect in under six days, at
2016-04-09 00:00:00 GMT.




Re: [HACKERS] Re: [COMMITTERS] pgsql: Refer to a TOKEN_USER payload as a "token user, " not as a "user

2016-04-02 Thread Stephen Frost
* Noah Misch (n...@leadboat.com) wrote:
> On Fri, Apr 01, 2016 at 11:07:01PM -0400, Tom Lane wrote:
> > Stephen Frost  writes:
> > > * Noah Misch (n...@leadboat.com) wrote:
> > >> I see some advantages of writing "TokenUser", as you propose.  However,
> > >> our error style guide says "Avoid mentioning called function names,
> > >> either; instead say what the code was trying to do."  Mentioning an
> > >> enumerator name is morally similar to mentioning a function name.
> > 
> > > That's a fair point, but it doesn't mean we should use a different
> > > spelling for the enumerator name to avoid that piece of the policy.  I
> > > certainly don't see "token user" as saying "what the code was trying to
> > > do" in this case.
> > 
> > FWIW, "token user" conveys entirely inappropriate, politically incorrect
> > connotations to me ;-).  I don't have any great suggestions on what to use
> > instead, but I share Stephen's unhappiness with the wording as-committed.
> 
> The wording in GetTokenUser() and AddUserToTokenDacl() seems fine; let's
> standardize on that.  Also, every GetTokenUser() failure has been yielding two
> messages, the second contributing no further detail.  I'll reduce that to the
> usual one message per failure.

This approach works for me.

Thanks!

Stephen




Re: [HACKERS] Re: [COMMITTERS] pgsql: Refer to a TOKEN_USER payload as a "token user, " not as a "user

2016-04-02 Thread Noah Misch
On Fri, Apr 01, 2016 at 11:07:01PM -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > * Noah Misch (n...@leadboat.com) wrote:
> >> I see some advantages of writing "TokenUser", as you propose.  However, our
> >> error style guide says "Avoid mentioning called function names, either;
> >> instead say what the code was trying to do."  Mentioning an enumerator
> >> name is morally similar to mentioning a function name.
> 
> > That's a fair point, but it doesn't mean we should use a different
> > spelling for the enumerator name to avoid that piece of the policy.  I
> certainly don't see "token user" as saying "what the code was trying to
> > do" in this case.
> 
> FWIW, "token user" conveys entirely inappropriate, politically incorrect
> connotations to me ;-).  I don't have any great suggestions on what to use
> instead, but I share Stephen's unhappiness with the wording as-committed.

The wording in GetTokenUser() and AddUserToTokenDacl() seems fine; let's
standardize on that.  Also, every GetTokenUser() failure has been yielding two
messages, the second contributing no further detail.  I'll reduce that to the
usual one message per failure.

nm
diff --git a/src/common/exec.c b/src/common/exec.c
index ec8c655..d736b02 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -674,10 +674,7 @@ AddUserToTokenDacl(HANDLE hToken)
 
/* Get the current user SID */
if (!GetTokenUser(hToken, ))
-   {
-   log_error("could not get token user: error code %lu", 
GetLastError());
-   goto cleanup;
-   }
+   goto cleanup;   /* callee printed a message */
 
/* Figure out the size of the new ACL */
dwNewAclSize = asi.AclBytesInUse + sizeof(ACCESS_ALLOWED_ACE) +
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2751183..f21056e 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -1242,8 +1242,8 @@ pg_SSPI_recvauth(Port *port)
 
if (!GetTokenInformation(token, TokenUser, NULL, 0, ) && 
GetLastError() != 122)
ereport(ERROR,
-   (errmsg_internal("could not get token user size: error 
code %lu",
-GetLastError(;
+   (errmsg_internal("could not get token 
information buffer size: error code %lu",
+
GetLastError(;
 
tokenuser = malloc(retlen);
if (tokenuser == NULL)
@@ -1252,8 +1252,8 @@ pg_SSPI_recvauth(Port *port)
 
if (!GetTokenInformation(token, TokenUser, tokenuser, retlen, ))
ereport(ERROR,
-   (errmsg_internal("could not get token user: 
error code %lu",
-
GetLastError(;
+ (errmsg_internal("could not get token information: error code 
%lu",
+  GetLastError(;
 
CloseHandle(token);
 
diff --git a/src/port/win32security.c b/src/port/win32security.c
index d3eba9a..ab9cd67 100644
--- a/src/port/win32security.c
+++ b/src/port/win32security.c
@@ -248,14 +248,14 @@ pgwin32_get_dynamic_tokeninfo(HANDLE token, 
TOKEN_INFORMATION_CLASS class,
if (GetTokenInformation(token, class, NULL, 0, ))
{
snprintf(errbuf, errsize,
-"could not get token information: got zero 
size\n");
+"could not get token information buffer size: got zero 
size\n");
return FALSE;
}
 
if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
{
snprintf(errbuf, errsize,
-"could not get token information: error code 
%lu\n",
+"could not get token information buffer size: error 
code %lu\n",
 GetLastError());
return FALSE;
}
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 416829d..dcafcf1 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -904,7 +904,7 @@ current_windows_user(const char **acct, const char **dom)
if (!GetTokenInformation(token, TokenUser, NULL, 0, ) && 
GetLastError() != 122)
{
fprintf(stderr,
-   _("%s: could not get token user size: error 
code %lu\n"),
+   _("%s: could not get token information buffer 
size: error code %lu\n"),
progname, GetLastError());
exit(2);
}
@@ -912,7 +912,7 @@ current_windows_user(const char **acct, const char **dom)
if (!GetTokenInformation(token, TokenUser, tokenuser, retlen, ))
{
fprintf(stderr,
-   _("%s: could not get token user: error code 
%lu\n"),

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Alex Shulgin
On Sat, Apr 2, 2016 at 8:57 PM, Shulgin, Oleksandr <oleksandr.shul...@zalando.de> wrote:
> On Apr 2, 2016 18:38, "Tom Lane"  wrote:
>
>> I did not like the fact that the compute_scalar_stats logic
>> would allow absolutely anything into the MCV list once num_hist falls
>> below 2. I think it's important that we continue to reject values that
>> are only seen once in the sample, because there's no very good reason to
>> think that they are MCVs and not just infrequent values that by luck
>> appeared in the sample.
>
> In my understanding we only put a value in the track list if we've seen it
> at least twice, no?

This is actually the case for compute_scalar_stats, but not for
compute_distinct_stats.  In the latter case we can still have
track[i].count == 1, but we can also break out of the loop if we see the
first tracked item like that.

>> Before I noticed the regression failure, I'd been thinking that maybe it'd
>> be better if the decision rule were not "at least 100+x% of the average
>> frequency of this value and later ones", but "at least 100+x% of the
>> average frequency of values after this one".
>
> Hm, sounds pretty similar to what I wanted to achieve, but better
> formalized.
>
>> With that formulation, we're
>> not constrained as to the range of x.  Now, if there are *no* values after
>> this one, then this way needs an explicit special case in order not to
>> compute 0/0; but the preceding para shows that we need a special case for
>> the last value anyway.
>>
>> So, attached is a patch rewritten along these lines.  I used 50% rather
>> than 25% as the new cutoff percentage --- obviously it should be higher
>> in this formulation than before, but I have no idea if that particular
>> number is good or we should use something else.  Also, the rule for the
>> last value is "at least 1% of the non-null samples".  That's a pure guess
>> as well.
>>
>> I do not have any good corpuses of data to try this on.  Can folks who
>> have been following this thread try it on their data and see how it
>> does?  Also please try some other multipliers besides 1.5, so we can
>> get a feeling for where that cutoff should be placed.
>
> Expect me to run it on my pet db early next week. :-)

I was trying to come up with some examples where 50% could be a good or a
bad choice, and then I noticed that we might be able to turn it the other
way round: instead of inventing an arbitrary limit on the MCV's frequency we
could use the histogram as the criterion for a candidate MCV to be
considered "common enough".  If we can prove that the value would produce
duplicates in the histogram, we should rather put it in the MCV list
(unless it's already fully occupied, in which case we can't do anything).

A value is guaranteed to produce a duplicate if it has appeared at least
2*hfreq+1 times in the sample (hfreq from your version of the patch, which
is recalculated on every loop iteration).  I could produce an updated patch
on Monday or anyone else following this discussion should be able to do
that.
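
To make the two rules being discussed concrete, here is a minimal C sketch
(illustrative only; not the actual analyze.c code, and all names and
signatures are invented):

#include <stdbool.h>

/*
 * Rule from the patch under discussion: keep candidate MCV i if its count
 * is at least 1.5x the average count of the candidates after it; the last
 * (possibly only) candidate instead needs at least 1% of the non-null rows.
 */
static bool
keep_by_multiplier(const int *counts, int ntracked, int i, int nonnull_rows)
{
	if (i == ntracked - 1)
		return counts[i] >= 0.01 * nonnull_rows;
	else
	{
		int		rest = 0;

		for (int j = i + 1; j < ntracked; j++)
			rest += counts[j];
		return counts[i] >= 1.5 * ((double) rest / (ntracked - i - 1));
	}
}

/*
 * Alternative suggested above: keep the candidate if it is guaranteed to
 * appear twice among the histogram bin boundaries, i.e. it was seen at
 * least 2*hfreq+1 times (hfreq = sample rows per histogram bin).
 */
static bool
keep_by_histogram_duplicate(int value_count, int hfreq)
{
	return value_count >= 2 * hfreq + 1;
}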

This approach would be a huge win in my opinion, because this way we can
avoid all the arbitrariness of that .25 / .50 multiplier.  Otherwise there
might be (valid) complaints that for some data .40 (or .60) is a better
fit, but we have already hard-coded something and there would be no easy
way to improve the situation for some users while avoiding breaking it for
the rest (unless we introduce a per-attribute configurable parameter like
statistics_target for this multiplier, which I'd like to avoid even
thinking about ;-)).

While we don't (well, can't) build a histogram in the
compute_distinct_stats variant, we could also apply the above mind trick
there for the same reason and to make the output of both functions more
consistent (and to have less maintenance burden between the variants).  And
anyway it would be rather surprising to see that depending on the presence
of an order operator for the type, the resulting MCV lists after the
ANALYZE would be different (I mean not only due to the random nature of the
sample).

I'm not sure yet about the 1% rule for the last value, but would also love
to see if we can avoid the arbitrary limit here.  What happens with a last
value which is less than 1% popular in the current code anyway?

Cheers!
--
Alex


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-04-02 Thread Peter Geoghegan
On Thu, Feb 11, 2016 at 9:50 AM, Robert Haas  wrote:
>> Actually, what'd be really handy IMO is something to regurgitate the
>> most recent error in verbose mode, without making a permanent session
>> state change.  Something like
>>
>> regression=# insert into bar values(1);
>> ERROR:  insert or update on table "bar" violates foreign key constraint 
>> "bar_f1_fkey"
>> DETAIL:  Key (f1)=(1) is not present in table "foo".
>> regression=# \saywhat
>> ERROR:  23503: insert or update on table "bar" violates foreign key 
>> constraint "bar_f1_fkey"
>> DETAIL:  Key (f1)=(1) is not present in table "foo".
>> SCHEMA NAME:  public
>> TABLE NAME:  bar
>> CONSTRAINT NAME:  bar_f1_fkey
>> LOCATION:  ri_ReportViolation, ri_triggers.c:3326
>> regression=#
>
> Wow, that's a fabulous idea.  I see Oleksandr has tried to implement
> it, although I haven't looked at the patch.  But I think this would be
> REALLY helpful.

+1


-- 
Peter Geoghegan




Re: [HACKERS] Using quicksort for every external sort run

2016-04-02 Thread Peter Geoghegan
On Sat, Apr 2, 2016 at 3:22 PM, Peter Geoghegan  wrote:
> On Sat, Apr 2, 2016 at 3:20 PM, Greg Stark  wrote:
>> There are also some weird cases in this list where there's a
>> significant regression at 32MB but not at 8MB. I would like to see
>> 16MB and perhaps 12MB and 24MB. They would help understand if these
>> are just quirks or there's a consistent pattern.
>
> I'll need to drill down to trace_sort output to see what happened there.

I looked into this.

I too noticed that queries like "SELECT a FROM int_test UNION SELECT a
FROM int_test_padding" looked strangely faster for 128MB +
high_cardinality_almost_asc + i5 for master branch. This made the
patch look relatively bad for the test with those exact properties
only; the patch was faster with both lower and higher work_mem
settings than 128MB. There was a weird spike in performance for the
master branch only.

Having drilled down to trace_sort output, I think I know roughly why.
I see output like this:

1459308434.753 2016-03-30 05:27:14 CEST STATEMENT:  SELECT * FROM
(SELECT a FROM int_test UNION SELECT a FROM int_test_padding OFFSET
1e10) ff;

I think that this is invalid, because the query was intended as this:

SELECT * FROM (SELECT * FROM (SELECT a FROM int_test UNION SELECT a
FROM int_test_padding) gg OFFSET 1e10) ff;

This would have controlled for client overhead, per my request to
Tomas, without altering the "underlying query" that you see in the
final spreadsheet. I don't have an exact explanation for why you'd see
this spike at 128MB for the master branch but not the other at the
moment, but it seems like that one test is basically invalid, and
should be discarded. I suspect that the patch didn't see its own
similar spike due to my changes to cost_sort(), which reflected that
sorts don't need to do so much expensive random I/O.

This is the only case that I saw that was not more or less consistent
with my expectations, which is good.

-- 
Peter Geoghegan




Re: [HACKERS] Using quicksort for every external sort run

2016-04-02 Thread Peter Geoghegan
On Sat, Apr 2, 2016 at 7:31 AM, Tomas Vondra
 wrote:
> So, I do have the results from both machines - I've attached the basic
> comparison spreadsheets, the complete summary is available here:
>
>https://github.com/tvondra/sort-benchmark
>
> The database log also includes the logs for trace_sort=on for each query
> (use the timestamp logged for each query in the spreadsheet to locate the
> right section of the log).

Thanks!

Each row in these spreadsheets shows what looks like a multimodal
distribution for the patch (if you focus on the actual run times, not
the ratios). IOW, you can clearly see the regressions are only where
master has its best case, and the patch its worst case; as the
work_mem increases for each benchmark case for the patch, by far the
largest improvement is usually seen as we cross the CPU cache
threshold. Master gets noticeably slower as work_mem goes from 8MB to
32MB, but the patch gets far far faster. Things continue to improve
for patched in absolute terms and especially relative to master
following further increases in work_mem, but not nearly as
dramatically as that first increment (unless we have lots of padding,
which makes the memtuples heap itself much smaller, so it happens one
step later). Master shows a slow decline at and past 32MB of work_mem.
If the test hardware had a larger L3 cache, we might expect to notice
a second big drop, but this hardware doesn't have the enormous L3
cache sizes of new Xeon processors (e.g. 32MB, 45MB).

> While it might look like I'm somehow opposed to this patch series, that's
> mostly because we tend to look only at the few cases that behave poorly.
>
> So let me be clear: I do think the patch seems to be a significant
> performance improvement for most of the queries, and I'm OK with accepting a
> few regressions (particularly if we agree those are pathological cases,
> unlikely to happen in real-world workloads).
>
> It's quite rare that a patch is a universal win without regressions, so it's
> important to consider how likely those regressions are and what's the net
> effect of the patch - and the patch seems to be a significant improvement in
> most cases (and regressions limited to pathological or rare corner cases).
>
> I don't think those are reasons not to push this into 9.6.

I didn't think that you opposed the patch. In fact, you did the right
thing by focussing on the low-end regressions, as I've said. I was
probably too concerned about Robert failing to consider that they were
not representative, particularly with regard to how small the
memtuples heap could be relative to the CPU cache; blame it on how
close I've become to this problem. I'm pretty confident that Robert
can be convinced that these do not matter enough to not commit the
patch. In any case, I'm pretty confident that I cannot fix any
remaining regressions.

> I haven't done any thorough investigation of the results yet, but in general
> it seems the results from both machines are quite similar - the numbers are
> different, but the speedup/slowdown patterns are mostly the same (with some
> exceptions that I'd guess are due to HW differences).

I agree. What we clearly see is the advantages of quicksort being
cache oblivious, especially relative to master's use of a heap. That
advantage becomes pronounced at slightly different points in each
case, but the overall picture is the same. This pattern demonstrates
why a cache oblivious algorithm is so useful in general -- we don't
have to care about tuning for that. As important as this is for serial
sorts, it's even more important for parallel sorts, where parallel
workers compete for memory bandwidth, and where it's practically
impossible to build a cost model for CPU cache size + memory use +
nworkers.

> The slowdown/speedup patterns (red/green cells in the spreadheets) are also
> similar to those collected originally. Some timings are much lower,
> presumably thanks to using the "OFFSET 1e10" pattern, but the patterns are
> the same.

I think it's notable that this made things more predictable, and made
the benefits clearer.

> The one thing that surprised me a bit is that
>
> replacement_sort_mem=64
>
> actually often made the results considerably worse in many cases. A common
> pattern is that the slowdown "spreads" to nearby cells - the are many
> queries where the 8MB case is 1:1 with master and 32MB is 1.5:1 (i.e. takes
> 1.5x more time), and setting replacement_sort_mem=64 just slows down the 8MB
> case.
>
> In general, replacement_sort_mem=64 seems to only affect the 8MB case, and
> in most cases it results in 100% slowdown (so 2x as long queries).

To be clear, for the benefit of other people: replacement_sort_mem=64
makes the patch never use a replacement selection heap, even at the
lowest tested work_mem setting of 8MB.

This is exactly what I expected. When replacement_sort_mem is the
proposed default of 16MB, it literally has zero impact on how the
patch behaves 

Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-04-02 Thread Tom Lane
I wrote:
> Attached is a libpq-portion-only version of a patch doing it this way.
> I've not yet looked at the psql part of your patch.

Here's an update for the psql side.

regards, tom lane

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 385cb59..330127a 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** Tue Oct 26 21:40:57 CEST 1999
*** 1718,1723 
--- 1718,1737 
  
  

+ \errverbose
+ 
+ 
+ 
+ Repeats the most recent server error or notice message at maximum
+ verbosity, as though VERBOSITY were set
+ to verbose and SHOW_CONTEXT were
+ set to always.
+ 
+ 
+   
+ 
+ 
+   
  \f [ string ]
  
  
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 50dc43b..3401b51 100644
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 822,827 
--- 822,849 
  		}
  	}
  
+ 	/* \errverbose -- display verbose message from last failed query */
+ 	else if (strcmp(cmd, "errverbose") == 0)
+ 	{
+ 		if (pset.last_error_result)
+ 		{
+ 			char	   *msg;
+ 
+ 			msg = PQresultVerboseErrorMessage(pset.last_error_result,
+ 			  PQERRORS_VERBOSE,
+ 			  PQSHOW_CONTEXT_ALWAYS);
+ 			if (msg)
+ 			{
+ psql_error("%s", msg);
+ PQfreemem(msg);
+ 			}
+ 			else
+ puts(_("out of memory"));
+ 		}
+ 		else
+ 			puts(_("There was no previous error."));
+ 	}
+ 
  	/* \f -- change field separator */
  	else if (strcmp(cmd, "f") == 0)
  	{
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 892058e..a2a07fb 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** AcceptResult(const PGresult *result)
*** 497,502 
--- 497,529 
  }
  
  
+ /*
+  * ClearOrSaveResult
+  *
+  * If the result represents an error, remember it for possible display by
+  * \errverbose.  Otherwise, just PQclear() it.
+  */
+ static void
+ ClearOrSaveResult(PGresult *result)
+ {
+ 	if (result)
+ 	{
+ 		switch (PQresultStatus(result))
+ 		{
+ 			case PGRES_NONFATAL_ERROR:
+ 			case PGRES_FATAL_ERROR:
+ if (pset.last_error_result)
+ 	PQclear(pset.last_error_result);
+ pset.last_error_result = result;
+ break;
+ 
+ 			default:
+ PQclear(result);
+ break;
+ 		}
+ 	}
+ }
+ 
  
  /*
   * PSQLexec
*** PSQLexec(const char *query)
*** 548,554 
  
  	if (!AcceptResult(res))
  	{
! 		PQclear(res);
  		res = NULL;
  	}
  
--- 575,581 
  
  	if (!AcceptResult(res))
  	{
! 		ClearOrSaveResult(res);
  		res = NULL;
  	}
  
*** PSQLexecWatch(const char *query, const p
*** 590,596 
  
  	if (!AcceptResult(res))
  	{
! 		PQclear(res);
  		return 0;
  	}
  
--- 617,623 
  
  	if (!AcceptResult(res))
  	{
! 		ClearOrSaveResult(res);
  		return 0;
  	}
  
*** SendQuery(const char *query)
*** 1077,1087 
  		if (PQresultStatus(results) != PGRES_COMMAND_OK)
  		{
  			psql_error("%s", PQerrorMessage(pset.db));
! 			PQclear(results);
  			ResetCancelConn();
  			goto sendquery_cleanup;
  		}
! 		PQclear(results);
  		transaction_status = PQtransactionStatus(pset.db);
  	}
  
--- 1104,1114 
  		if (PQresultStatus(results) != PGRES_COMMAND_OK)
  		{
  			psql_error("%s", PQerrorMessage(pset.db));
! 			ClearOrSaveResult(results);
  			ResetCancelConn();
  			goto sendquery_cleanup;
  		}
! 		ClearOrSaveResult(results);
  		transaction_status = PQtransactionStatus(pset.db);
  	}
  
*** SendQuery(const char *query)
*** 1102,1112 
  			if (PQresultStatus(results) != PGRES_COMMAND_OK)
  			{
  psql_error("%s", PQerrorMessage(pset.db));
! PQclear(results);
  ResetCancelConn();
  goto sendquery_cleanup;
  			}
! 			PQclear(results);
  			on_error_rollback_savepoint = true;
  		}
  	}
--- 1129,1139 
  			if (PQresultStatus(results) != PGRES_COMMAND_OK)
  			{
  psql_error("%s", PQerrorMessage(pset.db));
! ClearOrSaveResult(results);
  ResetCancelConn();
  goto sendquery_cleanup;
  			}
! 			ClearOrSaveResult(results);
  			on_error_rollback_savepoint = true;
  		}
  	}
*** SendQuery(const char *query)
*** 1202,1208 
  			if (PQresultStatus(svptres) != PGRES_COMMAND_OK)
  			{
  psql_error("%s", PQerrorMessage(pset.db));
! PQclear(svptres);
  OK = false;
  
  PQclear(results);
--- 1229,1235 
  			if (PQresultStatus(svptres) != PGRES_COMMAND_OK)
  			{
  psql_error("%s", PQerrorMessage(pset.db));
! ClearOrSaveResult(svptres);
  OK = false;
  
  PQclear(results);
*** SendQuery(const char *query)
*** 1213,1219 
  		}
  	}
  
! 	PQclear(results);
  
  	/* Possible microtiming output */
  	if (pset.timing)
--- 1240,1246 
  		}
  	}
  
! 	ClearOrSaveResult(results);
  
  	/* Possible microtiming output */
  	if (pset.timing)
*** 

Re: [HACKERS] Using quicksort for every external sort run

2016-04-02 Thread Peter Geoghegan
On Sat, Apr 2, 2016 at 3:20 PM, Greg Stark  wrote:
> These are the averages across all queries across all data sets for the
> run-time for the patch versus master (not patched 64 which I think is
> the replacement_sort_mem=64MB which appears to not be a win). So even
> in the less successful cases on average quicksort is faster than
> replacement selection.

It's actually replacement_sort_mem=64 (64KB -- effectively disabled).
So where that case does better or worse, which can only be when
work_mem=8MB in practice, that's respectively good or bad for
replacement selection. So, typically RS does better when there are
presorted inputs with a positive (not inverse/DESC) correlation, and
there is little work_mem. As I've said, this is where the CPU cache is
large enough to fit the entire memtuples heap.

"Padded" cases are mostly bad because they make the memtuples heap
relatively small in each case. So with work_mem=32MB, you get a
memtuples heap structure similar to work_mem=8MB. The padding pushes
things out a bit further, which favors master.

-- 
Peter Geoghegan




Re: [HACKERS] Using quicksort for every external sort run

2016-04-02 Thread Peter Geoghegan
On Sat, Apr 2, 2016 at 3:20 PM, Greg Stark  wrote:
> There are also some weird cases in this list where there's a
> significant regression at 32MB but not at 8MB. I would like to see
> 16MB and perhaps 12MB and 24MB. They would help understand if these
> are just quirks or there's a consistent pattern.

I'll need to drill down to trace_sort output to see what happened there.

-- 
Peter Geoghegan




Re: [HACKERS] Using quicksort for every external sort run

2016-04-02 Thread Greg Stark
On Sat, Apr 2, 2016 at 3:31 PM, Tomas Vondra
 wrote:

> So let me be clear: I do think the patch seems to be a significant
> performance improvement for most of the queries, and I'm OK with accepting a
> few regressions (particularly if we agree those are pathological cases,
> unlikely to happen in real-world workloads).

The ultra-short version of this is:

8MB:   0.98
32MB:  0.79
128MB: 0.63
512MB: 0.51
1GB:   0.42

These are the averages across all queries across all data sets for the
run-time for the patch versus master (not patched 64 which I think is
the replacement_sort_mem=64MB which appears to not be a win). So even
in the less successful cases on average quicksort is faster than
replacement selection.

But selecting just the cases where 8MB is significantly slower than
master it does look like the "padding" data sets are endemic.

On the one hand that's a very realistic use-case where I think a lot
of users find themselves. I know in my days as a web developer I
typically threw a lot of columns into my queries, along with a lot of
joins and ORDER BYs, and then left it to the application to pick through
the recordsets that were returned for the columns that were of
interest. The tuples being sorted were probably huge.

On the other hand perhaps this is something better tackled by the
planner. If the planner can arrange sorts to happen when the rows are
narrower, that would be a bigger win than trying to move a lot of
data around like this. (In the extreme if it were possible to replace
unnecessary columns by the tid and then refetching them later though
that's obviously more than a little tricky to do effectively.)

There are also some weird cases in this list where there's a
significant regression at 32MB but not at 8MB. I would like to see
16MB and perhaps 12MB and 24MB. They would help understand if these
are just quirks or there's a consistent pattern.




Re: [HACKERS] standalone backend PANICs during recovery

2016-04-02 Thread Alvaro Herrera
Bernd Helmle wrote:
> While investigating a problem on a streaming hot standby instance at a
> customer site, i got the following when using a standalone backend:
> 
> PANIC:  btree_xlog_delete_get_latestRemovedXid: cannot operate with
> inconsistent data
> CONTEXT:  xlog redo delete: index 1663/65588/65625; iblk 11, heap
> 1663/65588/65613;
> 
> The responsible PANIC is triggered here:
> 
> src/backend/access/nbtree/nbtxlog.c:555
> 
> btree_xlog_delete_get_latestRemovedXid(XLogReaderState *record)

This PANIC was introduced in the commit below.  AFAICS your proposed
patch and rationale are correct.


commit f786e91a75b2f64527dcf321e754b6448fcad7fe
Author: Tom Lane 
AuthorDate: Fri Aug 3 15:41:18 2012 -0400
CommitDate: Fri Aug 3 15:41:18 2012 -0400

Improve underdocumented btree_xlog_delete_get_latestRemovedXid() code.

As noted by Noah Misch, btree_xlog_delete_get_latestRemovedXid is
critically dependent on the assumption that it's examining a consistent
state of the database.  This was undocumented though, so the
seemingly-unrelated check for no active HS sessions might be thought to be
merely an optional optimization.  Improve comments, and add an explicit
check of reachedConsistency just to be sure.

This function returns InvalidTransactionId (thereby killing all HS
transactions) in several cases that are not nearly unlikely enough for my
taste.  This commit doesn't attempt to fix those deficiencies, just
document them.

Back-patch to 9.2, not from any real functional need but just to keep the
branches more closely synced to simplify possible future back-patching.



-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-04-02 Thread Tom Lane
"Shulgin, Oleksandr"  writes:
> On Mon, Mar 14, 2016 at 7:55 PM, Tom Lane  wrote:
>> Yeah, I don't much like that either.  But I don't think we can avoid
>> some refactoring there; as designed, conversion of an error message into
>> user-visible form is too tightly tied to receipt of the message.

> True.  Attached is a v2 which addresses all of the points raised earlier I
> believe.

I took a closer look at what's going on here and realized that actually
it's not that hard to decouple the message-building routine from the
PGconn state, because mostly it works with fields it's extracting out
of the PGresult anyway.  The only piece of information that's lacking
is conn->last_query.  I propose therefore that instead of doing it like
this, we copy last_query into error PGresults.  This is strictly less
added storage requirement than storing the whole verbose message would be,
and it saves time compared to the v2 patch in the typical case where
the application never does ask for an alternately-formatted error message.
Plus we can actually support requests for any variant format, not only
VERBOSE.

Attached is a libpq-portion-only version of a patch doing it this way.
I've not yet looked at the psql part of your patch.

Comments?

regards, tom lane
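
For illustration, a minimal client sketch against the proposed API (the
connection string and query are placeholders, and connection error checking
is omitted for brevity):

#include <stdio.h>
#include "libpq-fe.h"

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=regression");
	PGresult   *res = PQexec(conn, "insert into bar values(1)");

	if (PQresultStatus(res) == PGRES_FATAL_ERROR)
	{
		/* the terse message, per the connection's current verbosity */
		fputs(PQresultErrorMessage(res), stderr);

		/*
		 * Re-render the same error at maximum verbosity without changing
		 * the connection settings; the result must be freed with PQfreemem.
		 */
		char	   *msg = PQresultVerboseErrorMessage(res,
													  PQERRORS_VERBOSE,
													  PQSHOW_CONTEXT_ALWAYS);

		if (msg)
		{
			fputs(msg, stderr);
			PQfreemem(msg);
		}
	}

	PQclear(res);
	PQfinish(conn);
	return 0;
}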

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 2328d8f..80f7014 100644
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*** char *PQresultErrorMessage(const PGresul
*** 2691,2696 
--- 2691,2738 

   
  
+  
+   
+PQresultVerboseErrorMessage
+
+ PQresultVerboseErrorMessage
+
+   
+ 
+   
+
+ Returns a reformatted version of the error message associated with
+ a PGresult object.
+ 
+ char *PQresultVerboseErrorMessage(const PGresult *res,
+   PGVerbosity verbosity,
+   PGContextVisibility show_context);
+ 
+ In some situations a client might wish to obtain a more detailed
+ version of a previously-reported error.
+ PQresultVerboseErrorMessage addresses this need
+ by computing the message that would have been produced
+ by PQresultErrorMessage if the specified
+ verbosity settings had been in effect for the connection when the
+ given PGresult was generated.  If
+ the PGresult is not an error result,
+ PGresult is not an error result is reported instead.
+ The returned string includes a trailing newline.
+
+ 
+
+ Unlike most other functions for extracting data from
+ a PGresult, the result of this function is a freshly
+ allocated string.  The caller must free it
+ using PQfreemem() when the string is no longer needed.
+
+ 
+
+ A NULL return is possible if there is insufficient memory.
+
+   
+  
+ 
   
PQresultErrorFieldPQresultErrorField

*** PGVerbosity PQsetErrorVerbosity(PGconn *
*** 5582,5587 
--- 5624,5631 
mode includes all available fields.  Changing the verbosity does not
affect the messages available from already-existing
PGresult objects, only subsequently-created ones.
+   (But see PQresultVerboseErrorMessage if you
+   want to print a previous error with a different verbosity.)
   
  
 
*** PGContextVisibility PQsetErrorContextVis
*** 5622,5627 
--- 5666,5673 
affect the messages available from
already-existing PGresult objects, only
subsequently-created ones.
+   (But see PQresultVerboseErrorMessage if you
+   want to print a previous error with a different display mode.)
   
  
 
*** PQsetNoticeProcessor(PGconn *conn,
*** 6089,6096 
 receiver function is called.  It is passed the message in the form of
 a PGRES_NONFATAL_ERROR
 PGresult.  (This allows the receiver to extract
!individual fields using PQresultErrorField, or the complete
!preformatted message using PQresultErrorMessage.) The same
 void pointer passed to PQsetNoticeReceiver is also
 passed.  (This pointer can be used to access application-specific state
 if needed.)
--- 6135,6143 
 receiver function is called.  It is passed the message in the form of
 a PGRES_NONFATAL_ERROR
 PGresult.  (This allows the receiver to extract
!individual fields using PQresultErrorField, or complete
!preformatted messages using PQresultErrorMessage or
!PQresultVerboseErrorMessage.) The same
 void pointer passed to PQsetNoticeReceiver is also
 passed.  (This pointer can be used to access application-specific state
 if needed.)
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index c69a4d5..21dd772 100644
*** 

Re: [HACKERS] Batch update of indexes

2016-04-02 Thread Konstantin Knizhnik

On 04/02/2016 09:57 PM, Tom Lane wrote:

> Konstantin Knizhnik  writes:
> > Attached please find patch for "ALTER INDEX ... WHERE ..." clause.
> > It is now able to handle all three possible situations:
> > 1. Making index partial (add WHERE condition to the ordinary index)
> > 2. Extend partial index range (less restricted index predicate)
> > 3. Arbitrary change of partial index predicate
>
> I've not been following this thread previously, but this proposal
> scares me quite a lot.  I am certain there are places in our code
> that assume that the properties of an index don't change after it's
> been created.  One area that this almost certainly breaks is HOT updates:
> adding a previously-unindexed column to an index predicate might break
> existing HOT chains, and I see nothing in this patch that could deal
> with that.  I seem to recall there are other places that would be
> broken by changing an index's DDL definition after creation, but can't
> recall specifics right now.
>
> I am also, frankly, not seeing a use-case for this functionality that
> would justify trying to find and remove those assumptions.
>
> There's a lot of things I don't care for about the way the patch is
> written, in particular its willingness to use SPI (which opens a lot of
> potential for search-path problems, failure to see uncommitted tuples,
> etc).  But we need not get to that if we don't believe the functionality
> can work.


Thank you for the review, Tom.

I completely agree with all your arguments against this patch.
I proposed it mostly as a proof of concept.
Yes, I have not taken HOT updates into account, and maybe there are other
possible issues which I have not considered.

The main question is whether the proposed way of batch-updating indexes is
viable, or whether it is a conceptually wrong approach (because it breaks
the assumption that index properties can't be changed, or because it is not
convenient to use...).

I hope everybody agrees that maintaining indexes is the main limiting factor
for insert speed.  If a table has no indexes, then insert speed can be as
high as the disk write speed (100 MB/sec, or 1000 for an SSD).  So if the
record size is about 10 bytes, we can get about 10 million TPS.  But the
presence of indexes dramatically changes this picture: if the database is
large enough that even an index cannot fit in memory, and records are
inserted in random key order, then each index insert will require reading
3-4 pages from random locations on the disk.  With an average HDD
positioning time of 10 msec, we get 100 reads per second and ... 20-30 TPS.
That is with just one index.  If we have 10 indexes, then TPS can be less
than the fingers on one hand.
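
A back-of-the-envelope check of that estimate, as a tiny C program under the
same assumptions (10 msec per seek, 3-4 random page reads per index insert):

#include <stdio.h>

int
main(void)
{
	double		seek_time = 0.010;	/* average HDD positioning time, seconds */
	double		reads_per_insert = 3.5;		/* 3-4 random pages per index insert */
	double		tps = 1.0 / (seek_time * reads_per_insert);

	printf("about %.0f inserts/sec with one cold index\n", tps);	/* ~29 */
	return 0;
}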

Certainly this is a very pessimistic estimate.  But it is still true that we
cannot provide good insert speed if we have to update indexes immediately,
and without indexes we cannot efficiently execute most queries.

I do not see any way to solve this problem in Postgres now.  The hack with
creating materialized views requires a lot of extra time and space, and it
will not work for a really large table.

So we need some way to postpone insertion of new records into the index.
Then we can do such insertion in the background or in idle time (at night),
and try to use bulk insert if the index implementation supports it (for
example, sorting records by key before inserting can significantly increase
locality and so improve the speed of index inserts).  But the crucial point
here is that such a delayed index update violates the basic RDBMS rule that
query results with and without indexes should be the same.  The trick with
partial indexes eliminates this contradiction, but it requires more actions
from the user.  So are users ready to do some extra work just for
"ideological" reasons?  Because if a user wants delayed index updates, he
effectively accepts that query results may not include some of the most
recent updates.


Another aspect is which database objects are allowed to be altered and which
are not.  Right now we can alter almost everything about tables, and almost
nothing about indexes.  It is assumed that an index can always be rebuilt,
but for a very big table rebuilding indexes from scratch takes an
unacceptable amount of time.  So should we make it possible to alter index
characteristics that do not require rebuilding the index from scratch (which
is definitely true for a partial index predicate)?  Or is the price of
supporting this so high that it cannot be compensated by the benefits?


So what do you think?
1. Should I continue working in this direction and fix all possible issues
with HOT updates, etc., to make it possible to alter partial index
predicates and support batch inserts this way?
2. Or is it better to just add an extra option to the index, allowing it to
be slightly out of sync?  That would allow, for example, eliminating the
pending list for GIN, which can cause very significant degradation of query
speed, while for most full-text search engines it is

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Shulgin, Oleksandr
On Apr 2, 2016 18:38, "Tom Lane"  wrote:
>
> "Shulgin, Oleksandr"  writes:
> > On Apr 1, 2016 23:14, "Tom Lane"  wrote:
> >> Haven't looked at 0002 yet.
>
> > [crosses fingers] hope you'll have a chance to do that before feature
> > freeze for 9.6
>
> I studied this patch for awhile after rebasing it onto yesterday's
> commits.

Fantastic! I could not hope for a better reply :-)

> I did not like the fact that the compute_scalar_stats logic
> would allow absolutely anything into the MCV list once num_hist falls
> below 2. I think it's important that we continue to reject values that
> are only seen once in the sample, because there's no very good reason to
> think that they are MCVs and not just infrequent values that by luck
> appeared in the sample.

In my understanding we only put a value in the track list if we've seen it
at least twice, no?

> However, after I rearranged the tests there so
> that "if (num_hist >= 2)" only controlled whether to apply the 1/K limit,
> one of the regression tests started to fail:

Uh-oh.

> there's a place in
> rowsecurity.sql that expects that if a column contains nothing but several
> instances of a single value, that value will be recorded as a lone MCV.
> Now this isn't a particularly essential thing for that test, but it still
> seems like a good property for ANALYZE to have.

No objection here.

> The reason it's failing,
> of course, is that the test as written cannot possibly accept the last
> (or only) value.

Yeah, this I would expect from such a change.

> Before I noticed the regression failure, I'd been thinking that maybe it'd
> be better if the decision rule were not "at least 100+x% of the average
> frequency of this value and later ones", but "at least 100+x% of the
> average frequency of values after this one".

Hm, sounds pretty similar to what I wanted to achieve, but better
formalized.

> With that formulation, we're
> not constrained as to the range of x.  Now, if there are *no* values after
> this one, then this way needs an explicit special case in order not to
> compute 0/0; but the preceding para shows that we need a special case for
> the last value anyway.
>
> So, attached is a patch rewritten along these lines.  I used 50% rather
> than 25% as the new cutoff percentage --- obviously it should be higher
> in this formulation than before, but I have no idea if that particular
> number is good or we should use something else.  Also, the rule for the
> last value is "at least 1% of the non-null samples".  That's a pure guess
> as well.
>
> I do not have any good corpuses of data to try this on.  Can folks who
> have been following this thread try it on their data and see how it
> does?  Also please try some other multipliers besides 1.5, so we can
> get a feeling for where that cutoff should be placed.

Expect me to run it on my pet db early next week. :-)

Many thanks!
--
Alex


Re: [HACKERS] Batch update of indexes

2016-04-02 Thread Tom Lane
Konstantin Knizhnik  writes:
> Attached please find patch for "ALTER INDEX ... WHERE ..." clause.
> It is now able to handle all three possible situations:
> 1. Making index partial (add WHERE condition to the ordinary index)
> 2. Extend partial index range (less restricted index predicate)
> 3. Arbitrary change of partial index predicate

I've not been following this thread previously, but this proposal
scares me quite a lot.  I am certain there are places in our code
that assume that the properties of an index don't change after it's
been created.  One area that this almost certainly breaks is HOT updates:
adding a previously-unindexed column to an index predicate might break
existing HOT chains, and I see nothing in this patch that could deal
with that.  I seem to recall there are other places that would be
broken by changing an index's DDL definition after creation, but can't
recall specifics right now.

I am also, frankly, not seeing a use-case for this functionality that
would justify trying to find and remove those assumptions.

There's a lot of things I don't care for about the way the patch is
written, in particular its willingness to use SPI (which opens a lot of
potential for search-path problems, failure to see uncommitted tuples,
etc).  But we need not get to that if we don't believe the functionality
can work.

> This patch includes src/bin/insbench utility for testing insert 
> performance. It can be easily excluded from the patch to reduce it size.

C++ is not likely to get accepted into our tree ...

regards, tom lane




Re: [HACKERS] raw output from copy

2016-04-02 Thread Andrew Dunstan



On 04/01/2016 11:42 AM, Daniel Verite wrote:

Andrew Dunstan wrote:


If someone can make a good case that this is going to be of
general use I'll happily go along, but I haven't seen one so far.

About COPY FROM with a raw format, for instance just yesterday
there was this user question on stackoverflow:
http://stackoverflow.com/questions/36317237

which essentially is: how to import contents from a file without any
particular interpretation of any character?



There is so much wrong with this it's hard to know where to start.

Inserting the whole contents of a text file unchanged is insanely easy 
in psql.


\set file `cat /path/to/file`
insert into mytable(contents) values(:'file');

What is more, everyone on SO missed the fact that CSV mode gives you very 
considerable control over the quote, delimiter and null settings.


See for example 
 which 
has this example for handling files consisting of 1 json document per line:


copy the_table(jsonfield)
from '/path/to/jsondata'
csv quote e'\x01' delimiter e'\x02';

psql's \copy will work just the same way
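
For example (the same options, just spelled as a psql meta-command):

   \copy the_table(jsonfield) from '/path/to/jsondata' csv quote e'\x01' delimiter e'\x02'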

(I noticed with amusement this week that CitusData is using pretty much 
exactly this in one of their examples.)




With the patch discussed in this thread, a user can do
\copy table(textcol) from /path/to/file (format raw)
or the equivalent COPY.
If it's a binary column, that works just the same.



It would be fairly simple to invent a binary mechanism that did the 
equivalent of the above insert. All without any change to SQL or the 
backend at all.





Without this, it's not obvious at all how this result can be
achieved without resorting to external preprocessing,
and assuming the availability of such preprocessing tools
in the environment. Notwithstanding the fact that the
solution proposed on SO (doubling backslashes with sed)
doesn't even work if the file contains tabs, as they would be
interpreted as field separators, even if the copy target has only
one column. You can change the delimiter with COPY but AFAIK
you can't tell that there is none.



There is arguably a good case for allowing a null delimiter. But that SO 
page is just a terrible piece of misinformation, as far too often 
happens in my experience.


And I am still waiting for a non-psql use case. But I don't expect to 
see one, precisely because most clients have no difficulty at all in 
handling binary data.


cheers

andrew










Re: Transactional enum additions - was Re: [HACKERS] Alter or rename enum value

2016-04-02 Thread Tom Lane
Andrew Dunstan  writes:
> Looking at this briefly. It looks like the check should be called from 
> enum_in() and enum_recv(). What error should be raised if the enum row's 
> xmin isn't committed? ERRCODE_FEATURE_NOT_SUPPORTED? or maybe 
> ERRCODE_DATA_EXCEPTION? I don't see anything that fits very well.

ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is something we use in some
other places where the meaning is "just wait awhile, dude".  Or you
could invent a new ERRCODE.

regards, tom lane




Re: [HACKERS] More stable query plans via more predictable column statistics

2016-04-02 Thread Tom Lane
"Shulgin, Oleksandr"  writes:
> On Apr 1, 2016 23:14, "Tom Lane"  wrote:
>> Haven't looked at 0002 yet.

> [crosses fingers] hope you'll have a chance to do that before feature
> freeze for 9.6

I studied this patch for awhile after rebasing it onto yesterday's
commits.  I did not like the fact that the compute_scalar_stats logic
would allow absolutely anything into the MCV list once num_hist falls
below 2.  I think it's important that we continue to reject values that
are only seen once in the sample, because there's no very good reason to
think that they are MCVs and not just infrequent values that by luck
appeared in the sample.  However, after I rearranged the tests there so
that "if (num_hist >= 2)" only controlled whether to apply the 1/K limit,
one of the regression tests started to fail: there's a place in
rowsecurity.sql that expects that if a column contains nothing but several
instances of a single value, that value will be recorded as a lone MCV.
Now this isn't a particularly essential thing for that test, but it still
seems like a good property for ANALYZE to have.  The reason it's failing,
of course, is that the test as written cannot possibly accept the last
(or only) value.
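
(To be concrete, the property at stake is essentially this, using a toy
single-valued table rather than the rowsecurity one:

   create table t as select 1 as x from generate_series(1, 100);
   analyze t;
   select most_common_vals from pg_stats
    where tablename = 't' and attname = 'x';

and we'd like the single value to come back as a lone MCV.)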

Before I noticed the regression failure, I'd been thinking that maybe it'd
be better if the decision rule were not "at least 100+x% of the average
frequency of this value and later ones", but "at least 100+x% of the
average frequency of values after this one".  With that formulation, we're
not constrained as to the range of x.  Now, if there are *no* values after
this one, then this way needs an explicit special case in order not to
compute 0/0; but the preceding para shows that we need a special case for
the last value anyway.

So, attached is a patch rewritten along these lines.  I used 50% rather
than 25% as the new cutoff percentage --- obviously it should be higher
in this formulation than before, but I have no idea if that particular
number is good or we should use something else.  Also, the rule for the
last value is "at least 1% of the non-null samples".  That's a pure guess
as well.
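
To make that concrete, suppose the track list holds made-up counts of
1000, 400, 90, 60 and 50 out of 10000 non-null samples.  Walking down the
list, the new rule does this:

  1000: avg of later counts = (400+90+60+50)/4 = 150.0, cutoff 225   => keep
   400: avg of later counts = (90+60+50)/3     =  66.7, cutoff 100   => keep
    90: avg of later counts = (60+50)/2        =  55.0, cutoff  82.5 => keep
    60: avg of later counts = 50/1             =  50.0, cutoff  75   => stop

If the scan did reach the final entry, that one would instead be judged by
the special rule, i.e. kept only if its count is above 1 and is at least
1% of the 10000 non-null samples (100 here).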

I do not have any good corpuses of data to try this on.  Can folks who
have been following this thread try it on their data and see how it
does?  Also please try some other multipliers besides 1.5, so we can
get a feeling for where that cutoff should be placed.

regards, tom lane

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 44a4b3f..a2c606b 100644
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
*** compute_distinct_stats(VacAttrStatsP sta
*** 2120,2128 
  		 * we are able to generate a complete MCV list (all the values in the
  		 * sample will fit, and we think these are all the ones in the table),
  		 * then do so.  Otherwise, store only those values that are
! 		 * significantly more common than the (estimated) average. We set the
! 		 * threshold rather arbitrarily at 25% more than average, with at
! 		 * least 2 instances in the sample.
  		 */
  		if (track_cnt < track_max && toowide_cnt == 0 &&
  			stats->stadistinct > 0 &&
--- 2120,2138 
  		 * we are able to generate a complete MCV list (all the values in the
  		 * sample will fit, and we think these are all the ones in the table),
  		 * then do so.  Otherwise, store only those values that are
! 		 * significantly more common than the ones we omit.  We determine that
! 		 * by considering the values in frequency order, and accepting each
! 		 * one if it is at least 50% more common than the average among the
! 		 * values after it.  The 50% threshold is somewhat arbitrary.
! 		 *
! 		 * Note that the 50% rule will never accept a value with count 1,
! 		 * since all the values have count at least 1; this is a property we
! 		 * desire, since there's no very good reason to assume that a
! 		 * single-occurrence value is an MCV and not just a random non-MCV.
! 		 *
! 		 * We need a special rule for the very last value.  If we get to it,
! 		 * we'll accept it if it's at least 1% of the non-null samples and has
! 		 * count more than 1.
  		 */
  		if (track_cnt < track_max && toowide_cnt == 0 &&
  			stats->stadistinct > 0 &&
*** compute_distinct_stats(VacAttrStatsP sta
*** 2133,2153 
  		}
  		else
  		{
! 			/* d here is the same as d in the Haas-Stokes formula */
  			int			d = nonnull_cnt - summultiple + nmultiple;
! 			double		avgcount,
! 		mincount;
  
! 			/* estimate # occurrences in sample of a typical nonnull value */
! 			avgcount = (double) nonnull_cnt / (double) d;
! 			/* set minimum threshold count to store a value */
! 			mincount = avgcount * 1.25;
! 			if (mincount < 2)
! mincount = 2;
  			if (num_mcv > track_cnt)
  num_mcv = track_cnt;
  			for (i = 0; i < num_mcv; i++)
  			{
  if (track[i].count < mincount)
  {
  	num_mcv = i;
--- 2143,2181 
  		}
  	

Transactional enum additions - was Re: [HACKERS] Alter or rename enum value

2016-04-02 Thread Andrew Dunstan



On 03/29/2016 04:56 PM, Andrew Dunstan wrote:



On 03/27/2016 10:20 AM, Tom Lane wrote:

Andrew Dunstan  writes:

The more I think about this the more I bump up against the fact that
almost anything we might want to do to ameliorate the situation is
going to be rolled back. The only approach I can think of that doesn't
suffer from this is to abort if an insert/update will affect an index on
a modified enum, i.e. we prevent the possible corruption from happening
in the first place, as we do now, but in a much more fine-grained way.

Perhaps, instead of forbidding ALTER ENUM ADD in a transaction, we could
allow that, but not allow the new value to be *used* until it's 
committed?
This could be checked cheaply during enum value lookup (ie, is xmin of the
pg_enum row committed).

What you really need is to prevent the new value from being inserted
into any indexes, but checking that directly seems far more difficult,
ugly, and expensive than the above.

I do not know whether this would be a meaningful improvement for
common use-cases, though.  (It'd help if people were more specific
about the use-cases they need to work.)





I think this is a pretty promising approach, certainly well worth 
putting some resources into investigating. One thing I like about it 
is that it gives a nice cheap negative test, so we know if the xmin is 
committed we are safe. So we could start by rejecting anything where 
it's not, but later might adopt more refined but expensive tests for 
cases where it isn't committed, without imposing a penalty on anything 
else.






Looking at this briefly. It looks like the check should be called from 
enum_in() and enum_recv(). What error should be raised if the enum row's 
xmin isn't committed? ERRCODE_FEATURE_NOT_SUPPORTED? or maybe 
ERRCODE_DATA_EXCEPTION? I don't see anything that fits very well.
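
For reference, the user-visible behaviour we would be aiming for is roughly
this (a sketch only; type and value names invented):

   begin;
   alter type rainbow add value 'mauve';
   -- using the new value here, e.g. 'mauve'::rainbow, would be rejected,
   -- since enum_in() would see the pg_enum row's xmin as not yet committed
   commit;
   select 'mauve'::rainbow;  -- accepted once the adding transaction commits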


cheers

andrew







Re: [HACKERS] Small typo in a comment in pg_regress.c

2016-04-02 Thread Stephen Frost
Andreas,

* Andreas 'ads' Scherbaum (adsm...@wars-nicht.de) wrote:
> stumbled over this while looking into the source. Patch attached.

Fix pushed, thanks!

Stephen




[HACKERS] Small typo in a comment in pg_regress.c

2016-04-02 Thread Andreas 'ads' Scherbaum


Hi,

stumbled over this while looking into the source. Patch attached.


Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 416829d..343fd19 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -1329,7 +1329,7 @@ results_differ(const char *testname, const char *resultsfile, const char *defaul
 	if (platform_expectfile)
 	{
 		/*
-		 * Replace everything afer the last slash in expectfile with what the
+		 * Replace everything after the last slash in expectfile with what the
 		 * platform_expectfile contains.
 		 */
 		char	   *p = strrchr(expectfile, '/');



Re: [HACKERS] improving GROUP BY estimation

2016-04-02 Thread Tom Lane
Dean Rasheed  writes:
> Here's an updated patch with references to both papers, and a more
> detailed justification for the formula, along with the other changes
> discussed. Note that although equation (2) in the Dell'Era paper looks
> different from the Yao formula, it's actually the same.

Looks good to me.

regards, tom lane




Re: [HACKERS] snapshot too old, configured by time

2016-04-02 Thread Kevin Grittner
On Sat, Apr 2, 2016 at 7:12 AM, Michael Paquier
 wrote:
> On Fri, Apr 1, 2016 at 11:45 PM, Alvaro Herrera  
> wrote:
>> Kevin Grittner wrote:
>>
>>> Attached is what I think you're talking about for the first patch.
>>> AFAICS this should generate identical executable code to unpatched.
>>> Then the patch to actually implement the feature would, instead
>>> of adding 30-some lines with TestForOldSnapshot() would implement
>>> that as the behavior for the other enum value, and alter those
>>> 30-some BufferGetPage() calls.
>>>
>>> Álvaro and Michael, is this what you were looking for?
>>
>> Yes, this is what I was thinking, thanks.
>
> A small thing:
> $ git diff master --check
> src/include/storage/bufmgr.h:181: trailing whitespace.
> +#define BufferGetPage(buffer, snapshot, relation, agetest)
> ((Page)BufferGetBlock(buffer))
>
> -   Pagepage = BufferGetPage(buf);
> +   Pagepage = BufferGetPage(buf, NULL, NULL, BGP_NO_SNAPSHOT_TEST);
> Having a BufferGetPageExtended() with some flags and a default
> corresponding to NO_SNAPSHOT_TEST would reduce the diff impact. And as
> long as the check is integrated with BufferGetPage[Extended]() I would
> not complain, the patch as proposed being 174kB...

If you are saying that the 450 places that don't need the check
would remain unchanged, and the only difference would be to use
BufferGetPageExtended() instead of BufferGetPage() followed by
TestForOldSnapshot() in the 30-some places that need the check, I
don't see the point.  That would eliminate the "forced choice"
aspect of what Álvaro is asking for, and it seems to me that it
would do next to nothing to prevent the errors of omission that are
the concern here.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH v11] GSSAPI encryption support

2016-04-02 Thread Michael Paquier
On Sat, Apr 2, 2016 at 7:34 AM, Robbie Harwood  wrote:
> - Attempt to address a crash Michael is observing by switching to using
>   the StringInfo/pqExpBuffer management functions over my own code as
>   much as possible.  Michael, if this doesn't fix it, I'm out of ideas.

Nope, it doesn't.

>   Since I still can't reproduce this locally (left a client machine and
>   a process on the same machine retrying for over an hour on your test
>   case and didn't see it), could you provide me with some more
>   information on why repalloc is complaining?
>   Is this a low memory situation where alloc might have failed?

No, this is an assertion failure, and it seems that you are compiling
this code without --enable-cassert; without that switch the code
actually works.

>   What's your setup look like?

Just a simple Linux VM running krb5kdc with 386MB of memory, with
Postgres running locally as well.

>   That pointer looks like it's on the heap, is that correct?

appendBinaryStringInfo depends on palloc calls that allocate memory
depending on the memory context used. It looks like what's just
missing in your logic is a private memory context that be_gssapi_write
and be_gssapi_read can use to handle the allocation of the
communication buffers.
-- 
Michael




Re: [HACKERS] improving GROUP BY estimation

2016-04-02 Thread Dean Rasheed
On 31 March 2016 at 22:02, Tom Lane  wrote:
> I'm just concerned about what happens when the Dellera paper stops being
> available.  I don't mind including that URL as a backup to the written-out
> argument I just suggested.
>

Here's an updated patch with references to both papers, and a more
detailed justification for the formula, along with the other changes
discussed. Note that although equation (2) in the Dell'Era paper looks
different from the Yao formula, it's actually the same.
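
As a quick sanity check with made-up numbers: for a table with N = 10000
tuples and n = 100 distinct values, selecting p = 1000 rows gives

  n * (1 - ((N-p)/N)^(N/n)) = 100 * (1 - 0.9^100) ~= 100

expected distinct values, whereas the old linear scaling (n * p/N) would
have estimated only 10.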

Regards,
Dean
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
new file mode 100644
index a6555e9..99f5f7c
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3439,9 +3439,51 @@ estimate_num_groups(PlannerInfo *root, L
 reldistinct = clamp;
 
 			/*
-			 * Multiply by restriction selectivity.
+			 * Update the estimate based on the restriction selectivity,
+			 * guarding against division by zero when reldistinct is zero.
+			 * Also skip this if we know that we are returning all rows.
 			 */
-			reldistinct *= rel->rows / rel->tuples;
+			if (reldistinct > 0 && rel->rows < rel->tuples)
+			{
+/*
+ * Given a table containing N rows with n distinct values in a
+ * uniform distribution, if we select p rows at random then
+ * the expected number of distinct values selected is
+ *
+ * n * (1 - product((N-N/n-i)/(N-i), i=0..p-1))
+ *
+ * = n * (1 - (N-N/n)! / (N-N/n-p)! * (N-p)! / N!)
+ *
+ * See "Approximating block accesses in database
+ * organizations", S. B. Yao, Communications of the ACM,
+ * Volume 20 Issue 4, April 1977 Pages 260-261.
+ *
+ * Alternatively, re-arranging the terms from the factorials,
+ * this may be written as
+ *
+ * n * (1 - product((N-p-i)/(N-i), i=0..N/n-1))
+ *
+ * This form of the formula is more efficient to compute in
+ * the common case where p is larger than N/n.  Additionally,
+ * as pointed out by Dell'Era, if i << N for all terms in the
+ * product, it can be approximated by
+ *
+ * n * (1 - ((N-p)/N)^(N/n))
+ *
+ * See "Expected distinct values when selecting from a bag
+ * without replacement", Alberto Dell'Era,
+ * http://www.adellera.it/investigations/distinct_balls/.
+ *
+ * The condition i << N is equivalent to n >> 1, so this is a
+ * good approximation when the number of distinct values in
+ * the table is large.  It turns out that this formula also
+ * works well even when n is small.
+ */
+reldistinct *=
+	(1 - pow((rel->tuples - rel->rows) / rel->tuples,
+			 rel->tuples / reldistinct));
+			}
+			reldistinct = clamp_row_est(reldistinct);
 
 			/*
 			 * Update estimate of total distinct groups.
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
new file mode 100644
index de64ca7..0fc93d9
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -807,27 +807,24 @@ select * from int4_tbl where
 explain (verbose, costs off)
 select * from int4_tbl o where (f1, f1) in
   (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
-  QUERY PLAN  
---
- Hash Join
+   QUERY PLAN   
+
+ Hash Semi Join
Output: o.f1
Hash Cond: (o.f1 = "ANY_subquery".f1)
->  Seq Scan on public.int4_tbl o
  Output: o.f1
->  Hash
  Output: "ANY_subquery".f1, "ANY_subquery".g
- ->  HashAggregate
+ ->  Subquery Scan on "ANY_subquery"
Output: "ANY_subquery".f1, "ANY_subquery".g
-   Group Key: "ANY_subquery".f1, "ANY_subquery".g
-   ->  Subquery Scan on "ANY_subquery"
- Output: "ANY_subquery".f1, "ANY_subquery".g
- Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
- ->  HashAggregate
-   Output: i.f1, (generate_series(1, 2) / 10)
-   Group Key: i.f1
-   ->  Seq Scan on public.int4_tbl i
- Output: i.f1
-(18 rows)
+   Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
+   ->  HashAggregate
+ Output: i.f1, (generate_series(1, 2) / 10)
+ Group Key: i.f1
+ ->  Seq Scan on public.int4_tbl i
+   Output: i.f1
+(15 rows)
 
 select * from int4_tbl o where (f1, f1) in
   (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);



Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-04-02 Thread David Rowley
On 2 April 2016 at 23:26, David Rowley  wrote:
> I worked on this today to try and get it into shape.

In the last patch I failed to notice that there's an alternative
expected results file for one of the regression tests.

The attached patch includes the fix to update that file to match the
new expected EXPLAIN output.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_57837dc_2016-04-03.patch
Description: Binary data



Re: [HACKERS] snapshot too old, configured by time

2016-04-02 Thread Michael Paquier
On Fri, Apr 1, 2016 at 11:45 PM, Alvaro Herrera
 wrote:
> Kevin Grittner wrote:
>
>> Attached is what I think you're talking about for the first patch.
>> AFAICS this should generate identical executable code to unpatched.
>> Then the patch to actually implement the feature would, instead
>> of adding 30-some lines with TestForOldSnapshot() would implement
>> that as the behavior for the other enum value, and alter those
>> 30-some BufferGetPage() calls.
>>
>> Álvaro and Michael, is this what you were looking for?
>
> Yes, this is what I was thinking, thanks.

A small thing:
$ git diff master --check
src/include/storage/bufmgr.h:181: trailing whitespace.
+#define BufferGetPage(buffer, snapshot, relation, agetest)
((Page)BufferGetBlock(buffer))

-   Pagepage = BufferGetPage(buf);
+   Pagepage = BufferGetPage(buf, NULL, NULL, BGP_NO_SNAPSHOT_TEST);
Having a BufferGetPageExtended() with some flags and a default
corresponding to NO_SNAPSHOT_TEST would reduce the diff impact. And as
long as the check is integrated with BufferGetPage[Extended]() I would
not complain, the patch as proposed being 174kB...
-- 
Michael




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-04-02 Thread Amit Kapila
On Thu, Mar 31, 2016 at 3:48 PM, Andres Freund  wrote:
>
> On 2016-03-31 15:07:22 +0530, Amit Kapila wrote:
> > On Thu, Mar 31, 2016 at 4:39 AM, Andres Freund 
wrote:
> > >
> > > On 2016-03-28 22:50:49 +0530, Amit Kapila wrote:
> > > > On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila <
amit.kapil...@gmail.com>
> > > > wrote:
> > > > >
> > >
> > > Amit, could you run benchmarks on your bigger hardware? Both with
> > > USE_CONTENT_LOCK commented out and in?
> > >
> >
> > Yes.
>
> Cool.
>

Here is the performance data (configuration of machine used to perform this
test is mentioned at end of mail):

Non-default parameters

max_connections = 300
shared_buffers=8GB
min_wal_size=10GB
max_wal_size=15GB
checkpoint_timeout=35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

median of 3, 20-min pgbench tpc-b results for --unlogged-tables


Client Count/No. Of Runs (tps)     2      64     128
HEAD+clog_buf_128               4930   66754   68818
group_clog_v8                   5753   69002   78843
content_lock                    5668   70134   70501
nocontent_lock                  4787   69531   70663


I am not exactly sure why using the content lock patch (USE_CONTENT_LOCK
defined in 0003-Use-a-much-more-granular-locking-model-for-the-clog-) or the
no-content-lock patch (USE_CONTENT_LOCK not defined) gives poor performance
at 128 clients; maybe it is due to some bug in the patch, or due to the
reason mentioned by Robert [1] (usage of two locks instead of one).  On
running it many times with the content lock and no-content-lock patches, it
sometimes gives 80 ~ 81K TPS at 128 clients, which is approximately 3%
higher than the group_clog_v8 patch; this indicates that the group clog
approach is able to address most of the remaining contention (after
increasing clog buffers) around CLOGControlLock.  There is one small
regression observed with the no-content-lock patch at the lower client
count (2), which might be due to run-to-run variation, or maybe it is due to
an increased number of instructions from the atomic ops; this needs to be
investigated if we want to follow the no-content-lock approach.

Note that I have not posted TPS numbers for HEAD, as I have already shown
above that increasing the clog buffers increased TPS from ~36K to ~68K at
128 clients.


M/c details
-
Power m/c config (lscpu)
-
Architecture:  ppc64le
Byte Order:Little Endian
CPU(s):192
On-line CPU(s) list:   0-191
Thread(s) per core:8
Core(s) per socket:1
Socket(s): 24
NUMA node(s):  4
Model: IBM,8286-42A
L1d cache: 64K
L1i cache: 32K
L2 cache:  512K
L3 cache:  8192K
NUMA node0 CPU(s): 0-47
NUMA node1 CPU(s): 48-95
NUMA node2 CPU(s): 96-143
NUMA node3 CPU(s): 144-191


[1] -
http://www.postgresql.org/message-id/CA+TgmoYjpNKdHDFUtJLAMna-O5LGuTDnanHFAOT5=hn_vau...@mail.gmail.com


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-04-02 Thread David Rowley
On 2 April 2016 at 05:52, Tom Lane  wrote:
>
> David Rowley  writes:
> > On 12 March 2016 at 11:43, Tom Lane  wrote:
> >> It seems like the major intellectual complexity here is to figure out
> >> how to detect inner-side-unique at reasonable cost.  I see that for
> >> LEFT joins you're caching that in the SpecialJoinInfos, which is probably
> >> fine.  But for INNER joins it looks like you're just doing it over again
> >> for every candidate join, and that seems mighty expensive.
>
> > I have a rough idea for this, but I need to think of it a bit more to
> > make sure it's bulletproof.
>
> Where are we on this?  I had left it alone for awhile because I supposed
> that you were going to post a new patch, but it's been a couple weeks...

Apologies for the delay. I was fully booked.

I worked on this today to try and get it into shape.

Changes:

* Added unique_rels and non_unique_rels cache to PlannerInfo. These
are both arrays of Lists which are indexed by the relid, which is
either proved to be unique by, or proved not to be unique by each
listed set of relations. Each List item is a Bitmapset containing the
relids of the relation which proved the uniqueness, or proved no
possibility of uniqueness for the non_unique_rels case.  The
non_unique_rels cache seems far less useful than the unique one, as
with the standard join search, we start at level one, comparing
singleton relations and works our way up, so it's only towards the end
of the search that we'll have more rels on each side of the join
search. Many of the unique cases are found early on in the search, but
the non-unique cases must be rechecked as more relations are added to
the search, as that introduces a new possibility of uniqueness. In
fact, the only cache hit of the non-unique case in the regression
tests is a test that's using the GEQO, so I'm not quite sure if the
standard join search will ever have cache hit here. Never-the-less I
kept the cache, as it's likely going to be most useful to have it when
the GEQO is running the show, as that's when we're going to see the
largest number of relations to join.

There's also a small quirk which could lead to false negatives for
uniqueness which might still need ironed out: I don't yet attempt to
get the minimum set of relations which proved the uniqueness. This,
perhaps, does not matter much for the standard join search, but likely
will, more so for GEQO cases. Fixing this will require changes to
relation_has_unique_index_for() to get it to record and return the
relids of the clauses which matched the index.

* Renamed JOIN_LEFT_UNIQUE to JOIN_SEMI_LEFT. It seems better to
maintain the SEMI part. I thought about renaming JOIN_SEMI to
JOIN_SEMI_INNER, but thought I'd better not touch that here.

* Made a pass to update comments in the areas where I've added
handling for JOIN_SEMI_LEFT.

* I also updated comments in the regression tests to remove reference
to unique joins. "Join conversion" seems to be a better term now.

There was also a couple of places where I wasn't quite sure how to
handle JOIN_SEMI_LEFT. I've marked these with an XXX comment. Perhaps
how to handle them is more clear to you.

I also was not quite sure the best place to allocate memory for these
caches. I've ended up doing that in setup_simple_rel_arrays(), which
is likely the wrong spot. I originally did this in
standard_join_search(), after join_rel_level is allocated memory, but
I soon realised that it can't happen here as the GEQO requires the
caches too, and naturally it does not come through
standard_join_search().

I was not quite sure if I should update any docs to mention that Inner
Joins can become Semi Joins in some cases.

I was also a bit unsure if I should move the two new functions I added
to joinpath.c into analyzejoins.c.

Thanks for taking an interest in this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_6f34aa1_2016-04-02.patch
Description: Binary data



Re: [HACKERS] fd.c: flush data problems on osx

2016-04-02 Thread Michael Paquier
On Mon, Mar 21, 2016 at 9:09 PM, Stas Kelvich  wrote:
> On 21 Mar 2016, at 14:53, Andres Freund  wrote:
>> Hm. I think we should rather just skip calling pg_flush_data in the
>> directory case, that very likely isn't beneficial on any OS.
>
> Seems reasonable, changed.
>
>> I think we still need to fix the mmap() implementation to support the
>> offset = 0, nbytes = 0 case (via fseek(SEEK_END).
>
It is already in this diff. I’ve added this a few messages ago.

A similar change seems to be needed in initdb.c's pre_sync_fname.
-- 
Michael




Re: [HACKERS] pgbench - remove unused clientDone parameter

2016-04-02 Thread Fabien COELHO



Remove pgbench clientDone unused "ok" parameter.


Seems useless, yeah, removed.


Good!


Actually it was introduced as an unused argument in 3da0dfb4b146.


Strange indeed. I did not think to dig into the history.


I switched the corresponding commitfest entry to committed.

Thanks,

--
Fabien.




Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-04-02 Thread Masahiko Sawada
On Fri, Apr 1, 2016 at 9:10 AM, Noah Misch  wrote:
> On Thu, Mar 31, 2016 at 04:48:26PM +0900, Masahiko Sawada wrote:
>> On Thu, Mar 31, 2016 at 2:02 PM, Noah Misch  wrote:
>> > On Thu, Mar 10, 2016 at 01:04:11AM +0900, Masahiko Sawada wrote:
>> As a result of looking into the code around recovery, ISTM that the
>> cause is related to relation cache clearing.
>> In heap_xlog_visible, if the standby server receives the WAL record then
>> the relation cache is eventually cleared in vm_extend, but if the standby
>> server receives an FPI then the relation cache would not be cleared.
>> For example, after I applied the attached patch to HEAD (it might not be
>> the right way, but) this problem seems to be resolved.
>> >>
>> >> Is this a bug? or not?
>> >
>> > It's a bug.  I don't expect it causes queries to return wrong answers, 
>> > because
>> > visibilitymap.c says "it's always safe to clear a bit in the map from
>> > correctness point of view."  (The bug makes a visibility map bit 
>> > temporarily
>> > appear to have been cleared.)  I still call it a bug, because recovery
>> > behavior becomes too difficult to verify when xlog replay produces 
>> > conditions
>> > that don't happen outside of recovery.  Even if there's no way to get a 
>> > wrong
>> > query answer today, this would be too easy to break later.  I wonder if we
>> > make the same omission in other xlog replay functions.  Similar omissions 
>> > may
>> > cause wrong query answers, even if this particular one does not.
>> >
>> > Would you like to bisect for the commit, or at least the major release, at
>> > which the bug first appeared?
>> >
>> > I wonder if your discovery has any relationship to this recently-reported 
>> > case
>> > of insufficient smgr invalidation:
>> > http://www.postgresql.org/message-id/flat/CAB7nPqSBFmh5cQjpRbFBp9Rkv1nF=nh2o1fxkkj6yvobtvy...@mail.gmail.com
>> >
>>
>> I'm not sure this bug has relationship to another issue you mentioned
>> but after further investigation, this bug seems to be reproduced even
>> on more older version.
>> At least I reproduced it at 9.0.0.
>
> Would you try PostgreSQL 9.2.16?  The visibility map was not crash safe and
> had no correctness implications until 9.2.  If 9.2 behaves this way, it's
> almost certainly not a recent regression.

Yeah, I reproduced it on 9.2.0 and 9.2.16, so it's not a recent regression.
The commit is 503c7305a1e379f95649eef1a694d0c1dbdc674a, which
introduced the crash-safe visibility map.

Regards,

--
Masahiko Sawada

