Re: [HACKERS] More stable query plans via more predictable column statistics

2016-01-23 Thread Tomas Vondra

Hi,

On 01/20/2016 10:49 PM, Alvaro Herrera wrote:

Tom Lane wrote:

"Shulgin, Oleksandr"  writes:

This post summarizes a few weeks of research of ANALYZE statistics
distribution on one of our bigger production databases with some real-world
data and proposes a patch to rectify some of the oddities observed.


Please add this to the 2016-01 commitfest ...


Tom, are you reviewing this for the current commitfest?


While I'm not the right Tom, I've been looking at the patch recently, 
so let me post the review here ...


Firstly, I'd like to say I appreciate the level of detail in the 
analysis. I may disagree with some of the conclusions, but I wish all 
my patch submissions were of such high quality.


Regarding the patch itself, I think there are a few different points 
raised, so let me discuss them one by one:



1) NULLs vs. MCV threshold
--------------------------

I agree that this seems like a bug, and that we should really compute 
the threshold only using non-NULL values. I think the analysis rather 
conclusively proves this, and I also think there are places where we 
make the same mistake (more on that at the end of the review).
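
To make that concrete, here is a minimal sketch of the cut-off computed 
over non-NULL values only. This is my illustration, not the committed 
code; TrackItem, values_cnt and track[] are hypothetical stand-ins 
loosely modeled on analyze.c:

typedef struct { int count; } TrackItem;    /* stand-in for the real tracking struct */

/*
 * Decide how many candidate MCVs to keep.  The average count is taken
 * over the non-NULL values in the sample (values_cnt), not over all
 * sample rows, so NULLs no longer inflate the threshold.
 */
static int
mcv_cutoff(TrackItem *track, int num_mcv, int values_cnt, int ndistinct)
{
    double  avgcount = (double) values_cnt / (double) ndistinct;
    double  mincount = 1.25 * avgcount;     /* the arbitrary coefficient */
    int     i;

    if (mincount < 2.0)
        mincount = 2.0;
    for (i = 0; i < num_mcv; i++)
    {
        if ((double) track[i].count < mincount)
            break;          /* too rare: leave it for the histogram */
    }
    return i;
}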



2) mincount = 1.25 * avgcount
-----------------------------

While I share the dislike of arbitrary constants (like the 1.25 here), I 
do think we had better keep it; otherwise we can just get rid of the 
mincount entirely, I think - the most frequent value will always be above 
the (remaining) average count, making the threshold pointless.


It might have an impact in the original code, but in the new one it's 
quite useless (see the next point), unless I'm missing some detail.



3) modifying avgcount threshold inside the loop
-----------------------------------------------

The comment was extended with this statement:

 * We also decrease ndistinct in the process such that going forward
 * it refers to the number of distinct values left for the histogram.

and indeed, that's what's happening - at the end of each loop iteration, we do this:

/* Narrow our view of samples left for the histogram */
sample_cnt -= track[i].count;
ndistinct--;

but that immediately lowers the avgcount, as it is re-evaluated within 
the same loop:


avgcount = (double) sample_cnt / (double) ndistinct;

which means the average is continuously decreasing, lowering the 
threshold as the loop proceeds, although that's partially mitigated by 
keeping the 1.25 coefficient.


This, however, makes reasoning about the algorithm much more complicated.
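
To illustrate the drift, here is a toy program (standalone, with made-up 
counts - not PostgreSQL code) that mimics the loop's bookkeeping:

#include <stdio.h>

int main(void)
{
    int counts[] = {400, 300, 250, 120, 60, 30};    /* candidate MCV counts */
    int sample_cnt = 1200;
    int ndistinct = 6;

    for (int i = 0; i < 6; i++)
    {
        double avgcount = (double) sample_cnt / (double) ndistinct;

        printf("i=%d threshold=%.1f count=%d\n", i, 1.25 * avgcount, counts[i]);
        if ((double) counts[i] < 1.25 * avgcount)
            break;                      /* below the (moving) threshold */
        sample_cnt -= counts[i];        /* narrow the view, as the patch does */
        ndistinct--;
    }
    return 0;
}

Each accepted value shrinks sample_cnt faster than it shrinks ndistinct, 
so the printed threshold keeps falling (250.0, 200.0, 156.2, ...).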


4) for (i = 0; /* i < num_mcv */; i++)
--------------------------------------

The way the loop is coded seems rather awkward, I think. Not only is 
there an unexpected comment in the "for" clause, but the loop body also 
opens with this:


/* Another way to say "while (i < num_mcv)" */
if (i >= num_mcv)
break;

Why not write it as a while loop, then? Possibly including the 
(num_hist >= 2) condition, like this:


while ((i < num_mcv) && (num_hist >= 2))
{
...
}

In any case, the loop would deserve a comment explaining why we think 
computing the thresholds like this makes sense.



Summary
-------

Overall, I think this is really about deciding when to cut off the MCV 
list, so that it does not grow needlessly large - as Robert pointed out, 
the larger the list, the more expensive the estimation (and thus planning).


So even if we could fit the whole sample into the MCV list (i.e. we 
believe we've seen all the values and we can fit them into the MCV 
list), it may not make sense to do so. The ultimate goal is to estimate 
conditions, and if we can do that reasonably even after cutting off the 
least frequent values from the MCV list, then why not?


From this point of view, the analysis deals just with the ANALYZE part 
and does not discuss the estimation counterpart at all.



5) ndistinct estimation vs. NULLs
---------------------------------

While looking at the patch, I started wondering whether we're actually 
handling NULLs correctly when estimating ndistinct, because that part 
also uses samplerows directly and entirely ignores NULLs:


numer = (double) samplerows * (double) d;

denom = (double) (samplerows - f1) +
        (double) f1 * (double) samplerows / totalrows;

...
if (stadistinct > totalrows)
stadistinct = totalrows;
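
To quantify the effect, here is a toy computation of that estimator, run 
once with NULLs counted in the sample size (as the code above does) and 
once restricted to non-NULL values. All numbers are made up for 
illustration: a 30000-row sample of a 1M-row table with 95% NULLs, 1000 
distinct values seen, 800 of them exactly once:

#include <stdio.h>

/* Duj1 estimator: n*d / ((n - f1) + f1*n/N) */
static double
duj1(double n, double d, double f1, double N)
{
    return (n * d) / ((n - f1) + f1 * n / N);
}

int main(void)
{
    double d = 1000.0, f1 = 800.0;

    printf("with NULLs:    %.0f\n", duj1(30000.0, d, f1, 1000000.0));   /* ~1027 */
    printf("non-NULL only: %.0f\n", duj1(1500.0, d, f1, 50000.0));      /* ~2072 */
    return 0;
}

The NULL-inflated sample size roughly halves the estimate in this setup.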

For tables with a large fraction of NULLs, this seems to significantly 
underestimate the ndistinct value - for example, consider a trivial table 
with 95% NULL values and ~10k distinct values with a skewed distribution:


create table t (id int);

insert into t
select (case when random() < 0.05 then (10000 * random() * random())::int
        else null end) from generate_series(1, 1000000) s(i);

In practice, there are 8325 distinct values in my sample:

test=# select count(distinct id) from t;
 count
-------
  8325
(1 row)

But after ANALYZE with default statistics target (100), ndistinct is 
estimated to be 

Re: [HACKERS] insert/update performance

2016-01-23 Thread Amit Kapila
On Sat, Jan 23, 2016 at 1:16 PM, Jinhua Luo  wrote:
>
> Hi,
>
> The vacuum doesn't recycle the rows obsoleted by update?
>

It does free up the space which can be used by future inserts.

> I don't think
> so. In the above vacuum result, I do not delete any rows, but the
> vacuum still recycles a fraction of rows; obviously they're obsoleted
> by update.
>
> I know plain vacuum (without the full option) does not reduce the size of
> the whole table file/segments, but it should refresh the FSM. In my
> case, the update statement did modify the index column, but is it
> related to the FSM? I think anyway the update would obsolete
> previous versions; as long as they are not held by any active
> transactions, they would be recycled and counted in the FSM, right?

I also think it will be added to the FSM.

>  I
> just cannot understand why the recycle ratio is not 50%.
>

At the moment, I am also not able to see why it is so.  You might
want to first try a simple test (can you extract the insert/update
statements from the application, run them manually a couple of times,
and then run VACUUM to see the result?).

By any chance, have you set a value for vacuum_defer_cleanup_age?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread David Rowley
On 23 January 2016 at 12:44, Tom Lane  wrote:
> I spent some time looking through this but decided that it needs some work
> to be committable.
>
> The main thing I didn't like was that identify_key_vars seems to have a
> rather poorly chosen API: it mixes up identifying a rel's pkey with
> sorting through the GROUP BY list.  I think it would be better to create
> a function that, given a relid, hands back the OID of the pkey constraint
> and a Bitmapset of the column numbers in the pkey (or 0/NULL if there's
> no pkey or it's deferrable).  The column numbers should be offset by
> FirstLowInvalidHeapAttributeNumber so that a pkey on OID can be
> represented correctly --- compare RelationGetIndexAttrBitmap().

That seems like a much better design to me too. I've made changes and
attached the updated patch.

> The reason this seems like a more attractive solution is that the output
> of the pg_constraint lookup becomes independent of the particular query
> and thus is potentially cacheable (in the relcache, say).  I do not think
> we need to cache it right now but I'd like to have the option to do so.

I've not touched that yet, but good idea.

> (I wonder BTW whether check_functional_grouping couldn't be refactored
> to use the output of such a function, too.)

I've attached a separate patch for that too. It applies after the
prunegroupby patch.

> Some lesser points:
>
> * I did not like where you plugged in the call to
> remove_useless_groupby_columns; there are weird interactions with grouping
> sets as to whether it will get called or not, and also that whole chunk
> of code is due for refactoring.  I'd be inclined to call it from
> subquery_planner instead, maybe near where the HAVING clause preprocessing
> happens.

I've moved the call to subquery_planner().

> * You've got it iterating over every RTE, even those that aren't
> RTE_RELATION RTEs.  It's pure luck that identify_key_vars doesn't fail
> outright when passed a bogus relid, and it's certainly wasteful.

:-( I've fixed that now.

> * Both of the loops iterating over the groupClause neglect to check
> varlevelsup, thus leading to assert failures or worse if an outer-level
> Var is present in the GROUP BY list.  (I'm pretty sure outer Vars can
> still be present at this point, though I might be wrong.)

Fixed in the first loop, and the way I've rewritten the code to use
bms_difference, there's no need to check again in the 2nd loop.

> * I'd be inclined to see if the relvarlists couldn't be converted into
> bitmapsets too.  Then the test to see if the pkey is a subset of the
> grouping columns becomes a simple and cheap bms_is_subset test.  You don't
> really need surplusvars either, because you can instead test Vars for
> membership in the pkey bitmapsets that pg_constraint.c will be handing
> back.

I've changed it to use a bitmapset now, but I've kept surplusvars;
having this allows a single pass over the GROUP BY clause to remove the
surplus Vars, rather than doing it again and again for each relation.
I've also found a better way so that the array is only allocated if some
surplus Vars are found. I understand what you mean, and yes, it is
possible to get rid of it, but I'd still need to add something else to
mark that this rel's extra Vars are okay to be removed. I could do
that by adding another bitmapset and marking the relid in that, but I
quite like how I've changed it now, as it saves having to check
varlevelsup again on the Vars in the GROUP BY clause: I just use
bms_difference() on the originally found Vars, subtracting off the PK
vars, and assume all of those are surplus.
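
A minimal sketch of that last step (my illustration, assuming pkattnos
and relattnos are built per relation as in the patch):

if (bms_subset_compare(pkattnos, relattnos) == BMS_SUBSET1)
{
    /*
     * The pkey is a proper subset of the grouping columns: everything
     * else this rel contributes to the GROUP BY is functionally
     * dependent on it, hence surplus.
     */
    surplusvars[relid] = bms_difference(relattnos, pkattnos);
}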

> * What you did to join.sql/join.out seems a bit bizarre.  The existing
> test case that you modified was meant to test something else, and
> conflating this behavior with the pre-existing one (and not documenting
> it) is confusing as can be.  A bit more work on regression test cases
> seems indicated.

The history behind that is that at one point during development the
patch had caused that test to start failing, due to the GROUP BY item
being removed and therefore allowing the join removal conditions to be
met. On testing again with the old test query I see this no longer
happens, so I've removed the change, although the expected output still
differs due to the GROUP BY item being removed.

> I'm going to set this back to "Waiting on Author".  I think the basic
> idea is good but it needs a rewrite.

Thanks for the review and the good ideas.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


prune_group_by_clause_ab4f491_2016-01-23.patch
Description: Binary data


check_functional_grouping_refactor.patch
Description: Binary data



Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread David Rowley
On 24 January 2016 at 00:56, Julien Rouhaud  wrote:
> I wonder if in remove_useless_groupby_columns(), in the foreach loop you
> could change the
>
> +   if (bms_subset_compare(pkattnos, relattnos) == BMS_SUBSET1)
> +   {
>
> by something like
>
>
> +   if (bms_num_members(relattnos) <= bms_num_members(pkattnos))
> +   continue;
> +
> +   if (bms_is_subset(pkattnos, relattnos))
> +   {
>
>
> which may be cheaper.

Thanks for looking over this again.

I actually had that part written quite a few different ways before I
finally settled on bms_subset_compare(). I didn't benchmark, but I
figured 1 function call was better than 2; I had it as
bms_is_subset(pkattnos, relattnos) && !bms_is_subset(relattnos,
pkattnos), and again with !bms_equal() instead of the 2nd subset test.
I'm thinking that 3 function calls likely won't make things better.
I can't imagine it's going to cost much either way, so I doubt it's
worth trying to check what's faster. Although the thing about
bms_num_members() is that it's going to loop over each word in the
bitmap no matter what, whereas a subset check can abort early in some
cases.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread Julien Rouhaud
On 23/01/2016 10:44, David Rowley wrote:
> On 23 January 2016 at 12:44, Tom Lane  wrote:
>> I spent some time looking through this but decided that it needs some work
>> to be committable.
>>
>> The main thing I didn't like was that identify_key_vars seems to have a
>> rather poorly chosen API: it mixes up identifying a rel's pkey with
>> sorting through the GROUP BY list.  I think it would be better to create
>> a function that, given a relid, hands back the OID of the pkey constraint
>> and a Bitmapset of the column numbers in the pkey (or 0/NULL if there's
>> no pkey or it's deferrable).  The column numbers should be offset by
>> FirstLowInvalidHeapAttributeNumber so that a pkey on OID can be
>> represented correctly --- compare RelationGetIndexAttrBitmap().
> 
> That seems like a much better design to me too. I've made changes and
> attached the updated patch.
> 
>> The reason this seems like a more attractive solution is that the output
>> of the pg_constraint lookup becomes independent of the particular query
>> and thus is potentially cacheable (in the relcache, say).  I do not think
>> we need to cache it right now but I'd like to have the option to do so.
> 
> I've not touched that yet, but good idea.
> 
>> (I wonder BTW whether check_functional_grouping couldn't be refactored
>> to use the output of such a function, too.)
> 
> I've attached a separate patch for that too. It applies after the
> prunegroupby patch.
> 

This refactoring makes the code much better, +1 from me.

I also reviewed it; it looks fine. However, the following removed
comment could be kept for clarity:

-   /* The PK is a subset of grouping_columns, so we win */



>> Some lesser points:
>>
>> * I did not like where you plugged in the call to
>> remove_useless_groupby_columns; there are weird interactions with grouping
>> sets as to whether it will get called or not, and also that whole chunk
>> of code is due for refactoring.  I'd be inclined to call it from
>> subquery_planner instead, maybe near where the HAVING clause preprocessing
>> happens.
> 
> I've moved the call to subquery_planner()
> 
>> * You've got it iterating over every RTE, even those that aren't
>> RTE_RELATION RTEs.  It's pure luck that identify_key_vars doesn't fail
>> outright when passed a bogus relid, and it's certainly wasteful.
> 
> :-( I've fixed that now.
> 
>> * Both of the loops iterating over the groupClause neglect to check
>> varlevelsup, thus leading to assert failures or worse if an outer-level
>> Var is present in the GROUP BY list.  (I'm pretty sure outer Vars can
>> still be present at this point, though I might be wrong.)
> 
> Fixed in the first loop, and the way I've rewritten the code to use
> bms_difference, there's no need to check again in the 2nd loop.
> 
>> * I'd be inclined to see if the relvarlists couldn't be converted into
>> bitmapsets too.  Then the test to see if the pkey is a subset of the
>> grouping columns becomes a simple and cheap bms_is_subset test.  You don't
>> really need surplusvars either, because you can instead test Vars for
>> membership in the pkey bitmapsets that pg_constraint.c will be handing
>> back.
> 
> I've changed to use a bitmapset now, but I've kept surplusvars, having
> this allows a single pass over the group by clause to remove the
> surplus Vars, rather than doing it again and again for each relation.
> I've also found a better way so that array is only allocated if some
> surplus Vars are found. I understand what you mean, and yes, it is
> possible to get rid of it, but I'd need to still add something else to
> mark that this rel's extra Vars are okay to be removed. I could do
> that by adding another bitmapset and marking the relid in that, but I
> quite like how I've changed it now as it saves having to check
> varlevelsup again on the Vars in the GROUP BY clause, as I just use
> bms_difference() on the originally found Vars subtracting off the PK
> vars, and assume all of those are surplus.
> 

I wonder if in remove_useless_groupby_columns(), in the foreach loop you
could change the

+   if (bms_subset_compare(pkattnos, relattnos) == BMS_SUBSET1)
+   {

by something like


+   if (bms_num_members(relattnos) <= bms_num_members(pkattnos))
+   continue;
+
+   if (bms_is_subset(pkattnos, relattnos))
+   {


which may be cheaper.

Otherwise, I couldn't find any issue with this new version of the patch.

>> * What you did to join.sql/join.out seems a bit bizarre.  The existing
>> test case that you modified was meant to test something else, and
>> conflating this behavior with the pre-existing one (and not documenting
>> it) is confusing as can be.  A bit more work on regression test cases
>> seems indicated.
> 
> The history behind that is that at one point during developing the
> patch that test had started failing due to the group by item being
> removed therefore allowing the join removal conditions to be met. On
> testing again with 

Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
Hi,

2016-01-23 18:40 GMT+08:00 Amit Kapila :
> At the moment, I am also not able to see why it is so.  You might
> want to first try with a simple test (Can you extract insert/update
> statements from application and run it manually for couple of times
> and then run Vacuum to see the result).

I tried it manually, and the issue is the same. It's weird that for
the index the number of removed rows is correct; just the table
itself is wrong. (Sometimes it's correct too - it seems to be a
random issue, which is confusing.)

>
> By anychance have you set a value for vacuum_defer_cleanup_age?
>

No, I have not configured it.

Regards,
Jinhua Luo




Re: [HACKERS] pgbench stats per script & other stuff

2016-01-23 Thread Fabien COELHO


Hello Alvaro,


I'm looking at this part of your patch and I think it's far too big to
be a simple refactoring. Would you split it up please?


You know how delighted I am to split patches...

Here is a 5-part ordered patch series:

a) add -b option for accumulating builtins and rework internal script
   management so that builtin and external scripts are managed the
   same way.

b) refactor statistics collection (per thread, per command, per whatever)
   so as to use the same structure everywhere, reducing the CLOC by 115.
   This enables the next small patch, which can reuse the new functions.

c) add per-script statistics... because Josh asked:-)

d) add optional weight to control the relative frequency of scripts
   (a selection sketch follows below).

e) minor code cleanup:
   use bool instead of int where appropriate
   put together struct fields when they belong together
   move 2 options to their right position in the list

This patch series conflicts slightly with the "add functions to pgbench" 
patch which is marked as ready in the CF. Whichever makes it first will 
mean some conflict resolution for the other. I would prefer this series 
to go first, if I had any say...
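
For (d), here is a minimal sketch of weighted script selection. The
Script struct and choose_script() are hypothetical illustrations, not
pgbench's actual code (which also uses its own PRNG rather than rand()):

#include <stdlib.h>

typedef struct
{
    const char *name;
    int         weight;     /* relative frequency of this script */
} Script;

/* Pick a script index with probability proportional to its weight. */
static int
choose_script(const Script *scripts, int nscripts, int total_weight)
{
    int r = rand() % total_weight;

    for (int i = 0; i < nscripts; i++)
    {
        r -= scripts[i].weight;
        if (r < 0)
            return i;
    }
    return nscripts - 1;    /* not reached if weights sum to total_weight */
}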


--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 541d17b..fdd3331 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,23 @@ pgbench  options  dbname
 benchmarking arguments:
 
 
+ 
+  -b scriptname
+  --builtin scriptname
+  
+   
+Add the specified builtin script to the list of executed scripts.
+Available builtin scripts are: tpcb-like,
+simple-update and select-only.
+The provided scriptname needs only to be a prefix
+of the builtin name, hence simp would be enough to select
+simple-update.
+With special name list, show the list of builtin scripts
+and exit immediately.
+   
+  
+ 
+
 
  
   -c clients
@@ -307,14 +324,13 @@ pgbench  options  dbname
  
 
  
-  -f filename
-  --file=filename
+  -f filename
+  --file=filename
   

-Read transaction script from filename.
+Add a transaction script read from filename to
+the list of executed scripts.
 See below for details.
--N, -S, and -f
-are mutually exclusive.

   
  
@@ -404,10 +420,8 @@ pgbench  options  dbname
   --skip-some-updates
   

-Do not update pgbench_tellers and
-pgbench_branches.
-This will avoid update contention on these tables, but
-it makes the test case even less like TPC-B.
+Run builtin simple-update script.
+Shorthand for -b simple-update.

   
  
@@ -512,9 +526,9 @@ pgbench  options  dbname
 Report the specified scale factor in pgbench's
 output.  With the built-in tests, this is not necessary; the
 correct scale factor will be detected by counting the number of
-rows in the pgbench_branches table.  However, when testing
-custom benchmarks (-f option), the scale factor
-will be reported as 1 unless this option is used.
+rows in the pgbench_branches table.
+However, when testing only custom benchmarks (-f option),
+the scale factor will be reported as 1 unless this option is used.

   
  
@@ -524,7 +538,8 @@ pgbench  options  dbname
   --select-only
   

-Perform select-only transactions instead of TPC-B-like test.
+Run built-in select-only script.
+Shorthand for -b select-only.

   
  
@@ -674,7 +689,17 @@ pgbench  options  dbname
   What is the Transaction Actually Performed in pgbench?
 
   
-   The default transaction script issues seven commands per transaction:
+   Pgbench executes test scripts chosen randomly from a specified list.
+   They include built-in scripts with -b and
+   user-provided custom scripts with -f.
+ 
+
+  
+   The default builtin transaction script (also invoked with -b tpcb-like)
+   issues seven commands per transaction over randomly chosen aid,
+   tid, bid and balance.
+   The scenario is inspired by the TPC-B benchmark, but is not actually TPC-B,
+   hence the name.
   
 
   
@@ -688,9 +713,15 @@ pgbench  options  dbname
   
 
   
-   If you specify -N, steps 4 and 5 aren't included in the
-   transaction.  If you specify -S, only the SELECT is
-   issued.
+   If you select the simple-update builtin (also -N),
+   steps 4 and 5 aren't included in the transaction.
+   This will avoid update contention on these tables, but
+   it makes the test case even less like TPC-B.
+  
+
+  
+   If you select the select-only builtin (also -S),
+   only the SELECT is issued.
   
  
 
@@ -702,10 +733,7 @@ pgbench  options  dbname
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
 

Re: [HACKERS] Relation extension scalability

2016-01-23 Thread Amit Kapila
On Sat, Jan 23, 2016 at 12:19 PM, Amit Kapila wrote:

> On Tue, Jan 12, 2016 at 2:41 PM, Dilip Kumar 
> wrote:
>
>> On Thu, Jan 7, 2016 at 4:53 PM, Andres Freund  wrote:
>>
>>> On 2016-01-07 16:48:53 +0530, Amit Kapila wrote:
>>>
>>> I think it's a worthwhile approach to pursue. But until it actually
>>> fixes the problem of leaving around uninitialized pages I don't think
>>> it's very meaningful to do performance comparisons.
>>>
>>
>> Attached patch solves this issue, I am allocating the buffer for each
>> page and initializing the page, only after that adding to FSM.
>>
>
> Few comments about patch:
>
>
I found one more problem with the patch.

! UnlockReleaseBuffer(buffer);
! RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer),
!                         freespace);

You can't call BufferGetBlockNumber(buffer) after releasing
the pin on the buffer, which is what UnlockReleaseBuffer()
does.  Get the block number before unlocking the buffer.
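
A sketch of the corrected ordering (assuming the same surrounding
variables as the patch under review):

BlockNumber blkno = BufferGetBlockNumber(buffer);   /* while the pin is held */

UnlockReleaseBuffer(buffer);
RecordPageWithFreeSpace(relation, blkno, freespace);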

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] PoC: Partial sort

2016-01-23 Thread Tomas Vondra

Hi,

On 10/20/2015 01:17 PM, Alexander Korotkov wrote:

On Fri, Oct 16, 2015 at 7:11 PM, Alexander Korotkov wrote:

On Sun, Jun 7, 2015 at 11:01 PM, Peter Geoghegan wrote:

On Sun, Jun 7, 2015 at 8:10 AM, Andreas Karlsson wrote:
> Are you planning to work on this patch for 9.6?

FWIW I hope so. It's a nice patch.


I'm trying to whisk off the dust. A rebased version of the patch is
attached. This patch isn't passing regression tests because of plan
changes. I'm not yet sure about those changes: why do they happen, and
are they really regressions?
Since I'm not very familiar with planning of INSERT ON CONFLICT and
RLS, any help is appreciated.


The planner regression is fixed in the attached version of the patch. It
appears that get_cheapest_fractional_path_for_pathkeys() behaved wrongly
when no ordering is required.



Alexander, are you working on this patch? I'd like to look at the patch, 
but the last available version (v4) no longer applies - there's plenty 
of bitrot. Do you plan to send an updated / rebased version?



The main thing I'm particularly interested in is how much this is 
coupled with the Sort node, and whether it's possible to feed partially 
sorted tuples into other nodes.


I'm particularly thinking about Hash Aggregate, because the partial sort 
allows keeping only the "current group" in a hash table, making it much 
more memory efficient / faster. What do you think?
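
To sketch what I mean (a standalone toy, not executor code): rows arrive 
sorted on column a, we aggregate counts grouped by (a, b), and the "hash 
table" only ever holds the current a group:

#include <stdio.h>

typedef struct { int a; int b; } Row;

#define MAX_B 8         /* toy hash table: direct-indexed by b */

static void
flush_group(int a, int *counts)
{
    for (int b = 0; b < MAX_B; b++)
    {
        if (counts[b] > 0)
            printf("a=%d b=%d count=%d\n", a, b, counts[b]);
        counts[b] = 0;
    }
}

int main(void)
{
    Row rows[] = {{1, 2}, {1, 2}, {1, 5}, {2, 0}, {2, 5}, {3, 1}};
    int counts[MAX_B] = {0};
    int cur_a = rows[0].a;

    for (int i = 0; i < 6; i++)
    {
        if (rows[i].a != cur_a)
        {
            flush_group(cur_a, counts);     /* prefix changed: emit and reset */
            cur_a = rows[i].a;
        }
        counts[rows[i].b]++;
    }
    flush_group(cur_a, counts);
    return 0;
}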


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Improved tab completion for FDW DDL

2016-01-23 Thread Peter Eisentraut
On 1/18/16 8:36 PM, Andreas Karlsson wrote:
> On 01/11/2016 02:39 AM, Peter Eisentraut wrote:
>> The second part is not necessary, because there is already code that
>> completes FDW names after "FOREIGN DATA WRAPPER".  So this already works.
> 
> Good spot, thanks. I have no idea why I thought it did not work. Maybe
> it just did not work in 9.2 and I failed when trying to reproduce it on
> master.
> 
>> - Also complete RENAME TO in ALTER FOREIGN DATA WRAPPER.
> 
> Done.
> 
>> - Also complete OPTIONS in FOREIGN DATA WRAPPER and SERVER commands.
> 
> Done.

committed





Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
Noah Misch  writes:
> On Sat, Jan 23, 2016 at 12:08:40PM -0500, Tom Lane wrote:
>> So I pushed that, and tern/sungazer are still failing.  Noah, could you
>> trace through that and see exactly where it's going off the rails?

> The second sin() is a constant, so gcc computes it immediately but sends the
> first sin() to libm.  The libm sin() is slightly more accurate.

Ugh.  "compile-time simplification uses different version of sin()" was
one of the theories I had in mind, but I was hoping that wasn't it
because it'd be the hardest to work around reliably.  Still, I think it's
doable by caching the results of the should-be-constant subexpressions.
Will get on it.

regards, tom lane




Re: [HACKERS] insert/update performance

2016-01-23 Thread Tom Lane
Jinhua Luo  writes:
> I have a table with 70 columns, and 6 indexes. The data flow is a
> special OLTP model: frequent inserts (2000 tps), and each inserted row
> would be updated very soon (i.e. the number of inserts is equal to the
> number of updates).

Do those predictable updates change any of the indexed columns?

> I do a simple test: I truncate the table, disable the autovacuum, and
> run the application for a few minutes, then I invokes vacuum manually,
> it gives a strange output:
> found 598 removable, 25662 nonremovable row versions in 3476 pages
> DETAIL:  0 dead row versions cannot be removed yet
> As said before, the number of inserts is equal to the number of
> updates. So the bloat of the table should be 100%, and the number of
> removable rows should be equal to the number of nonremovable rows,
> which is the real number of inserts issued by the application.

What seems likely is that most of the updates are HOT (because they
don't change any indexed columns) and then the freed space is reclaimable
by subsequent updates on the same page without needing a VACUUM.

Watching the insert/update/hot-update counts in pg_stat_all_tables would
provide some evidence.

regards, tom lane




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Noah Misch
On Sat, Jan 23, 2016 at 12:08:40PM -0500, Tom Lane wrote:
> I wrote:
> > So the early returns from the buildfarm aren't very good:
> > * tern/sungazer isn't getting exactly 0.5 from sind(30).
> 
> > The tern/sungazer result implies that this:
> 
> > return (sin(x * (M_PI / 180.0)) / sin(30.0 * (M_PI / 180.0))) / 2.0;
> 
> > is not producing exactly 0.5, which means that the two sin() calls aren't
> > producing identical results, which I suspect is best explained by the
> > theory that the compiler is rearranging 30.0 * (M_PI / 180.0) into
> > (30.0 * M_PI) / 180.0, and getting a slightly different number that way.
> 
> > I think we could fix that by replacing (M_PI / 180.0) by a hard-wired
> > constant (computed to say 20 digits or so).
> 
> So I pushed that, and tern/sungazer are still failing.  Noah, could you
> trace through that and see exactly where it's going off the rails?

The second sin() is a constant, so gcc computes it immediately but sends the
first sin() to libm.  The libm sin() is slightly more accurate.  In %a
notation, AIX libm computes sin(30.0 * RADIANS_PER_DEGREE) as 0x1p-1 while gcc
computes it as 0x1.fp-2, a difference of one ULP.  (Both "30.0 *
RADIANS_PER_DEGREE" and "30.0 * (M_PI / 180.0)" match the runtime computation
of 0x1.0c152382d7365p-1.)

To reliably produce exact answers, this code must delay all trigonometric
calculations to runtime.  On sungazer, the float8 test happens to pass if I
rebuild float.c with -fno-builtin-sin; that leaves calls like acos(0.5) and
cos(60.0 * RADIANS_PER_DEGREE) unprotected, though.
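
For illustration, here is a sketch of the volatile-laundering idea (in 
the spirit of the discussion, not necessarily the exact committed fix): 
routing the constants through volatile variables forces both sin() 
calls to the same runtime libm code path:

#include <math.h>

static double sin_30;

static void
init_degree_constants(void)
{
    volatile double thirty = 30.0;
    volatile double radians_per_degree = M_PI / 180.0;

    sin_30 = sin(thirty * radians_per_degree);  /* not constant-folded */
}

static double
sind_q1(double x)
{
    volatile double radians_per_degree = M_PI / 180.0;

    /* exactly 0.5 at x = 30, since numerator and denominator now match */
    return (sin(x * radians_per_degree) / sin_30) / 2.0;
}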




Re: [HACKERS] silent data loss with ext4 / all current versions

2016-01-23 Thread Michael Paquier
On Sat, Jan 23, 2016 at 11:39 AM, Tomas Vondra
 wrote:
> On 01/23/2016 02:35 AM, Michael Paquier wrote:
>>
>> On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark  wrote:
>>> On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra
>>>  wrote:
>>> LVM snapshots would have the advantage that you can keep running the
>>> database and you can take lots of snapshots with relatively little
>>> overhead. Having dozens or hundreds of snapshots would be unacceptable
>>> performance drain in production but for testing it should be practical
>>> and they take relatively little space -- just the blocks changed since
>>> the snapshot was taken.
>>
>>
>> Another idea: hardcode a PANIC just after rename() with
>> restart_after_crash = off (this needs IsBootstrapProcess() checks).
>> Once the server crashes, kill -9 the VM. Then restart the VM and the
>> Postgres instance with a new binary that does not have the PANIC, and
>> see how things are moving on. There is a window of up to several
>> seconds after the rename() call, so I guess that this would work.
>
>
> I don't see how that would improve anything, as the PANIC has no impact on
> the I/O requests already issued to the system. What you need is some sort of
> coordination between the database and the script that kills the VM (or takes
> an LVM snapshot).

Well, to emulate the noise that non-renamed files create on the system,
we could simply emulate the loss of rename() by commenting it out and
then forcibly crashing the instance, or just PANIC the instance just
before rename(). This would emulate what we are looking for, no? What
we want to check is how the system reacts should an unwanted file be
in place.
For example, take the rename() call in InstallXLogFileSegment():
crashing with a non-effective rename() will leave behind an annoying
xlogtemp file. Making the rename persistent would make the server
complain about an invalid magic number in a segment that has just been
created.
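
To make the fault injection concrete, a sketch (FAULT_INJECT_RENAME is a
hypothetical compile-time switch, not an existing facility):

if (rename(tmppath, path) < 0)
    ereport(LOG,
            (errcode_for_file_access(),
             errmsg("could not rename file \"%s\" to \"%s\"",
                    tmppath, path)));
#ifdef FAULT_INJECT_RENAME
/* Simulated crash: die before the rename can reach stable storage. */
elog(PANIC, "fault injection: crashing after rename of \"%s\"", path);
#endif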
-- 
Michael




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread Julien Rouhaud
On 23/01/2016 14:51, David Rowley wrote:
> On 24 January 2016 at 00:56, Julien Rouhaud  wrote:
>> I wonder if in remove_useless_groupby_columns(), in the foreach loop you
>> could change the
>>
>> +   if (bms_subset_compare(pkattnos, relattnos) == BMS_SUBSET1)
>> +   {
>>
>> by something like
>>
>>
>> +   if (bms_num_members(relattnos) <= bms_num_members(pkattnos))
>> +   continue;
>> +
>> +   if (bms_is_subset(pkattnos, relattnos))
>> +   {
>>
>>
>> which may be cheaper.
> 
> Thanks for looking over this again.
> 
> I actually had that part written quite a few different ways before I
> finally settled on bms_subset_compare(). I didn't benchmark, but I
> figured 1 function call was better than 2; I had it as
> bms_is_subset(pkattnos, relattnos) && !bms_is_subset(relattnos,
> pkattnos), and again with !bms_equal() instead of the 2nd subset test.
> I'm thinking that 3 function calls likely won't make things better.
> I can't imagine it's going to cost much either way, so I doubt it's
> worth trying to check what's faster. Although the thing about
> bms_num_members() is that it's going to loop over each word in the
> bitmap no matter what, whereas a subset check can abort early in some
> cases.
> 
> 

FWIW, this code was a simplified example.  bms_num_members(relattnos) is
already called a few lines before, so it'd be 1 function call against 2
function calls (and a variable assignment).

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread Tom Lane
David Rowley  writes:
> On 23 January 2016 at 12:44, Tom Lane  wrote:
>> * What you did to join.sql/join.out seems a bit bizarre.  The existing
>> test case that you modified was meant to test something else, and
>> conflating this behavior with the pre-existing one (and not documenting
>> it) is confusing as can be.  A bit more work on regression test cases
>> seems indicated.

> The history behind that is that at one point during developing the
> patch that test had started failing due to the group by item being
> removed therefore allowing the join removal conditions to be met. On
> testing again with the old test query I see this no longer happens, so
> I've removed the change, although the expected output still differs
> due to the group by item being removed.

Hmm ... but ... it seems to me that the test as it stands *should* fail
after this patch, because once the non-pkey grouping column is removed
the join removal optimization should apply.  I think we should look a bit
more closely at what's happening there.

(IOW, I wasn't so much unhappy with the change to that test case as
that it was being used as the only test case for this new behavior.
I see you added some new, separate test cases, so that's good; but
there's something fishy if the existing case doesn't change behavior.)

regards, tom lane




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
Peter Eisentraut  writes:
> On 1/23/16 12:08 PM, Tom Lane wrote:
>> So I pushed that, and tern/sungazer are still failing.  Noah, could you
>> trace through that and see exactly where it's going off the rails?

> I'm still getting a failure in float8 on OS X after commit 73193d8:

Weird, because my OS X critters are all happy.  Which OS X and compiler
version, exactly?  Any special compile flags?

regards, tom lane




Re: [HACKERS] proposal: function parse_ident

2016-01-23 Thread Tom Lane
Michael Paquier  writes:
> On Sat, Jan 23, 2016 at 1:25 AM, Marko Tiikkaja  wrote:
> +  errmsg("identifier contains disallowed chars"),
> +  errdetail("string \"%s\" is not valid identifier",
> +            text_to_cstring(qualname))));
> Perhaps, "identifier contains not allowed character" is better?

"disallowed" reads better to me.  I agree with expanding "chars" to
"characters" though.  Also, the errdetail is conveying no actual extra
detail AFAICS.  I'd go with something like

errmsg("identifier contains disallowed characters: \"%s\"",
   text_to_cstring(qualname)));

regards, tom lane


Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
I wrote:
> So the early returns from the buildfarm aren't very good:
> * tern/sungazer isn't getting exactly 0.5 from sind(30).

> The tern/sungazer result implies that this:

>   return (sin(x * (M_PI / 180.0)) / sin(30.0 * (M_PI / 180.0))) / 2.0;

> is not producing exactly 0.5, which means that the two sin() calls aren't
> producing identical results, which I suspect is best explained by the
> theory that the compiler is rearranging 30.0 * (M_PI / 180.0) into
> (30.0 * M_PI) / 180.0, and getting a slightly different number that way.

> I think we could fix that by replacing (M_PI / 180.0) by a hard-wired
> constant (computed to say 20 digits or so).

So I pushed that, and tern/sungazer are still failing.  Noah, could you
trace through that and see exactly where it's going off the rails?

regards, tom lane




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Peter Eisentraut
On 1/23/16 12:08 PM, Tom Lane wrote:
> So I pushed that, and tern/sungazer are still failing.  Noah, could you
> trace through that and see exactly where it's going off the rails?

I'm still getting a failure in float8 on OS X after commit 73193d8:

@@ -490,9 +490,9 @@
    x   | asind | acosd | atand 
 -------+-------+-------+-------
     -1 |   -90 |   180 |   -45
-  -0.5 |   -30 |   120 |      
+  -0.5 |       |   120 |      
      0 |     0 |    90 |     0
-   0.5 |    30 |    60 |      
+   0.5 |       |       |      
      1 |    90 |     0 |    45





Re: [HACKERS] proposal: function parse_ident

2016-01-23 Thread Michael Paquier
On Sat, Jan 23, 2016 at 1:25 AM, Marko Tiikkaja  wrote:
> Hi Pavel,
>
> Sorry for the lack of review here.  I didn't realize I was still the
> reviewer for this after it had been moved to another commitfest.
>
> That said, I think I've exhausted my usefulness here as a reviewer. Marking
> ready for committer.

+  errmsg("identifier contains disallowed chars"),
+  errdetail("string \"%s\" is not valid identifier",
+            text_to_cstring(qualname))));
Perhaps, "identifier contains not allowed character" is better?
-- 
Michael




Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-01-23 Thread Tom Lane
Julien Rouhaud  writes:
> I wonder if in remove_useless_groupby_columns(), in the foreach loop you
> could change the

> +   if (bms_subset_compare(pkattnos, relattnos) == BMS_SUBSET1)
> +   {

> by something like

> +   if (bms_num_members(relattnos) <= bms_num_members(pkattnos))
> +   continue;
> +
> +   if (bms_is_subset(pkattnos, relattnos))
> +   {

> which may be cheaper.

FWIW, I really doubt that would be cheaper.  The various flavors of
subset comparison are word-at-a-time bitmasking operations, but
bms_num_members has to grovel over individual bits; it's certain to
be more expensive than a subset test.
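
For illustration (simplified, fixed-size bitmapsets - not
nodes/bitmapset.c itself; the subset sketch assumes a has no more
words than b):

#include <stdbool.h>
#include <stdint.h>

typedef struct { int nwords; uint64_t words[4]; } Bitmapset;

/* One mask-and-compare per word: can return early. */
static bool
bms_is_subset_sketch(const Bitmapset *a, const Bitmapset *b)
{
    for (int i = 0; i < a->nwords; i++)
    {
        if ((a->words[i] & ~b->words[i]) != 0)
            return false;       /* a has a bit that b lacks */
    }
    return true;
}

/* Must grovel over the individual bits of every word. */
static int
bms_num_members_sketch(const Bitmapset *a)
{
    int count = 0;

    for (int i = 0; i < a->nwords; i++)
        for (uint64_t w = a->words[i]; w != 0; w >>= 1)
            count += (int) (w & 1);
    return count;
}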

regards, tom lane




Re: [HACKERS] PoC: Partial sort

2016-01-23 Thread Peter Geoghegan
On Sat, Jan 23, 2016 at 4:07 AM, Tomas Vondra
 wrote:
> The main thing I'm particularly interested in is how much is this coupled
> with the Sort node, and whether it's possible to feed partially sorted
> tuples into other nodes.

That's cool, but I'm particularly interested in seeing Alexander get
back to this because it's an important project on its own. We should
really have this.

-- 
Peter Geoghegan




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Peter Eisentraut
On 1/23/16 3:05 PM, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 1/23/16 12:08 PM, Tom Lane wrote:
>>> So I pushed that, and tern/sungazer are still failing.  Noah, could you
>>> trace through that and see exactly where it's going off the rails?
> 
>> I'm still getting a failure in float8 on OS X after commit 73193d8:
> 
> Weird, because my OS X critters are all happy.  Which OS X and compiler
> version, exactly?  Any special compile flags?

I'm using gcc 4.8.  It passes with the system clang.  So the explanation
is probably along the lines of what Noah has described.





Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
Peter Eisentraut  writes:
> On 1/23/16 3:05 PM, Tom Lane wrote:
>> Peter Eisentraut  writes:
>>> I'm still getting a failure in float8 on OS X after commit 73193d8:

>> Weird, because my OS X critters are all happy.  Which OS X and compiler
>> version, exactly?  Any special compile flags?

> I'm using gcc 4.8.  It passes with the system clang.  So the explanation
> is probably along the lines of what Noah has described.

Ah.  Please see if what I just pushed fixes it.

regards, tom lane




Re: [HACKERS] Combining Aggregates

2016-01-23 Thread Jeff Janes
On Fri, Jan 22, 2016 at 4:53 PM, David Rowley
 wrote:
>
> It seems that I must have mistakenly believed that non-existing
> columns for previous versions were handled using the power of magic.
> Turns out that I was wrong, and they need to be included as dummy
> columns in the queries for previous versions.
>
> The attached fixes the problem.

Yep, that fixes it.

Thanks,

Jeff




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Noah Misch
On Sat, Jan 23, 2016 at 05:04:56PM -0500, Tom Lane wrote:
> Noah Misch  writes:
> > To reliably produce exact answers, this code must delay all trigonometric
> > calculations to runtime.  On sungazer, the float8 test happens to pass if I
> > rebuild float.c with -fno-builtin-sin; that leaves calls like acos(0.5) and
> > cos(60.0 * RADIANS_PER_DEGREE) unprotected, though.
> 
> Either I missed something or there's another issue, because tern/sungazer
> are *still* failing.  This is getting annoying :-(

sungazer's "make check" passes if I change init_degree_constants() to be
non-static.  Duping gcc isn't so easy these days.




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
Noah Misch  writes:
> To reliably produce exact answers, this code must delay all trigonometric
> calculations to runtime.  On sungazer, the float8 test happens to pass if I
> rebuild float.c with -fno-builtin-sin; that leaves calls like acos(0.5) and
> cos(60.0 * RADIANS_PER_DEGREE) unprotected, though.

Either I missed something or there's another issue, because tern/sungazer
are *still* failing.  This is getting annoying :-(

regards, tom lane




Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2016-01-23 Thread Tom Lane
Aleksander Alekseev  writes:
> Oops, wrong patches - here are correct ones.

This is certainly not doing what I had in mind for communication
between ResourceOwnerGetAny and ResourceOwnerRelease, because the
latter pays zero attention to "lastidx", but instead does a blind
hash search regardless.

In general, I'm suspicious of the impact of this patch on "normal"
cases with not-very-large resource arrays.  It might be hard to
measure that because in such cases resowner.c is not a bottleneck;
but that doesn't mean I want to give up performance in cases that
perform well today.  You could probably set up a test harness that
exercises ResourceOwnerAdd/Release/etc in isolation and get good
timings for them that way, to confirm or disprove whether there's
a performance loss for small arrays.

I still suspect it would be advantageous to have two different operating
modes depending on the size of an individual resource array, and not
bother with hashing until you got to more than a couple dozen entries.
Given that you're rehashing anyway when you enlarge the array, this
wouldn't cost anything except a few more lines of code, ISTM --- and
you already have a net code savings because of the refactoring.

Also, there are defined ways to convert between Datum format and
other formats (DatumGetPointer() etc).  This code isn't using 'em.
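
To sketch the two-mode idea (hypothetical names and layout, not
resowner.c's actual structures):

#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t Datum;            /* stand-in for PostgreSQL's Datum */

#define RESARRAY_HASH_THRESHOLD 32  /* switch to hashing above this */

typedef struct
{
    Datum  *items;
    int     nitems;
    bool    hashed;     /* set when the array is enlarged past the threshold */
} ResourceArray;

static bool resarray_hash_remove(ResourceArray *ra, Datum value);  /* not shown */

static bool
resarray_remove(ResourceArray *ra, Datum value)
{
    if (!ra->hashed)
    {
        /* Linear scan: cheapest for a couple dozen entries. */
        for (int i = 0; i < ra->nitems; i++)
        {
            if (ra->items[i] == value)
            {
                ra->items[i] = ra->items[--ra->nitems];
                return true;
            }
        }
        return false;
    }
    return resarray_hash_remove(ra, value);     /* hash lookup for large arrays */
}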

regards, tom lane




Re: [HACKERS] pg_dump fails on domain constraint comments

2016-01-23 Thread Elvis Pranskevichus
On January 22, 2016 08:09:36 PM Alvaro Herrera wrote:
> Michael Paquier wrote:
> > On Tue, Jan 12, 2016 at 7:56 AM, Elvis Pranskevichus wrote:
> > > It looks like pg_dump emits incorrect text for domain constraint
> > > comments:
> > > 
> > > Assuming the following structure,
> > 
> > Nice catch! qtypname already has fmtId applied to it, so quotes are
> > applied twice to it in this case. I am adding an entry in the next CF
> > with patch marked as ready for committer so as this does not fail into
> > oblivion. A backpatch would be welcome as well.
> 
> Yeah, evidently I didn't test that part.  Thanks Elvis, applied and
> backpatched to 9.5.
> 
> ... Oh, I just noticed you requested this feature back in 2013,
> https://www.postgresql.org/message-id/5310157.yWWCtg2qIU%40klinga.prans.org
> I hope that it's doing what you wanted!

Yes, works like a charm!  Thanks for implementing this.


Elvis




Re: [HACKERS] Proposal: Trigonometric functions in degrees

2016-01-23 Thread Tom Lane
Noah Misch  writes:
> On Sat, Jan 23, 2016 at 05:04:56PM -0500, Tom Lane wrote:
>> Either I missed something or there's another issue, because tern/sungazer
>> are *still* failing.  This is getting annoying :-(

> sungazer's "make check" passes if I change init_degree_constants() to be
> non-static.  Duping gcc isn't so easy these days.

Ugh.  Well, at least we don't have to move it to another file, which was
going to be my next larger size of hammer.

Thanks for doing the legwork on this!

regards, tom lane




Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
And I also found that pg_stat_all_tables may not be so accurate.

testdb=# truncate test;
testdb=# select pg_stat_reset_single_table_counters(42515);
testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
-----------+-----------+---------------+-----------+------------+------------
         0 |         0 |             0 |         0 |          0 |          0
(1 row)

# run application a while

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
-----------+-----------+---------------+-----------+------------+------------
     24829 |     24839 |             0 |         0 |      24829 |      24839
(1 row)

testdb=# select count(*) from test;
 count
-------
 24780
(1 row)

testdb=# vacuum verbose test;
...
DETAIL:  24780 index row versions were removed.
...
INFO:  "test": found 863 removable, 24780 nonremovable row versions in
3148 out of 3148 pages


The n_tup_ins is bigger than the actual row count, and the n_tup_upd is
even bigger than n_tup_ins!

Regards,
Jinhua Luo




Re: [HACKERS] insert/update performance

2016-01-23 Thread Jinhua Luo
2016-01-23 23:00 GMT+08:00 Tom Lane :
> Jinhua Luo  writes:
>> I have a table with 70 columns, and 6 indexes. The data flow is a
>> special OLTP model: frequent inserts (2000 tps), and each inserted row
>> would be updated very soon (i.e. the number of inserts is equal to the
>> number of updates).
>
> Do those predictable updates change any of the indexed columns?

The update statement itself does not modify the indexed columns, but the
before-update trigger modifies two indexed columns: one is of
timestamp type, used to record the update time (the trigger fills
CURRENT_TIMESTAMP into this column); the other is status, which is
set to 'done'. These two columns have btree indexes.

>
>> I do a simple test: I truncate the table, disable the autovacuum, and
>> run the application for a few minutes, then I invokes vacuum manually,
>> it gives a strange output:
>> found 598 removable, 25662 nonremovable row versions in 3476 pages
>> DETAIL:  0 dead row versions cannot be removed yet
>> As said before, the number of inserts is equal to the number of
>> updates. So the bloat of the table should be 100%, and the number of
>> removable rows should be equal to the number of nonremovable rows,
>> which is the real number of inserts issued by the application.
>
> What seems likely is that most of the updates are HOT (because they
> don't change any indexed columns) and then the freed space is reclaimable
> by subsequent updates on the same page without needing a VACUUM.
>
> Watching the insert/update/hot-update counts in pg_stat_all_tables would
> provide some evidence.

testdb=# truncate test;
TRUNCATE TABLE
testdb=# vacuum test;
testdb=# select pg_stat_reset_single_table_counters(42515);
 pg_stat_reset_single_table_counters
-------------------------------------

(1 row)

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
-----------+-----------+---------------+-----------+------------+------------
         0 |         0 |             0 |         0 |          0 |          0
(1 row)

#insert 6 rows

testdb=# select
n_tup_ins,n_tup_upd,n_tup_hot_upd,n_tup_del,n_live_tup,n_dead_tup from
pg_stat_all_tables where relid=42515;
 n_tup_ins | n_tup_upd | n_tup_hot_upd | n_tup_del | n_live_tup | n_dead_tup
-----------+-----------+---------------+-----------+------------+------------
         6 |         6 |             0 |         0 |          6 |          6
(1 row)

testdb=# vacuum verbose test;
INFO:  vacuuming "public.test"
INFO:  scanned index "test_pkey" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "deliver_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "deliver_task_queue_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "message_id_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "status_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "status_report_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "submit_done_date_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  scanned index "tp_scts_idx" to remove 6 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "test": removed 6 row versions in 1 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "test_pkey" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "deliver_done_date_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "deliver_task_queue_idx" now contains 0 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "message_id_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "status_idx" now contains 6 row versions in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "status_report_done_date_idx" now contains 6 row versions
in 2 pages
DETAIL:  6 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "submit_done_date_idx" now contains 6 row 

[HACKERS] is there a deep unyielding reason to limit U&'' literals to ASCII?

2016-01-23 Thread Chapman Flack
I see in the documentation (and confirm in practice) that a
Unicode character string literal U&'...' is only allowed to have
Unicode escape values representing Unicode characters if the
server encoding is, exactly and only, UTF8.

Otherwise, it can still have Unicode escape values, but they can only
be in the range \+01 to \+7f and can only represent ASCII characters
... and this isn't just for an ASCII server encoding but for _any server
encoding other than UTF8_.

I'm a newcomer here, so maybe there was an existing long conversation
where that was determined to be necessary for some deep reason, and I
just need to be pointed to it.

What I would have expected would be to allow Unicode escape values
for any Unicode codepoint that's representable in the server encoding,
whatever encoding that is. Indeed, that's how I read the SQL standard
(or my scrounged 2006 draft of it, anyway). The standard even lets
you precede U& with _charsetname and have the escapes be allowed to
be any character representable in the specified charset. *That*, I assume,
would be tough to implement in PostgreSQL, since strings don't walk
around with their own personal charsets attached. But what's the reason
for not being able to mention characters available in the server encoding?

-Chap

