[DOCS] raise not in index

2005-02-09 Thread Rainer Brandt
Hi,

a suggestion for a small improvement in the 8.0 manual:

The term 'raise' (see section 35.9) is not in the index.

BTW, I like your documentation.
I don't remember any open source project with better docs
(I mean style, correctness, completeness, grammar, orthography,
presentation, and keeping it up to date), not even TeX.

Rainer
-- 
--
Rainer J. H. Brandtemail: [EMAIL PROTECTED]
Brandt & Brandt Computer GmbH  web:www.bb-c.de
Kamberg 111phone:  +49 2448 919126
D 53940 Hellenthal mobile: +49 172 9593205

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [DOCS] How the planner uses statistics

2005-02-09 Thread Mark Kirkwood
Post feedback changes - thanks to all who commented!
Mark Kirkwood wrote:
I wanted to understand how the planner 'knows' how many rows are likely
to be emitted in a given stage of a query, and wrote down some examples
for my own benefit - I then wondered if this would be a good addition to
the 'Performance Tips' chapter. So... err here it is.
Comments welcome.

--- perform.sgml.orig   Sat Feb  5 12:45:36 2005
+++ perform.sgmlTue Feb  8 17:15:48 2005
@@ -470,6 +470,286 @@
 
  
 
+
+ 
+  How the Planner Uses Statistics
+
+  
+   statistics
+   of the planner
+  
+
+  
+   This section builds on the material covered in the previous two and 
+   shows how the planner uses the system statistics to estimate the number of
+   rows each stage of a query might return. We will adopt the approach of 
+   showing by example, which should provide a good feel for how this works.
+  
+
+  
+   Continuing with the examples drawn from the regression test
+   database (and 8.0 sources), let's start with a simple query which has 
+   one restriction in its WHERE clause:
+   
+
+EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000;
+
+ QUERY PLAN
+
+ Seq Scan on tenk1  (cost=0.00..470.00 rows=1031 width=244)
+   Filter: (unique1 < 1000)
+
+
+
+   The planner examines the WHERE clause condition: 
+
+
+unique1 < 1000
+
+
+   and looks up the restriction function for the operator 
+   < in pg_operator. 
+   This is held in the column oprrest, 
+   and the result in this case is scalarltsel.
+   The scalarltsel function retrieves the histogram for 
+   unique1 from pg_statistics
+   - we can follow this by using the simpler pg_stats 
+   view:
+
+
+SELECT histogram_bounds FROM pg_stats 
+WHERE tablename='tenk1' AND attname='unique1';
+
+   histogram_bounds
+--
+ {1,970,1943,2958,3971,5069,6028,7007,7919,8982,9995}
+
+
+   Next the fraction of the histogram occupied by < 1000 
+   is worked out. This is the selectivity. The histogram divides the range 
+   into equal frequency buckets, so all we have to do is locate the bucket 
+   that our value is in and count part of it and 
+   all of the ones before. The value 1000 is clearly in 
+   the second (970 - 1943) bucket, so by assuming a linear distribution of 
+   values inside each bucket we can calculate the selectivity as:
+
+
+selectivity = (1 + (1000 - 970)/(1943 - 970)) / 10
+= 0.1031
+
+
+   that is, one whole bucket plus a linear fraction of the second, divided by
+   the number of buckets. The estimated number of rows can now be calculated as
+   the product of the selectivity and the cardinality of 
+   tenk1:
+
+
+rows = 1 * 0.1031
+ = 1031
+
+
+  
+
+  
+   Next let's consider an example with a WHERE clause using
+   the = operator:
+
+
+EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'AT';
+
+QUERY PLAN
+--
+ Seq Scan on tenk1  (cost=0.00..470.00 rows=31 width=244)
+   Filter: (stringu1 = 'AT'::name)
+
+
+   Again the planner examines the WHERE clause condition: 
+
+
+stringu1 = 'AT'
+
+
+   and looks up the restriction function for =, which is 
+   eqsel. This case is a bit different, as the most
+   common values — MCVs, are used to determine the 
+   selectivity. Let's have a look at these, with some extra columns that will
+   be useful later:
+
+
+SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM 
pg_stats 
+WHERE tablename='tenk1' AND attname='stringu1';
+
+null_frac | 0
+n_distinct| 672
+most_common_vals  | 
{FD,NH,AT,BG,EB,MO,ND,OW,BH,BJ}
+most_common_freqs | 
{0.0033,0.0033,0.003,0.003,0.003,0.003,0.003,0.003,0.0027,0.0027}
+
+
+   The selectivity is merely the frequency corresponding to 'AT': 
+
+
+selectivity = 0.003
+
+
+   The estimated number of rows is just the product of this with the 
+   cardinality of tenk1 as before:
+
+
+rows = 1 * 0.003
+ = 30
+
+
+   The number displayed by EXPLAIN is one more than this,
+   due to some post estimation checks.
+  
+
+  
+   Now consider the same query, but with a constant that is not in the 
+   MCV list:
+
+
+EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';
+
+QUERY PLAN
+--
+ Seq Scan on tenk1  (cost=0.00..470.00 rows=15 width=244)
+   Filter: (stringu1 = 'xxx'::name)
+
+
+   This is quite a different problem, how to estimate the selectivity when the
+   value is not in the MCV list. 
+   The approach is to use the fact that the value is not in the list, 
+   combined with the knowledge of the frequencies for all of the
+   MCVs:
+
+
+selectivity = (1.0 - (0.0033 + 0.0033 + 0.003 + 0.003 + 0.003 
++ 0.003 + 0.003 + 0.003 + 0.0027 + 0.0027

Re: [DOCS] raise not in index

2005-02-09 Thread Neil Conway
On Wed, 2005-02-09 at 15:38 +0100, Rainer Brandt wrote:
> a suggestion for a small improvement in the 8.0 manual:
> 
> The term 'raise' (see section 35.9) is not in the index.

Thanks for the suggestion; I've applied the attached patch to CVS HEAD.

-Neil

Index: doc/src/sgml/plpgsql.sgml
===
RCS file: /var/lib/cvs/pgsql/doc/src/sgml/plpgsql.sgml,v
retrieving revision 1.59
diff -c -r1.59 plpgsql.sgml
*** doc/src/sgml/plpgsql.sgml	10 Feb 2005 05:01:07 -	1.59
--- doc/src/sgml/plpgsql.sgml	10 Feb 2005 05:21:57 -
***
*** 1901,1906 
--- 1901,1911 
 
  Trapping Errors
  
+ 
+  exceptions
+  in PL/PgSQL
+ 
+ 
  
   By default, any error occurring in a PL/pgSQL
   function aborts execution of the function, and indeed of the
***
*** 2339,2344 
--- 2344,2358 

 Errors and Messages
  
+
+ RAISE
+
+ 
+
+ reporting errors
+ in PL/PgSQL
+
+ 
 
  Use the RAISE statement to report messages and
  raise errors.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])