date:20050227

Re: [DOCS] How the planner uses statistics

2005-02-27 Thread Bruce Momjian


Patch applied.  Thanks.  Your documentation changes can be viewed in
five minutes using links on the developer's page,
http://www.postgresql.org/developer/testing.


---


Mark Kirkwood wrote:
> Bruce Momjian wrote:
> > Mark Kirkwood wrote:
> >
> >>At Tom's suggestion, I am going to amend the page to fit into the
> >>'internals' chapter as opposed to 'performance tips' one. I might do
> >>this first, and send you the resulting page.
> >  
> > That sounds good, that this become part of the developer docs.
> >
> 
> Here is the amended version. I have placed it in its own chapter located
> immediately after 'bki Backend Interface', however there is nothing
> special about that location... feel free to move it around :-)
> 
> Mark
> 
> 
> 
> 
> 

> diff -Naur sgml.orig/filelist.sgml sgml/filelist.sgml
> --- sgml.orig/filelist.sgml   Mon Feb 14 15:02:16 2005
> +++ sgml/filelist.sgmlTue Feb 15 09:52:33 2005
> @@ -77,6 +77,7 @@
>  
>  
>  
> +
>  
>  
>  
> diff -Naur sgml.orig/howplanstats.sgml sgml/howplanstats.sgml
> --- sgml.orig/howplanstats.sgml   Thu Jan  1 12:00:00 1970
> +++ sgml/howplanstats.sgmlTue Feb 15 17:18:30 2005
> @@ -0,0 +1,370 @@
> +
> +
> +
> + How the Planner Uses Statistics
> +
> +  
> +   This chapter builds on the material covered in  linkend="using-explain">
> +   and , and shows how the planner uses the 
> +   system statistics to estimate the number of rows each stage in a query 
> might
> +   return. This is a significant part of the planning / optimizing process,
> +   providing much of the raw material for cost calculation.
> +  
> +
> +  
> +   The intent of this chapter is not to document the code — 
> +   better done in the code itself, but to present an overview of how it 
> works.
> +   This will perhaps ease the learning curve for someone who subsequently 
> +   wishes to read the code. As a consequence, the approach chosen is to 
> analyze
> +   a series of incrementally more complex examples. 
> +  
> +
> +  
> +   The outputs and algorithms shown below are taken from version 8.0. 
> +   The behaviour of earlier (or later) versions may vary.
> +  
> +
> + 
> +  Row Estimation Examples
> +
> +  
> +   row estimation
> +   planner
> +  
> +
> +  
> +   Using examples drawn from the regression test database, let's start with 
> a 
> +   very simple query:
> +
> +EXPLAIN SELECT * FROM tenk1;
> +
> + QUERY PLAN
> +-
> + Seq Scan on tenk1  (cost=0.00..445.00 rows=1 width=244)
> +
> +   
> +   How the planner determines the cardinality of tenk1
> +   is covered in , but is repeated here for
> +   completeness. The number of rows is looked up from 
> +   pg_class:
> +
> +
> +SELECT reltuples, relpages FROM pg_class WHERE relname = 'tenk1';
> +
> + relpages | reltuples
> +--+---
> +  345 | 1
> +
> +   The planner will check the relpages estimate 
> +   (this is a cheap operation) and if incorrect may scale 
> +   reltuples to obtain a row estimate. In this 
> case it
> +   does not, thus:
> +
> +
> +rows = 1
> +
> +
> +  
> +   
> +  
> +   let's move on to an example with a range condition in its 
> +   WHERE clause:
> +
> +
> +EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000;
> +
> + QUERY PLAN
> +
> + Seq Scan on tenk1  (cost=0.00..470.00 rows=1031 width=244)
> +   Filter: (unique1 < 1000)
> +
> +
> +   The planner examines the WHERE clause condition: 
> +
> +
> +unique1 < 1000
> +
> +
> +   and looks up the restriction function for the operator 
> +   < in pg_operator. 
> +   This is held in the column oprrest, 
> +   and the result in this case is scalarltsel.
> +   The scalarltsel function retrieves the histogram for 
> +   unique1 from 
> pg_statistics
> +   - we can follow this by using the simpler pg_stats 
> +   view:
> +
> +
> +SELECT histogram_bounds FROM pg_stats 
> +WHERE tablename='tenk1' AND attname='unique1';
> +
> +   histogram_bounds
> +--
> + {1,970,1943,2958,3971,5069,6028,7007,7919,8982,9995}
> +
> +
> +   Next the fraction of the histogram occupied by < 1000 
> +   is worked out. This is the selectivity. The histogram divides the range 
> +   into equal frequency buckets, so all we have to do is locate the bucket 
> +   that our value is in and count part of it and 
> +   all of the ones before. The value 1000 is clearly in 
> +   the second (970 - 1943) bucket, so by assuming a linear distribution of 
> +   values inside each bucket we can calculate the selectivity as:
> +
> +
> +selectivity = (1 + (1000 - bckt[2].min)/(bckt[2].max - 
> bckt[2].min))/num_bckts
> += (1 + (1000 - 970)/(1943 - 970))/10
> += 0.1031
> +
> +
> +   that is, one whole bucket plus a linear fraction of the se

Re: [DOCS] Instructions for Linux ipc config

2005-02-27 Thread Bruce Momjian


Patch applied.  Thanks.

---


Mark Kirkwood wrote:
> Attached are my first thoughts for the amended instructions.
> 
> Mark Kirkwood wrote:
> >
> > But to be on the safe side, it would make sense to do something similar
> > to the BSD section, and comment about older distributions maybe needing
> > to manipulate /proc/kernel/* directly.
> > 
> 
> 

> --- runtime.sgml.orig Fri Feb  4 22:17:45 2005
> +++ runtime.sgml  Fri Feb  4 22:17:57 2005
> @@ -4414,37 +4414,29 @@
>LinuxIPC configuration
>
> 
> -The default shared memory limit (both
> -SHMMAX and SHMALL) is 32
> -MB in 2.2 kernels, but it can be changed in the
> -proc file system (without reboot).  For
> -example, to allow 128 MB:
> +The default settings are only suitable for small installations 
> +(the default max segment size is 32 MB). However the remaining
> +defaults are quite generously sized, and usually do not require
> +changes. The max segment size can be changed via the 
> +sysctl interface. For example, to allow 128 MB, 
> +and explicitly set the maximum total shared memory size to 2097152 
> +pages (the default):
>  
> -$ echo 134217728 
> >/proc/sys/kernel/shmall
> -$ echo 134217728 
> >/proc/sys/kernel/shmmax
> +$ systcl -w kernel.shmmax=134217728
> +$ systcl -w kernel.shmall=2097152
>  
> -You could put these commands into a script run at boot-time.
> -   
> -
> -   
> -Alternatively, you can use sysctl, if
> -available, to control these parameters.  Look for a file
> -called /etc/sysctl.conf and add lines
> -like the following to it:
> -
> -kernel.shmall = 134217728
> -kernel.shmmax = 134217728
> -
> -This file is usually processed at boot time, but
> -sysctl can also be called
> -explicitly later.
> +In addition these settings can be saved between reboots in 
> +/etc/sysctl.conf.
> 
>  
> 
> -Other parameters are sufficiently sized for any application. If
> -you want to see for yourself look in
> -
> /usr/src/linux/include/asm-xxx/shmparam.h
> -and /usr/src/linux/include/linux/sem.h.
> +Older distributions may not have the sysctl 
> program,
> +but equivalent changes can be made by manipulating the 
> +/proc filesystem:
> +
> +$ echo 134217728 
> >/proc/sys/kernel/shmmax
> +$ echo 2097152 
> >/proc/sys/kernel/shmall
> +
> 
>
>   
> 

> 
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [email protected]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [DOCS] How the planner uses statistics

Re: [DOCS] Instructions for Linux ipc config

2 matches

Site Navigation

Mail list logo

Footer information