Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-09 Thread Grzegorz Jaskiewicz

It would be interesting to see some code here.
Maybe this would be a great opportunity to start some sort of 'auto-tune'
(as an option) in this area.



--
GJ

C/C++/SQL freelancer for hire.



Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-09 Thread Martijn van Oosterhout
On Thu, Mar 08, 2007 at 07:01:17PM -0500, Umar Farooq Minhas wrote:
> displayed, I want the CPU cost and I/O cost displayed separately when I run
> EXPLAIN on a particular query. So far I haven't been able to figure
> out a 'clean' way of doing this. Can anyone tell me how much time
> I should expect to spend making such a change, and where I should
> start? costsize.c?

That's going to be a lot of work. You need to duplicate the variable
and every usage of that variable. And I can't imagine why you'd be
interested anyway...
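
To make "duplicate the variable and every usage" concrete, here is a
hypothetical standalone sketch, not actual PostgreSQL source: the struct and
function names are invented, and the two constants are the stock defaults of
the GUC parameters mentioned in this thread.

#include <stdio.h>

/* Hypothetical sketch only -- not PostgreSQL code. The idea: wherever
 * costsize.c accumulates a single running Cost, carry a pair instead,
 * so EXPLAIN could print the two components separately. */
typedef double Cost;

typedef struct SplitCost
{
    Cost cpu_cost;              /* cpu_tuple_cost etc. charges */
    Cost io_cost;               /* seq_page_cost etc. charges  */
} SplitCost;

static void
cost_seqscan_split(SplitCost *c, double pages, double tuples)
{
    const Cost seq_page_cost = 1.0;     /* stock default */
    const Cost cpu_tuple_cost = 0.01;   /* stock default */

    c->io_cost += seq_page_cost * pages;
    c->cpu_cost += cpu_tuple_cost * tuples;
}

int main(void)
{
    SplitCost c = {0.0, 0.0};

    cost_seqscan_split(&c, 1000.0, 100000.0);
    printf("io=%.2f cpu=%.2f total=%.2f\n",
           c.io_cost, c.cpu_cost, c.io_cost + c.cpu_cost);
    return 0;
}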

> I have another question. Looking at the optimizer code, it pretty
> much looks insensitive to the memory factor. The only parameters
> being utilized are "effective_cache_size" (in estimating index
> cost only) and "work_mem" (for sorts, aggregation, groups, and
> hash/merge joins). Are these the only memory factors that DIRECTLY
> affect the cost estimates of the planner/optimizer?

Sure, what other factors were you considering?

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Umar Farooq Minhas
Thanks a lot for your replies. The suggestions have proved very useful.
Ayush, I'm curious to see your C program, thanks.

Here is a related but different issue. I started looking at the Postgres
optimizer/planner code a month back to modify it for the purposes of
experiments that I need to conduct. The EXPLAIN command prints the total cost,
i.e. CPU + I/O; however, for my purposes I need these two costs separated,
i.e. instead of getting one cost displayed, I want the CPU cost and I/O cost
displayed separately when I run EXPLAIN on a particular query. So far I
haven't been able to figure out a 'clean' way of doing this. Can anyone tell
me how much time I should expect to spend making such a change, and where I
should start? costsize.c?

I have another question. Looking at the optimizer code, it pretty much looks
insensitive to the memory factor. The only parameters being utilized are
"effective_cache_size" (in estimating index cost only) and "work_mem" (for
sorts, aggregation, groups, and hash/merge joins). Are these the only memory
factors that DIRECTLY affect the cost estimates of the planner/optimizer?

Again your help is appreciated.

-Umar
  - Original Message - 
  From: Luke Lonergan 
  To: Gregory Stark ; Tom Lane ; Ayush Parashar 
  Cc: Umar Farooq Minhas ; pgsql-hackers@postgresql.org 
  Sent: Thursday, March 08, 2007 2:16 PM
  Subject: Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch


  Adding to this:

  Ayush recently wrote a C program that emulates PG I/O to do this analysis,
and we came out with (predictably) a sequential/random ratio of 20-50 (for a
single user).  This is predictable because the random component is fixed at the
access time of a single hard drive no matter how many disks are in an array,
while sequential throughput scales nearly linearly with the number of drives
in the array.

  So, you can estimate random using 8-12 ms per random access, and sequential
as 1/(number of disks x 60-130 MB/s).

  Ayush, can you forward your C program?

  - Luke

  Msg is shrt cuz m on ma treo

   -Original Message-
  From:   Gregory Stark [mailto:[EMAIL PROTECTED]
  Sent:   Thursday, March 08, 2007 12:37 PM Eastern Standard Time
  To: Tom Lane
  Cc: Umar Farooq Minhas; pgsql-hackers@postgresql.org
  Subject:    Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch


  "Tom Lane" <[EMAIL PROTECTED]> writes:

  > "Umar Farooq Minhas" <[EMAIL PROTECTED]> writes:
  >> How can we accurately estimate the "seq_page_fetch" and
  >> "random_page_fetch" costs from outside Postgres using, for example, a
  >> C routine?
  >
  > Use a test case larger than memory.  Repeat many times to average out
  > noise.  IIRC, when I did the experiments that led to the current
  > random_page_cost of 4.0, it took about a week before I had numbers I
  > trusted.

  When I was running tests I did it on a filesystem where nothing else was
  running. Between tests I unmounted and remounted it. As I understand it,
  Linux associates the cache with the filesystem rather than the block device,
  and discards all pages from cache when the filesystem is unmounted.

  That doesn't contradict anything Tom said, but it might be useful as an
  additional tool.

  --
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com





Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Neil Conway
On Thu, 2007-03-08 at 17:35 +, Gregory Stark wrote:
> When I was running tests I did it on a filesystem where nothing else was
> running. Between tests I unmounted and remounted it. As I understand it,
> Linux associates the cache with the filesystem rather than the block device,
> and discards all pages from cache when the filesystem is unmounted.

On recent Linux kernels, /proc/sys/vm/drop_caches can also be useful:

http://linux.inet.hr/proc_sys_vm_drop_caches.html

You could also use posix_fadvise() to achieve a similar effect on a per-file
basis.
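
For the per-file approach, a minimal sketch; the path "datafile" is a
placeholder, and note that posix_fadvise() returns an error number rather
than setting errno:

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("datafile", O_RDONLY);
    int rc;

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Flush dirty pages first; DONTNEED only discards clean pages. */
    fdatasync(fd);
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
    close(fd);
    return 0;
}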

-Neil





Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Jim C. Nasby
On Thu, Mar 08, 2007 at 05:35:03PM +, Gregory Stark wrote:
> 
> "Tom Lane" <[EMAIL PROTECTED]> writes:
> 
> > "Umar Farooq Minhas" <[EMAIL PROTECTED]> writes:
> >> How can we accurately estimate the "seq_page_fetch" and
> >> "random_page_fetch" costs from outside Postgres using, for example, a
> >> C routine?
> >
> > Use a test case larger than memory.  Repeat many times to average out
> > noise.  IIRC, when I did the experiments that led to the current
> > random_page_cost of 4.0, it took about a week before I had numbers I
> > trusted.
> 
> When I was running tests I did it on a filesystem where nothing else was
> running. Between tests I unmounted and remounted it. As I understand it,
> Linux associates the cache with the filesystem rather than the block device,
> and discards all pages from cache when the filesystem is unmounted.
> 
> That doesn't contradict anything Tom said, but it might be useful as an
> additional tool.

Another trick I've used in the past is to just run the machine out of
memory, using the following:

/*
 * $Id: clearmem.c,v 1.1 2003/06/29 20:41:33 decibel Exp $
 *
 * Utility to allocate and zero a chunk of memory. Useful for flushing
 * disk buffers. Usage: clearmem <megabytes>
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    size_t mb = (argc > 1) ? atoi(argv[1]) : 0;
    char *p = calloc(mb, 1024 * 1024);
    if (p == NULL) {
        printf("Error allocating memory.\n");
        return 1;
    }
    /* Touch every page so the kernel actually commits the memory
     * and must evict cached data to make room. */
    memset(p, 1, mb * 1024 * 1024);
    return 0;
}

I'll monitor top while that's running to ensure that some stuff gets
swapped out to disk. I believe this might still leave some cached data
in other areas of the kernel, but it's probably not enough to worry
about.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Luke Lonergan
Adding to this:

Ayush recently wrote a C program that emulates PG I/O to do this analysis, and
we came out with (predictably) a sequential/random ratio of 20-50 (for a
single user).  This is predictable because the random component is fixed at the
access time of a single hard drive no matter how many disks are in an array,
while sequential throughput scales nearly linearly with the number of drives
in the array.

So, you can estimate random using 8-12 ms per random access, and sequential as
1/(number of disks x 60-130 MB/s).
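
As a back-of-the-envelope sketch of that rule of thumb (the numbers below are
illustrative assumptions, not measurements):

#include <stdio.h>

int main(void)
{
    double random_ms  = 10.0;   /* 8-12 ms per random access */
    double ndisks     = 1.0;    /* disks in the array */
    double mb_per_sec = 100.0;  /* 60-130 MB/s per disk */
    double page_kb    = 8.0;    /* Postgres page size */

    /* Time to stream one 8 kB page at the full sequential rate. */
    double seq_ms = page_kb / 1024.0 / (ndisks * mb_per_sec) * 1000.0;

    printf("seq %.3f ms/page, random %.1f ms/page, ratio %.0f\n",
           seq_ms, random_ms, random_ms / seq_ms);
    return 0;
}

With one disk at 100 MB/s this prints a naive ratio of roughly 130; measured
ratios such as the 20-50 above come out lower, since real sequential scans
rarely sustain the full streaming rate.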

Ayush, can you forward your C program?

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Gregory Stark [mailto:[EMAIL PROTECTED]
Sent:   Thursday, March 08, 2007 12:37 PM Eastern Standard Time
To: Tom Lane
Cc: Umar Farooq Minhas; pgsql-hackers@postgresql.org
Subject:Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch


"Tom Lane" <[EMAIL PROTECTED]> writes:

> "Umar Farooq Minhas" <[EMAIL PROTECTED]> writes:
>> How can we accurately estimate the "seq_page_fetch" and
>> "random_page_fetch" costs from outside Postgres using, for example, a
>> C routine?
>
> Use a test case larger than memory.  Repeat many times to average out
> noise.  IIRC, when I did the experiments that led to the current
> random_page_cost of 4.0, it took about a week before I had numbers I
> trusted.

When I was running tests I did it on a filesystem where nothing else was
running. Between tests I unmounted and remounted it. As I understand it,
Linux associates the cache with the filesystem rather than the block device,
and discards all pages from cache when the filesystem is unmounted.

That doesn't contradict anything Tom said, but it might be useful as an
additional tool.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> "Umar Farooq Minhas" <[EMAIL PROTECTED]> writes:
>> How can we accurately estimate the "seq_page_fetch" and
>> "random_page_fetch" costs from outside Postgres using, for example, a
>> C routine?
>
> Use a test case larger than memory.  Repeat many times to average out
> noise.  IIRC, when I did the experiments that led to the current
> random_page_cost of 4.0, it took about a week before I had numbers I
> trusted.

When I was running tests I did it on a filesystem where nothing else was
running. Between tests I unmounted and remounted it. As I understand it,
Linux associates the cache with the filesystem rather than the block device,
and discards all pages from cache when the filesystem is unmounted.

That doesn't contradict anything Tom said, but it might be useful as an
additional tool.
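
For what it's worth, the unmount/remount cycle can also be scripted from C; a
rough sketch, where the device, mount point, and filesystem type are
placeholders and root privileges are assumed:

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* Unmounting discards the filesystem's cached pages... */
    if (umount("/mnt/testfs") != 0) {
        perror("umount");
        return 1;
    }
    /* ...and remounting gives the next test run a cold cache. */
    if (mount("/dev/sdb1", "/mnt/testfs", "ext3", 0, NULL) != 0) {
        perror("mount");
        return 1;
    }
    return 0;
}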

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Tom Lane
"Umar Farooq Minhas" <[EMAIL PROTECTED]> writes:
> How can we accurately estimate the "seq_page_fetch" and
> "random_page_fetch" costs from outside Postgres using, for example, a
> C routine?

Use a test case larger than memory.  Repeat many times to average out
noise.  IIRC, when I did the experiments that led to the current
random_page_cost of 4.0, it took about a week before I had numbers I
trusted.

regards, tom lane



[HACKERS] Estimating seq_page_fetch and random_page_fetch

2007-03-08 Thread Umar Farooq Minhas
Hi,

How can we accurately estimate the "seq_page_fetch" and "random_page_fetch"
costs from outside Postgres using, for example, a C routine? Using a very
simple program in C, I have two sets of files. I want to access one set
randomly, i.e. pulling data from random locations within the files. The second
set of files is accessed sequentially. The goal here is to approximate the
disk I/O cost for a "random page fetch" and a "sequential page fetch"
respectively. I am using low-level (unbuffered) C routines, i.e. read/write
and lseek (for positioning the file pointer), the read/write buffer size is 8K
(to match the size of a Postgres page), and Linux is the host OS. Linux caches
heavily; for that very reason I am using sets of files instead of a single
file, in the hope that whenever a new file from a set is accessed for the
first time, it will NOT be in the OS cache, thus giving accurate results of
actually fetching the file pages from the physical disk. The host is also
restarted before running the experiment so as to force a cold-cache start.
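
For concreteness, a rough sketch of the measurement loop described above. The
file names and page count are placeholder assumptions, and it times a single
file per access pattern, whereas the real test cycles through fresh files to
defeat the cache:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define PAGE_SZ 8192            /* match the Postgres page size */
#define NPAGES  100000L         /* assumed pages per file (~800 MB) */

static double now_sec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    char buf[PAGE_SZ];
    double t0;
    long i;
    int fd;

    /* Sequential pass: read the file front to back in 8K chunks. */
    fd = open("seqfile", O_RDONLY);
    if (fd < 0) { perror("open seqfile"); return 1; }
    t0 = now_sec();
    for (i = 0; i < NPAGES; i++)
        if (read(fd, buf, PAGE_SZ) != PAGE_SZ)
            break;
    printf("sequential: %.1f us/page\n", (now_sec() - t0) / (i ? i : 1) * 1e6);
    close(fd);

    /* Random pass: seek to a random 8K page before every read. */
    fd = open("randfile", O_RDONLY);
    if (fd < 0) { perror("open randfile"); return 1; }
    srandom(42);
    t0 = now_sec();
    for (i = 0; i < NPAGES; i++) {
        lseek(fd, (off_t)(random() % NPAGES) * PAGE_SZ, SEEK_SET);
        if (read(fd, buf, PAGE_SZ) != PAGE_SZ)
            break;
    }
    printf("random:     %.1f us/page\n", (now_sec() - t0) / (i ? i : 1) * 1e6);
    close(fd);
    return 0;
}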

I am hoping somebody could point me in the right direction.

Thanks

-Umar