Re: [HACKERS] Estimation of HashJoin Cost

2012-11-01 Thread Kevin Grittner
Qi Huang wrote:

> I need to estimate the hashjoin cost in my research.

> I looked at the code of final_cost_hashjoin(). It is not clear
> which factors it considers. So, apart from the I/O to and from disk,
> what other factors affect the cost of a hashjoin?

http://www.postgresql.org/docs/9.2/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS

If you don't find it practical to read the code, you could run
EXPLAIN on a query which uses a hashjoin with different cost factors
(these can be changed for your current connection with the SET
command) and observe the cost numbers in the output.
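
A minimal sketch of that experiment, assuming two throwaway tables
named r and s joined on a column named id (all three names are
hypothetical, chosen only for illustration). The enable_* settings
are there only to steer the planner toward a hash join; the cost
constants are the ones documented at the link above.

```sql
-- Hypothetical test tables; any two joinable tables would do.
CREATE TEMP TABLE r AS SELECT g AS id FROM generate_series(1, 100000) g;
CREATE TEMP TABLE s AS SELECT g AS id FROM generate_series(1, 100000) g;
ANALYZE r;
ANALYZE s;

-- Discourage the other join methods for this session only.
SET enable_mergejoin = off;
SET enable_nestloop = off;

-- Baseline: note the cost numbers on the Hash Join node.
EXPLAIN SELECT * FROM r JOIN s ON r.id = s.id;

-- Change one cost factor (the default cpu_tuple_cost is 0.01) and compare.
SET cpu_tuple_cost = 0.02;
EXPLAIN SELECT * FROM r JOIN s ON r.id = s.id;

-- Restore the session defaults.
RESET cpu_tuple_cost;
RESET enable_mergejoin;
RESET enable_nestloop;
```

Changing one constant at a time (cpu_tuple_cost, cpu_operator_cost,
seq_page_cost, and so on) and comparing the two plans should show
how much each factor contributes to the estimate.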

> Also, is there any way to force postgres to abide by an estimated
> hashjoin cost of 3(R+S), that is, to make the hashjoin cost consist
> mainly of I/O?

How useful would that be for workloads where data is fully cached?

-Kevin




[HACKERS] Estimation of HashJoin Cost

2012-11-01 Thread Qi Huang
Hi, dear Hackers,

I need to estimate the hashjoin cost in my research. As the textbook
shows, it is 3(R+S), where R and S are the sizes of the two tables,
which realistically only considers the cost of I/O. But this is
obviously too theoretical. What is the correct way to estimate the
cost of a hashjoin?

I looked at the code of final_cost_hashjoin(). It is not clear which
factors it considers. So, apart from the I/O to and from disk, what
other factors affect the cost of a hashjoin?

Also, is there any way to force postgres to abide by an estimated
hashjoin cost of 3(R+S), that is, to make the hashjoin cost consist
mainly of I/O?

Thanks

Best Regards,
Huang Qi Victor
Computer Science of National University of Singapore