On Fri, Oct 6, 2017 at 7:24 PM, Dilip Kumar <dilipbal...@gmail.com> wrote:
> On Fri, Oct 6, 2017 at 6:08 PM, Alexander Kuzmenkov
> <a.kuzmen...@postgrespro.ru> wrote:
>>> Analysis: The estimated value of the lossy_pages is way higher than
>>> its actual value and reason is that the total_pages calculated by the
>>> "Mackert and Lohman formula" is not correct.
>> I think the problem might be that the total_pages includes cache effects and
>> rescans. For bitmap entries we should use something like relation pages *
>> selectivity.
> I have noticed that for the TPCH case if I use "pages * selectivity"
> it give me better results, but IMHO directly multiplying the pages
> with selectivity may not be the correct way to calculate the number of
> heap pages it can only give the correct result when all the TID being
> fetched are clustered.  But on the other hand "Mackert and Lohman
> formula" formulae consider that all the TID's are evenly distributed
> across the heap pages which can also give the wrong estimation like we
> are seeing in our TPCH case.

I agree with the point that the total_pages included the cache effects
and rescan when loop_count > 1, that can be avoided if we always
calculate heap_pages as it is calculated in the else part
(loop_count=0).  Fortunately, in all the TPCH query plan what I posted
up thread bitmap scan was never at the inner side of the NLJ so
loop_count was always 0.  I will fix this.

Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to