On 17 December 2014 at 05:25, Teodor Sigaev <teo...@sigaev.ru> wrote:
>> I've been having a look at this and I'm wondering about a certain
>> scenario:
>>
>> In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block,
>> and on the next iteration of the loop we have the same block again, have
>> you benchmarked any caching code to store if tbm_page_is_lossy() returned
>> true for that block on the previous iteration of the loop? This would
>> save having to call tbm_page_is_lossy() again for the same block. Or are
>> you just expecting that tbm_page_is_lossy() returns true so rarely that
>> you'll end up caching the page most of the time, and gain on skipping
>> both hash lookups on the next loop, since page will be set in this case?
>>
> I believe that if we hit lossy pages then the tidbitmap will not have a
> significant impact on performance, because postgres will spend a lot of
> time rechecking the tuples on each page. If work_mem is too small to keep
> an exact tidbitmap then postgres will slow down significantly anyway. I
> implemented it (v2.1 in attachs), but I don't think that it is an
> improvement, at least not a significant one.
>

You could well be right, but it would be good to compare the numbers just
so we know this for sure.
There seems to be a problem with the v2.1. The code does not look quite
right; I'd have expected something more like:

    if (lossy_page == blk)
        continue;       /* whole page is already marked */

    if (page == NULL || page->blockno != blk)

whereas you have the lossy_page check within the 2nd if test. I'd imagine
this will never be true, as page->blockno != blk, or perhaps it'll only be
true when tbm_lossify() is called and you set the page to NULL.

There's also something weird going on using your original test case:

postgres=# set work_mem = '256MB';
SET
Time: 0.450 ms
postgres=# select count(*) from t where i >= 0;
  count
---------
 5000000
(1 row)

Time: 1369.562 ms

That's ok.

postgres=# set work_mem = '64kB';
SET
Time: 0.425 ms
postgres=# select count(*) from t where i >= 0;
  count
---------
 4999998
(1 row)

Time: 892.726 ms

Notice the row counts are not the same.

create table t_result (i int not null);

postgres=# select a.a, a.b, b.a, b.b
           from (select i a, count(*) b from t group by i) a
           full outer join
                (select i a, count(*) b from t_result group by i) b
             on a.a = b.a
           where b.b <> a.b;
   a    | b  |   a    | b
--------+----+--------+---
 315744 | 10 | 315744 | 8
(1 row)

postgres=# select ctid from t where i = 315744;
    ctid
-------------
 (13970,220)
 (13970,221)
 (13970,222)
 (13970,223)
 (13970,224)
 (13970,225)
 (13970,226)
 (13971,1)
 (13971,2)
 (13971,3)
(10 rows)

This only seems to be broken in v2.1. Are you seeing the same?

>> It would be nice to see a comment to explain why it might be a good
>> idea to cache the page lookup. Perhaps something like:
>>
> added, see attachment (v2)
>

Thanks. That looks better. It would be good to compare the performance of
v2 and v2.1, but I think the problem with v2.1 needs to be fixed before
that can happen.

>
>> I also wondered if there might be a small slowdown in the case where the
>> index only finds 1 matching tuple.
>> So I tried the following:
>>
>> avg. 2372.4456   2381.909   99.6%
>> med. 2371.224    2359.494   100.5%
>>
>> It appears that if it does, then it's not very much.
>>
> I believe that's unmeasurable, because the standard deviation of your
> tests is about 2%, which is greater than the difference between the
> patched and master versions.
>

Agreed.

Regards

David Rowley