Hi:

> Different platforms would be good. Certainly, 1 platform isn't a good
> enough indication that this is going to be useful.

I happen to have a different platform at hand. Here is my test on an
Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz. shared_buffers was set large
enough to hold all the data.

columns   Master    Patched   Improvement (%)
a         310.931   289.251   6.972608071
a2        329.577   299.975   8.981816085
a3        336.887   313.502   6.941496704
a4        352.099   325.345   7.598431123
a5        358.582   336.486   6.162049406
a6        375.004   349.12    6.902326375
a7        379.699   362.998   4.398484062
a8        391.911   371.41    5.231034597
a9        404.3     383.779   5.075686372
a10       425.48    396.114   6.901852026
a11       449.944   431.826   4.026723326
a12       461.876   443.579   3.961452857
a13       470.59    460.237   2.20000425
a14       483.332   467.078   3.362905829
a15       490.798   472.262   3.776706507
a16       503.321   484.322   3.774728255

In theory, why does the prefetch make things better? I am asking
because I thought we need to read the data from the buffer into a cache
line once in either case. (I'm obviously wrong, given the test results.)

Another minor point: the styles below are equivalent, but format 3
looks clearer than the others to me. It tells the code reader more.
Just FYI.

  1. pg_prefetch_mem(PageGetItem((Page) dp, lpp));
  2. pg_prefetch_mem(tuple->t_data);
  3. pg_prefetch_mem(scan->rs_ctup.t_data);

-- 
Best Regards
Andy Fan