On Mon, Apr 15, 2024 at 12:41 PM Nazir Bilal Yavuz <byavu...@gmail.com> wrote: > I agree with you but I did not understand one thing. If out-of-core > AMs are used, does not all block sampling logic (BlockSampler_Init(), > BlockSampler_Next() etc.) need to be edited as well since these > functions assume block numbers are actual physical on-disk location, > right? I mean if the block number is something different than the > actual physical on-disk location, the acquire_sample_rows() function > looks wrong to me before c6fc50cb4028 as well.
Yes, this is also a problem with trying to use non-physical block numbers. We can hypothesize an AM where it works out OK in practice, say because there are always exactly the same number of logical block numbers as there are physical block numbers. Or, because there are always more logical block numbers than physical block numbers, but for some reason the table AM author doesn't care because they believe that in the target use case for their AM the data distribution will be sufficiently uniform that sampling only low-numbered blocks won't really hurt anything. But that does seem a bit strained. In practice, I suspect that table AMs that use logical block numbers might want to replace this line from acquire_sample_rows() with a call to a tableam method that returns the number of logical blocks: totalblocks = RelationGetNumberOfBlocks(onerel); But even that does not seem like enough, because my guess would be that a lot of table AMs would end up with a sparse logical block space. For instance, you might create a logical block number sequence that starts at 0 and just counts up towards 2^32 and eventually either wraps around or errors out. Each new tuple gets the next TID that isn't yet used. Well, what's going to happen eventually in a lot of workloads is that the low-numbered logical blocks are going to be mostly or entirely empty, and the data is going to be clustered in the ones that are nearer to the highest logical block number that's so far been assigned. So, then, as you say, you'd want to replace the whole BlockSampler thing entirely. That said, I find it a little bit hard to know what people are already doing or realistically might try to do with table AMs. If somebody says they have a table AM where the number of logical block numbers equals the number of physical block numbers (or is somewhat larger but in a way that doesn't really matter) and the existing block sampling logic works well enough, I can't really disprove that. It puts awfully tight limits on what the AM can be doing, but, OK, sometimes people want to develop AMs for very specific purposes. However, because of the prefetching thing, I think even that fairly narrow use case was already broken before 041b96802efa33d2bc9456f2ad946976b92b5ae1. So I just don't really see how that commit made anything worse in any way that really matters. But maybe it did. People often find extremely creative ways of working around the limitations of the core interfaces. I think it could be the case that someone found a clever way of dodging all of these problems and had something that was working well enough that they were happy with it, and now they can't make it work after the changes for some reason. If that someone is reading this thread and wants to spell that out, we can consider whether there's some relief that we could give to that person, *especially* if they can demonstrate that they raised the alarm before the commit went in. But in the absence of that, my current belief is that nonphysical block numbers were never a supported scenario; hence, the idea that 041b96802efa33d2bc9456f2ad946976b92b5ae1 should be reverted for de-supporting them ought to be rejected. -- Robert Haas EDB: http://www.enterprisedb.com