On Wed, Mar 12, 2014 at 8:29 PM, Heikki Linnakangas <hlinnakan...@vmware.com
> On 03/12/2014 12:09 AM, Tomas Vondra wrote:
>> Hi all,
>> a quick question that just occured to me - do you plan to tweak the cost
>> estimation fot GIN indexes, in this patch?
>> IMHO it would be appropriate, given the improvements and gains, but it
>> seems to me gincostestimate() was not touched by this patch.
> Good point. We have done two major changes to GIN in this release cycle:
> changed the data page format and made it possible to skip items without
> fetching all the keys ("fast scan"). gincostestimate doesn't know about
> either change.
> Adjusting gincostestimate for the more compact data page format seems
> easy. When I hacked on that, I assumed all along that gincostestimate
> doesn't need to be changed as the index will just be smaller, which will be
> taken into account automatically. But now that I look at gincostestimate,
> it assumes that the size of one item on a posting tree page is a constant 6
> bytes (SizeOfIptrData), which is no longer true. I'll go fix that.
> Adjusting for the effects of skipping is harder. gincostestimate needs to
> do the same preparation steps as startScanKey: sort the query keys by
> frequency, and call consistent function to split the keys intao "required"
> and "additional" sets. And then model that the "additional" entries only
> need to be fetched when the other keys match. That's doable in principle,
> but requires a bunch of extra code.
> Alexander, any thoughts on that? It's getting awfully late to add new code
> for that, but it sure would be nice somehow take fast scan into account.
Preparation we do in startScanKey requires knowledge of estimate size of
posting lists/trees. We do this estimate by traversal to leaf pages. I
think gincostestimate is expected to be way more cheap. So, we probably
need so more rough estimate there, don't we?
With best regards,