On Thu, Jan 23, 2014 at 8:22 PM, Heikki Linnakangas <hlinnakan...@vmware.com
> On 01/14/2014 05:35 PM, Alexander Korotkov wrote:
>> Attached version is rebased against last version of packed posting lists.
> I think we're missing a trick with multi-key queries. We know that when
> multiple scan keys are used, they are ANDed together, so we can do the skip
> optimization even without the new tri-state consistent function.
> To get started, I propose the three attached patches. These only implement
> the optimization for the multi-key case, which doesn't require any changes
> to the consistent functions and hence no catalog changes. Admittedly this
> isn't anywhere near as useful in practice as the single key case, but let's
> go for the low-hanging fruit first. This nevertheless introduces some
> machinery that will be needed by the full patch anyway.
> I structured the code somewhat differently than your patch. There is no
> separate fast-path for the case where the optimization applies. Instead,
> I'm passing the advancePast variable all the way down to where the next
> batch of items are loaded from the posting tree. keyGetItem is now
> responsible for advancing the entry streams, and the logic in scanGetItem
> has been refactored so that it advances advancePast aggressively, as soon
> as one of the key streams let us conclude that no items < a certain point
> can match.
> scanGetItem might yet need to be refactored when we get to the full
> preconsistent check stuff, but one step at a time.
> The first patch is the most interesting one, and contains the scanGetItem
> changes. The second patch allows seeking to the right segment in a posting
> tree page, and the third allows starting the posting tree scan from root,
> when skipping items (instead of just following the right-links).
> Here are some simple performance test results, demonstrating the effect of
> each of these patches. This is a best-case scenario. I don't think these
> patches has any adverse effects even in the worst-case scenario, although I
> haven't actually tried hard to measure that. The used this to create a test
> create table foo (intarr int);
> -- Every row contains 0 (frequent term), and a unique number.
> insert into foo select array[0,g] from generate_series(1, 10000000) g;
> -- Add another tuple with 0, 1 combo physically to the end of the table.
> insert into foo values (array[0,1]);
> The query I used is this:
> postgres=# select count(*) from foo where intarr @> array and intarr @>
> (1 row)
> I measured the time that query takes, and the number of pages hit, using
> "explain (analyze, buffers true) ...".
> patches time (ms) buffers
> unpatched 650 1316
> patch 1 0.52 1316
> patches 1+2 0.50 1316
> patches 1+2+3 0.13 15
> So, the second patch isn't doing much in this particular case. But it's
> trivial, and I think it will make a difference in other queries where you
> have the opportunity skip, but return a lot of tuples overall.
> In summary, these are fairly small patches, and useful on their, so I
> think these should be committed now. But please take a look and see if the
> logic in scanGetItem/keyGetItem looks correct to you. After this, I think
> the main fast scan logic will go into keyGetItem.
Good, thanks! Now I can reimplement fast scan basing on this patches.
PS. I find it a bit surprising that in your patch, you're completely
> bailing out if there are any partial-match keys involved. Is there some
> fundamental reason for that, or just not implemented?
Just not implemented. I think there is two possible approaches to handle it:
1) Handle partial-match keys like OR on matching keys.
2) Implement keyAdvancePast for bitmap.
First approach seems to be useful with low number of keys. Probably, we
should implement automatic switching between them.
With best regards,