Re: [HACKERS] POC, WIP: OR-clause support for indexes
On 3/25/16 11:13 AM, David Steele wrote: Time is growing short and there seem to be some serious concerns with this patch. Can you provide a new patch soon? If not, I think it might be be time to mark this "returned with feedback". I have marked this patch "returned with feedback". Please feel free to resubmit for 9.7! Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
Hi Teador, On 3/19/16 8:44 PM, Tomas Vondra wrote: Sadly the v4 does not work for me - I do get assertion failures. Time is growing short and there seem to be some serious concerns with this patch. Can you provide a new patch soon? If not, I think it might be be time to mark this "returned with feedback". Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
Hi Teodor, Sadly the v4 does not work for me - I do get assertion failures. For example with the example Andreas Karlsson posted in this thread: CREATE EXTENSION btree_gin; CREATE TABLE test (a int, b int, c int); CREATE INDEX ON test USING gin (a, b, c); INSERT INTO test SELECT i % 7, i % 9, i % 11 FROM generate_series(1, 100) i; EXPLAIN ANALYZE SELECT * FROM test WHERE (a = 3 OR b = 5) AND c = 2; It seems working, but only until I run ANALYZE on the table. Once I do that, I start getting crashes at this line *qualcols = list_concat(*qualcols, list_copy(idx_path->indexqualcols)); in convert_bitmap_path_to_index_clause. Apparently one of the lists is T_List while the other one is T_IntList, so list_concat() errors out. My guess is that the T_BitmapOrPath branch should do oredqualcols = list_concat(oredqualcols, li_qualcols); ... *qualcols = list_concat(qualcols, oredqualcols); instead of oredqualcols = lappend(oredqualcols, li_qualcols); ... *qualcols = lappend(*qualcols, oredqualcols); but once I fixed that I got some other assert failures further down, that I haven't tried to fix. So the patch seems to be broken, and I suspect this might be related to the broken index condition reported by Andreas (although I don't see that - I either see correct condition or assertion failures). On 03/17/2016 06:19 PM, Teodor Sigaev wrote: ... 7) I find it rather ugly that the paths are built by converting BitmapOr paths. Firstly, it means indexes without amgetbitmap can't benefit from this change. Maybe that's reasonable limitation, though? I based on following thoughts: 1 code which tries to find OR-index path will be very similar to existing generate_or_bitmap code. Obviously, it should not be duplicated. 2 all existsing indexes have amgetbitmap method, only a few don't. amgetbitmap interface is simpler. Anyway, I can add an option for generate_or_bitmap to use any index, but, in current state it will just repeat all work. I agree that the code should not be duplicated, but is this really a good solution. Perhaps a refactoring that'd allow sharing most of the code would be more appropriate. But more importantly, this design already has a bunch of unintended consequences. For example, the current code completely ignores enable_indexscan setting, because it merely copies the costs from the bitmap path. > I'd like to add separate enable_indexorscan That may be useful, but why shouldn't enable_indexscan=off also disable indexorscan? I would find it rather surprising if after setting enable_indexscan=off I'd still get index scans for OR-clauses. That's pretty dubious, I guess. So this code probably needs to be made aware of enable_indexscan - right now it entirely ignores startup_cost in convert_bitmap_path_to_index_clause(). But of course if there are multiple IndexPaths, the enable_indexscan=off will be included multiple times. ... and it does not address this at all. I really doubt a costing derived from the bitmap index scan nodes will make much sense - you essentially need to revert unknown parts of the costing to only include building the bitmap once, etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* Not yet, but will The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. I splitted to two: 1 0001-idx_or_core - only planner and executor changes 2 0002-idx_or_indexes - BRIN/GIN/GiST changes with tests I don't think that splitting of second patch adds readability but increase management diffculties, but if your insist I will split. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. added -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ 0001-idx_or_core-v4.patch.gz Description: GNU Zip compressed data 0002-idx_or_indexes-v4.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* Oops, will add shortly. The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. Ok, will do it. 1) fields in BrinOpaque are not following the naming convention (all the existing fields start with bo_) fixed 2) there's plenty of places violating the usual code style (e.g. for single-command if branches) - not a big deal for WIP patch, but needs to get fixed eventually hope, fixed 3) I wonder whether we really need both SK_OR and SK_AND, considering they are mutually exclusive. Why not to assume SK_AND by default, and only use SK_OR? If we really need them, perhaps an assert making sure they are not set at the same time would be appropriate. In short: possible ambiguity and increasing stack machine complexity. Let we have follow expression in reversed polish notation (letters represent a condtion, | - OR, & - AND logical operation, ANDs are omitted): a b c | Is it ((a & b)| c) or (a & (b | c)) ? Also, using both SK_ makes code more readable. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. Will add soon 5) scanGetItem() may end up using uninitialized 'cmp' - it only gets initialized when (!leftFinished && !rightFinished), but then gets used when either part of the condition evaluates to true. Probably should be if (!leftFinished || !rightFinished) cmp = ... fixed 6) the code in nodeIndexscan.c should not include call to abort() { abort(); elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } fixed, just forgot to remove 7) I find it rather ugly that the paths are built by converting BitmapOr paths. Firstly, it means indexes without amgetbitmap can't benefit from this change. Maybe that's reasonable limitation, though? I based on following thoughts: 1 code which tries to find OR-index path will be very similar to existing generate_or_bitmap code. Obviously, it should not be duplicated. 2 all existsing indexes have amgetbitmap method, only a few don't. amgetbitmap interface is simpler. Anyway, I can add an option for generate_or_bitmap to use any index, but, in current state it will just repeat all work. But more importantly, this design already has a bunch of unintended consequences. For example, the current code completely ignores enable_indexscan setting, because it merely copies the costs from the bitmap path. I'd like to add separate enable_indexorscan That's pretty dubious, I guess. So this code probably needs to be made aware of enable_indexscan - right now it entirely ignores startup_cost in convert_bitmap_path_to_index_clause(). But of course if there are multiple IndexPaths, the enable_indexscan=off will be included multiple times. 9) This already breaks estimation for some reason. Consider this ... So the OR-clause is estimated to match 0 rows, less than each of the clauses independently. Needless to say that without the patch this works just fine. fixed 10) Also, this already breaks some regression tests, apparently because it changes how 'width' is computed. fixed too So I think this way of building the index path from a BitmapOr path is pretty much a dead-end. I don't think so because separate code path to support OR-clause in index will significanlty duplicate BitmapOr generator. Will send next version as soon as possible. Thank you for your attention! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ index_or-3.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I gave this patch a quick spin and noticed a strange query plan. CREATE TABLE test (a int, b int, c int); CREATE INDEX ON test USING gin (a, b, c); INSERT INTO test SELECT i % 7, i % 9, i % 11 FROM generate_series(1, 100) i; EXPLAIN ANALYZE SELECT * FROM test WHERE (a = 3 OR b = 5) AND c = 2; QUERY PLAN -- Bitmap Heap Scan on test (cost=829.45..4892.10 rows=21819 width=12) (actual time=66.494..76.234 rows=21645 loops=1) Recheck Cond: a = 3) AND (c = 2)) OR ((b = 5) AND (c = 2))) AND (c = 2)) Heap Blocks: exact=5406 -> Bitmap Index Scan on test_a_b_c_idx (cost=0.00..824.00 rows=2100 width=0) (actual time=65.272..65.272 rows=21645 loops=1) Index Cond: a = 3) AND (c = 2)) OR ((b = 5) AND (c = 2))) AND (c = 2)) Planning time: 0.200 ms Execution time: 77.206 ms (7 rows) Shouldn't the index condition just be "((a = 3) AND (c = 2)) OR ((b = 5) AND (c = 2))"? Also when applying and reading the patch I noticed some minor issues/nitpick. - I get whitespace warnings from git apply when I apply the patches. - You have any insconstent style for casts: I think "(Node*)clause" should be "(Node *) clause". - Same with pointers. "List* quals" should be "List *quals" - I am personally not a fan of seeing the "isorderby == false && index->rd_amroutine->amcanorclause" clause twice. Feels like a risk for diverging code paths. But it could be that there is no clean alternative. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
Hi Teodor, I've looked into v2 of the patch you sent a few days ago. Firstly, I definitely agree that being able to use OR conditions with an index is definitely a cool idea. I do however agree with David that the patch would definitely benefit from comments documenting various bits that are less obvious to mere mortals like me, with limited knowledge of the index internals. I also wonder whether the patch should add explanation of OR-clauses handling into the READMEs in src/backend/access/* The patch would probably benefit from transforming it into a patch series - one patch for the infrastructure shared by all the indexes, then one patch per index type. That should make it easier to review, and I seriously doubt we'd want to commit this in one huge chunk anyway. Now, some review comments from eyeballing the patch. Some of those are nitpicking, but well ... 1) fields in BrinOpaque are not following the naming convention (all the existing fields start with bo_) 2) there's plenty of places violating the usual code style (e.g. for single-command if branches) - not a big deal for WIP patch, but needs to get fixed eventually 3) I wonder whether we really need both SK_OR and SK_AND, considering they are mutually exclusive. Why not to assume SK_AND by default, and only use SK_OR? If we really need them, perhaps an assert making sure they are not set at the same time would be appropriate. 4) scanGetItem is a prime example of the "badly needs comments" issue, particularly because the previous version of the function actually had quite a lot of them while the new function has none. 5) scanGetItem() may end up using uninitialized 'cmp' - it only gets initialized when (!leftFinished && !rightFinished), but then gets used when either part of the condition evaluates to true. Probably should be if (!leftFinished || !rightFinished) cmp = ... 6) the code in nodeIndexscan.c should not include call to abort() { abort(); elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } 7) I find it rather ugly that the paths are built by converting BitmapOr paths. Firstly, it means indexes without amgetbitmap can't benefit from this change. Maybe that's reasonable limitation, though? But more importantly, this design already has a bunch of unintended consequences. For example, the current code completely ignores enable_indexscan setting, because it merely copies the costs from the bitmap path. SET enable_indexscan = off; EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1] OR c && ARRAY[2]); QUERY PLAN --- Index Scan using t_c_idx on t (cost=0.00..4.29 rows=0 width=33) Index Cond: ((c && '{1}'::integer[]) OR (c && '{2}'::integer[])) (2 rows) That's pretty dubious, I guess. So this code probably needs to be made aware of enable_indexscan - right now it entirely ignores startup_cost in convert_bitmap_path_to_index_clause(). But of course if there are multiple IndexPaths, the enable_indexscan=off will be included multiple times. 9) This already breaks estimation for some reason. Consider this example, using a table with int[] column, with gist index built using intarray: EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1,2,3,4,5,6,7]); QUERY PLAN Index Scan using t_c_idx on t (cost=0.28..52.48 rows=12 width=33) Index Cond: (c && '{1,2,3,4,5,6,7}'::integer[]) (2 rows) EXPLAIN SELECT * FROM t WHERE (c && ARRAY[8,9,10,11,12,13,14]); QUERY PLAN Index Scan using t_c_idx on t (cost=0.28..44.45 rows=10 width=33) Index Cond: (c && '{8,9,10,11,12,13,14}'::integer[]) (2 rows) EXPLAIN SELECT * FROM t WHERE (c && ARRAY[1,2,3,4,5,6,7]) OR (c && ARRAY[8,9,10,11,12,13,14]); QUERY PLAN Index Scan using t_c_idx on t (cost=0.00..4.37 rows=0 width=33) Index Cond: ((c && '{1,2,3,4,5,6,7}'::integer[]) OR (c && '{8,9,10,11,12,13,14}'::integer[])) (2 rows) So the OR-clause is estimated to match 0 rows, less than each of the clauses independently. Needless to say that without the patch this works just fine. 10) Also, this already breaks some regression tests, apparently because it changes how 'width' is computed. So I think this way of building the index path from a BitmapOr path is pretty much a dead-end. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
Thank you for review! I'd like to see comments too! but more so in the code. :) I've had a look over this, and it seems like a great area in which we could improve on, and your reported performance improvements are certainly very interesting too. However I'm finding the code rather hard to follow, which might be a combination of my lack of familiarity with the index code, but more likely it's the lack of I've added comments, fixed a found bugs. comments to explain what's going on. Let's just take 1 function as an example: Here there's not a single comment, so I'm just going to try to work out what's going on based on the code. +static void +compileScanKeys(IndexScanDesc scan) +{ +GISTScanOpaqueso = (GISTScanOpaque) scan->opaque; +int*stack, +stackPos = -1, +i; + +if (scan->numberOfKeys <= 1 || so->useExec == false) +return; + +Assert(scan->numberOfKeys >=3); Why can numberOfKeys never be 2? I looked at what calls this and I can't really Because here they are actually an expression, expression could contain 1 or tree or more nodes but could not two (operation AND/OR plus two arguments) work it out. I'm really also not sure what useExec means as there's no comment fixed. If useExec == false then SkanKeys are implicitly ANDed and stored in just array. in that struct member, and what if numberOfKeys == 1 and useExec == false, won't this Assert() fail? If that's not a possible situation then why not? fixed +ScanKey key = scan->keyData + i; Is there a reason not to use keyData[i]; ? That's the same ScanKey key = >keyData[i]; I prefer first form as more clear but I could be wrong - but there are other places in code where pointer arithmetic is used. +if (stackPos >= 0 && (key->sk_flags & (SK_OR | SK_AND))) +{ +Assert(stackPos >= 1 && stackPos < scan->numberOfKeys); stackPos >= 1? This seems unnecessary and confusing as the if test surely makes that impossible. + +so->leftArgs[i] = stack[stackPos - 1]; Something is broken here as stackPos can be 0 (going by the if() not the Assert()), therefore that's stack[-1]. fixed stackPos is initialised to -1, so this appears to always skip the first element of the keyData array. If that's really the intention, then wouldn't it be better to just make the initial condition of the for() look i = 1 ? done I'd like to review more, but it feels like a job that's more difficult than it needs to be due to lack of comments. Would it be possible to update the patch to try and explain things a little better? Hope, I made cleaner.. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ index_or-2.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
I think this is very exciting stuff, but since you didn't submit an updated patch after David's review, I'm closing it for now as returned-with-feedback. Please submit a new version once you have it. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
On 27 December 2015 at 07:04, Teodor Sigaevwrote: > I'd like to present OR-clause support for indexes. Although OR-clauses > could be supported by bitmapOR index scan it isn't very effective and such > scan lost any order existing in index. We (with Alexander Korotkov) > presented results on Vienna's conference this year. In short, it provides > performance improvement: > > EXPLAIN ANALYZE > SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000; > me=0.080..0.267 rows=173 loops=1) > Recheck Cond: ((id = 5) OR (id = 500) OR (id = 5000)) > Heap Blocks: exact=172 > -> Bitmap Index Scan on idx_gin (cost=0.00..57.50 rows=15000 > width=0) (actual time=0.059..0.059 rows=147 loops=1) >Index Cond: ((id = 5) OR (id = 500) OR (id = 5000)) > Planning time: 0.077 ms > Execution time: 0.308 ms <--- > QUERY PLAN > > --- > Aggregate (cost=51180.53..51180.54 rows=1 width=0) (actual > time=796.766..796.766 rows=1 loops=1) >-> Index Only Scan using idx_btree on tst (cost=0.42..51180.40 > rows=55 width=0) (actual time=0.444..796.736 rows=173 loops=1) > Filter: ((id = 5) OR (id = 500) OR (id = 5000)) > Rows Removed by Filter: 999829 > Heap Fetches: 102 > Planning time: 0.087 ms > Execution time: 796.798 ms <-- > QUERY PLAN > > - > Aggregate (cost=21925.63..21925.64 rows=1 width=0) (actual > time=160.412..160.412 rows=1 loops=1) >-> Seq Scan on tst (cost=0.00..21925.03 rows=237 width=0) (actual > time=0.535..160.362 rows=175 loops=1) > Filter: ((id = 5) OR (id = 500) OR (id = 5000)) > Rows Removed by Filter: 999827 > Planning time: 0.459 ms > Execution time: 160.451 ms > > > It also could work together with KNN feature of GiST and in this case > performance improvement could be up to several orders of magnitude, in > artificial example it was 37000 times faster. > > Not all indexes can support oR-clause, patch adds support to GIN, GiST > and BRIN indexes. pg_am table is extended for adding amcanorclause column > which indicates possibility of executing of OR-clause by index. > > indexqual and indexqualorig doesn't contain implicitly-ANDed list of > index qual expressions, now that lists could contain OR RestrictionInfo. > Actually, the patch just tries to convert BitmapOr node to IndexScan or > IndexOnlyScan. Thats significantly simplifies logic to find possible > clause's list for index. > Index always gets a array of ScanKey but for indexes which support > OR-clauses > array of ScanKey is actually exection tree in reversed polish notation > form. Transformation is done in ExecInitIndexScan(). > > The problems on the way which I see for now: > 1 Calculating cost. Right now it's just a simple transformation of costs > computed for BitmapOr path. I'd like to hope that's possible and so index's > estimation function could be non-touched. So, they could believe that all > clauses are implicitly-ANDed > 2 I'd like to add such support to btree but it seems that it should be a > separated patch. Btree search algorithm doesn't use any kind of stack of > pages and algorithm to walk over btree doesn't clear for me for now. > 3 I could miss some places which still assumes implicitly-ANDed list of > clauses although regression tests passes fine. > > Hope, hackers will not have an strong objections to do that. But obviously > patch > requires further work and I'd like to see comments, suggestions and > recommendations. Thank you. Hi, I'd like to see comments too! but more so in the code. :) I've had a look over this, and it seems like a great area in which we could improve on, and your reported performance improvements are certainly very interesting too. However I'm finding the code rather hard to follow, which might be a combination of my lack of familiarity with the index code, but more likely it's the lack of comments to explain what's going on. Let's just take 1 function as an example: Here there's not a single comment, so I'm just going to try to work out what's going on based on the code. +static void +compileScanKeys(IndexScanDesc scan) +{ + GISTScanOpaque so = (GISTScanOpaque) scan->opaque; + int *stack, + stackPos = -1, + i; + + if (scan->numberOfKeys <= 1 || so->useExec == false) + return; + + Assert(scan->numberOfKeys >=3); Why can numberOfKeys never be 2? I looked at what calls this and I can't really work it out. I'm really also not sure what useExec means as there's no comment in that struct member, and what if numberOfKeys == 1 and useExec == false, won't this Assert() fail? If that's not a possible situation then why not?
Re: [HACKERS] POC, WIP: OR-clause support for indexes
This is great. I got a question, is it possible make btree index to support OR as well? Is btree supports more invasive, in the sense that we need to do enhance ScanKey to supports an array of values? Btree now works by follow: find the max/min tuple which satisfies condtions and then executes forward/backward scan over leaf pages. For complicated clauses it's not obvious how to find min/max tuple. Scanning whole index isn't an option from preformance point of view. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POC, WIP: OR-clause support for indexes
Hi, Teodor, This is great. I got a question, is it possible make btree index to support OR as well? Is btree supports more invasive, in the sense that we need to do enhance ScanKey to supports an array of values? Thanks, Feng On Sat, Dec 26, 2015 at 10:04 AM, Teodor Sigaevwrote: > I'd like to present OR-clause support for indexes. Although OR-clauses > could be supported by bitmapOR index scan it isn't very effective and such > scan lost any order existing in index. We (with Alexander Korotkov) > presented results on Vienna's conference this year. In short, it provides > performance improvement: > > EXPLAIN ANALYZE > SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000; > me=0.080..0.267 rows=173 loops=1) > Recheck Cond: ((id = 5) OR (id = 500) OR (id = 5000)) > Heap Blocks: exact=172 > -> Bitmap Index Scan on idx_gin (cost=0.00..57.50 rows=15000 > width=0) (actual time=0.059..0.059 rows=147 loops=1) >Index Cond: ((id = 5) OR (id = 500) OR (id = 5000)) > Planning time: 0.077 ms > Execution time: 0.308 ms <--- > QUERY PLAN > > --- > Aggregate (cost=51180.53..51180.54 rows=1 width=0) (actual > time=796.766..796.766 rows=1 loops=1) >-> Index Only Scan using idx_btree on tst (cost=0.42..51180.40 > rows=55 width=0) (actual time=0.444..796.736 rows=173 loops=1) > Filter: ((id = 5) OR (id = 500) OR (id = 5000)) > Rows Removed by Filter: 999829 > Heap Fetches: 102 > Planning time: 0.087 ms > Execution time: 796.798 ms <-- > QUERY PLAN > > - > Aggregate (cost=21925.63..21925.64 rows=1 width=0) (actual > time=160.412..160.412 rows=1 loops=1) >-> Seq Scan on tst (cost=0.00..21925.03 rows=237 width=0) (actual > time=0.535..160.362 rows=175 loops=1) > Filter: ((id = 5) OR (id = 500) OR (id = 5000)) > Rows Removed by Filter: 999827 > Planning time: 0.459 ms > Execution time: 160.451 ms > > > It also could work together with KNN feature of GiST and in this case > performance improvement could be up to several orders of magnitude, in > artificial example it was 37000 times faster. > > Not all indexes can support oR-clause, patch adds support to GIN, GiST > and BRIN indexes. pg_am table is extended for adding amcanorclause column > which indicates possibility of executing of OR-clause by index. > > indexqual and indexqualorig doesn't contain implicitly-ANDed list of > index qual expressions, now that lists could contain OR RestrictionInfo. > Actually, the patch just tries to convert BitmapOr node to IndexScan or > IndexOnlyScan. Thats significantly simplifies logic to find possible > clause's list for index. > Index always gets a array of ScanKey but for indexes which support > OR-clauses > array of ScanKey is actually exection tree in reversed polish notation > form. Transformation is done in ExecInitIndexScan(). > > The problems on the way which I see for now: > 1 Calculating cost. Right now it's just a simple transformation of costs > computed for BitmapOr path. I'd like to hope that's possible and so index's > estimation function could be non-touched. So, they could believe that all > clauses are implicitly-ANDed > 2 I'd like to add such support to btree but it seems that it should be a > separated patch. Btree search algorithm doesn't use any kind of stack of > pages and algorithm to walk over btree doesn't clear for me for now. > 3 I could miss some places which still assumes implicitly-ANDed list of > clauses although regression tests passes fine. > > Hope, hackers will not have an strong objections to do that. But obviously > patch > requires further work and I'd like to see comments, suggestions and > recommendations. Thank you. > > > -- > Teodor Sigaev E-mail: teo...@sigaev.ru >WWW: > http://www.sigaev.ru/ > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > >