> > Maybe there's some hybrid type possible where you can scan the index to
> > find large table regions that are known /not/ to contain tuples of interest
> > and seek over them in your scan. I wouldn't know, really, but it sounds like
> > it'd probably be more I/O than a pure seq scan (given the reading of the
> > index too) unless the table had the values of interest rather neatly
> > clustered. It'd also surely use more memory and CPU time processing the
> > whole index to find table regions without values of interest.
> > Is that what you meant, though?

Not exactly. I mean the following: there are cases where an index scan, even
over non-clustered values, is a complete win (basically, it is a win whenever
the number of returned rows is relatively small, whether due to selectivity
or due to a LIMIT clause).

The test case that I provided creates a 667-page table and a 30-page index,
so a complete scan of the index is 22 times cheaper in terms of I/O. Suppose
you want to find all the values that match '%123%'. Currently PostgreSQL will
do a seq scan, while the better option might be (and it is) to loop through
all the items in the index (costing 30 I/O), find the entries that truly
contain '123' (it will find 20 of them), and do 20 I/O to check tuple
visibility. That is 50 I/O versus 667 for the seq scan.

> A b-tree index cannot be used on a LIKE query with a leading wildcard. See
> the FAQ.

Unfortunately, it is true. I would love to improve that particular case.

> In addition, if your database is not in the C locale you can't use an
> ordinary index for LIKE queries. See the FAQ. You need to create a
> text_pattern_ops index instead:
>
> create index i_ix_txt on seq_test(i text_pattern_ops);

Good catch. However, that does not change the results: PostgreSQL does the
same 2529 I/O for the index scan on '%123%', for some unknown reason.

> > set enable_seqscan=off
> > -- Index Scan reads 2529 pages for some reason. I would expect *30* (index
> > size) + *20* (number of matching entries) = 50 pages maximum, that is 10
> > times better than with seq scan.
> > Index Scan using i_ix on seq_test (cost=0.00..1643.74 rows=356 width=508)
> > (actual time=0.334..16.746 rows=20 loops=1 read_shared=2529(2529)
> > read_local=0(0) flush=0 local_flush=0 file_read=0 file_write=0)
> > Filter: (i ~~ '%123%'::text)
> > Total runtime: 16.863 ms
>
> I think it's reading the whole index, because it can't do a prefix search
> if there's a leading wildcard. I'm a bit confused, though, since I thought
> in this case it couldn't actually execute the query w/o a sequential scan,
> and would just use one irrespective of the enable_seqscan param. That's what
> happens here.

Please follow the case carefully: the index is only 30 pages long. Why is
PostgreSQL doing 2529 I/O? It drives me crazy.

Regards,
Vladimir Sitnikov
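
P.S. In case anyone wants to poke at this, below is a rough sketch of the
kind of test case I mean. The object names (seq_test, i, i_ix) match the
plan above; the data-generating expression and row count are placeholders
of my own and will not reproduce the exact page counts:

  create table seq_test (i text);
  -- ~2500 rows of fixed-width strings as a stand-in for the real data
  insert into seq_test
      select lpad(n::text, 500, '0') from generate_series(1, 2500) as n;
  create index i_ix on seq_test (i);
  vacuum analyze seq_test;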
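
The table and index sizes I keep quoting can be checked straight from the
catalog (relpages is kept up to date by VACUUM and ANALYZE):

  select relname, relpages from pg_class where relname in ('seq_test', 'i_ix');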