HI Ninad Thanks for your reply. If bitmap index should not be used. Do i need to disable it for the time being and carry out the test.
The documentation in pg_similarity shows that index can be created on text column using gin with gin_similarity_ops. The same way the index is created like CREATE INDEX on address using GIN(complete_address gin_similarity_ops); AFAIK I have not seen any other operators other than gin in the pg_smilarity extension. Thanks C.R.Bala On Thu, Sep 2, 2021 at 8:17 PM Ninad Shah <nshah.postg...@gmail.com> wrote: > Hi Bala, > > Are your statistics updated? Also, have you used the gin > operator(gin_similarity_ops) correctly? > > It is fetching just 6 records out of a million, hence, it should not go > for bitmap index scan. As bitmap index scan loads a complete index, and > access relevant pages from the table later by bitmap heap scan. > > > Regards, > Ninad Shah > > > On Thu, 2 Sept 2021 at 16:39, balasubramanian c r <crbs.sie...@gmail.com> > wrote: > >> Hi Team >> >> We have encountered a problem in our testing environment. >> I have a scenario where I am running a similarity match for an address >> I have created a table with following number of records >> 1603423 >> >> We are using pg_similarity extension in postgresql version is 13. >> >> And I have created GIN index (since i am using pg_similarity) library >> jaccard similarity method >> when I run the Explain analyze >> EXPLAIN ANALYZE select complete_address, jaccard(complete_address, 'raj >> nagar ext near ajnara integrity up ghaziabad 201017') as qsim from address >> where complete_address~??'raj nagar ext near ajnara integrity up ghaziabad >> 201017' order by qsim DESC; >> QUERY PLAN >> >> Sort (cost=5856.13..5860.14 rows=1603 width=93) (actual >> time=12101.194..12101.197 rows=6 loops=1) >> >> Sort Key: (jaccard(complete_address, 'raj nagar ext near ajnara integrity >> up ghaziabad 201017'::text)) DESC >> Sort Method: quicksort Memory: 25kB >> -> Bitmap Heap Scan on address (cost=172.43..5770.80 rows=1603 width=93) >> (actual time=3516.233..12101.172 rows=6 loops=1) >> Recheck Cond: (complete_address ~?? 'raj nagar ext near ajnara integrity >> up ghaziabad 201017'::text) >> Rows Removed by Index Recheck: 1039186 >> Heap Blocks: exact=58018 >> -> Bitmap Index Scan on address_complete_address_idx1 (cost=0.00..172.02 >> rows=1603 width=0) (actual time=256.037..256.037 rows=1039192 loops=1) >> Index Cond: (complete_address ~?? 'raj nagar ext near ajnara integrity up >> ghaziabad 201017'::text) >> Planning Time: 0.141 ms >> Execution Time: 12101.245 ms >> (11 rows) >> it took 12 seconds >> following are my pgconf file settings >> shared buffer as 4GB >> work_mem 256 MB >> maintenence_work_mem 512MB >> autovacuum_work_mem 20MB >> My index definition is this "address_complete_address_idx1" gin >> (complete_address gin_similarity_ops) >> >> It is taking the index correctly. But why it took 12 seconds to process I >> really don't understand. >> >> Please help. >> >> Thanks >> C.R.Bala >> >