Try the pg_tgrm extension. It is a rich set of operators.
Regards, Ninad Shah On Thu, 2 Sept 2021 at 23:39, balasubramanian c r <crbs.sie...@gmail.com> wrote: > HI Ninad > > Thanks for your reply. > If bitmap index should not be used. Do i need to disable it for the time > being and carry out the test. > > The documentation in pg_similarity shows that index can be created on text > column using gin with gin_similarity_ops. > The same way the index is created like > CREATE INDEX on address using GIN(complete_address gin_similarity_ops); > > AFAIK I have not seen any other operators other than gin in the > pg_smilarity extension. > > Thanks > C.R.Bala > > On Thu, Sep 2, 2021 at 8:17 PM Ninad Shah <nshah.postg...@gmail.com> > wrote: > >> Hi Bala, >> >> Are your statistics updated? Also, have you used the gin >> operator(gin_similarity_ops) correctly? >> >> It is fetching just 6 records out of a million, hence, it should not go >> for bitmap index scan. As bitmap index scan loads a complete index, and >> access relevant pages from the table later by bitmap heap scan. >> >> >> Regards, >> Ninad Shah >> >> >> On Thu, 2 Sept 2021 at 16:39, balasubramanian c r <crbs.sie...@gmail.com> >> wrote: >> >>> Hi Team >>> >>> We have encountered a problem in our testing environment. >>> I have a scenario where I am running a similarity match for an address >>> I have created a table with following number of records >>> 1603423 >>> >>> We are using pg_similarity extension in postgresql version is 13. >>> >>> And I have created GIN index (since i am using pg_similarity) library >>> jaccard similarity method >>> when I run the Explain analyze >>> EXPLAIN ANALYZE select complete_address, jaccard(complete_address, 'raj >>> nagar ext near ajnara integrity up ghaziabad 201017') as qsim from address >>> where complete_address~??'raj nagar ext near ajnara integrity up ghaziabad >>> 201017' order by qsim DESC; >>> QUERY PLAN >>> >>> Sort (cost=5856.13..5860.14 rows=1603 width=93) (actual >>> time=12101.194..12101.197 rows=6 loops=1) >>> >>> Sort Key: (jaccard(complete_address, 'raj nagar ext near ajnara >>> integrity up ghaziabad 201017'::text)) DESC >>> Sort Method: quicksort Memory: 25kB >>> -> Bitmap Heap Scan on address (cost=172.43..5770.80 rows=1603 width=93) >>> (actual time=3516.233..12101.172 rows=6 loops=1) >>> Recheck Cond: (complete_address ~?? 'raj nagar ext near ajnara integrity >>> up ghaziabad 201017'::text) >>> Rows Removed by Index Recheck: 1039186 >>> Heap Blocks: exact=58018 >>> -> Bitmap Index Scan on address_complete_address_idx1 (cost=0.00..172.02 >>> rows=1603 width=0) (actual time=256.037..256.037 rows=1039192 loops=1) >>> Index Cond: (complete_address ~?? 'raj nagar ext near ajnara integrity >>> up ghaziabad 201017'::text) >>> Planning Time: 0.141 ms >>> Execution Time: 12101.245 ms >>> (11 rows) >>> it took 12 seconds >>> following are my pgconf file settings >>> shared buffer as 4GB >>> work_mem 256 MB >>> maintenence_work_mem 512MB >>> autovacuum_work_mem 20MB >>> My index definition is this "address_complete_address_idx1" gin >>> (complete_address gin_similarity_ops) >>> >>> It is taking the index correctly. But why it took 12 seconds to process >>> I really don't understand. >>> >>> Please help. >>> >>> Thanks >>> C.R.Bala >>> >>