Resend. How to quickly compare the similarity of two tsvector? On Monday, March 2, 2015 11:01 PM, Patrick Dung <patrick_...@yahoo.com.hk> wrote:
Hello, I had a database with articles or attachment stored in bytea format.I also had a trigger: it insert/update the tsv column when a record is added/updated.The tsv column had a GIN index.With this setting, I can do very fast keyword search on the tsv. Suppose I had a specific record (id=100000).How to list similar records based on ranking?In that case, I had to compare a tsvector with another tsvector. I had this SQL which make the original tsv as a text and then to tsquery, Then I can compare a tsv and a tsquery. SELECT ts_rank(i.tsv, replace(strip(original.tsv)::text, ' ', '|')::tsquery) as similarity, i.company, i.industry, i.post_timestamp, i.id FROM items i, (SELECT tsv, id FROM items WHERE id=100000) AS original WHERE i.id != original.id ORDER BY similarity; items table:id bigint company varchar industry varchardescription varcharpost_timestamp timestampattachment bytea tsv tsvector The problem is that this is very slow.Any comment? Thank and regards,Patrick