Thanks a ton, Robert.
I checked out the latest nightly and changed the following in my
solrconfig.xml:
<luceneMatchVersion>LUCENE_33</luceneMatchVersion>
to
<luceneMatchVersion>LUCENE_40</luceneMatchVersion>
The new SynonymFilter loaded all the 1.9 million lines of synonyms in less
than 5 minutes! Awesome!
Thanks to all who developed this super duper fast synonym filter!
On Thu, Aug 4, 2011 at 5:01 PM, Robert Muir rcm...@gmail.com wrote:
https://issues.apache.org/jira/browse/LUCENE-3233
On Thu, Aug 4, 2011 at 7:24 PM, Arun Atreya my.2.pai...@gmail.com wrote:
Hello,
I would like to know the best way to load a huge synonym list into Solr.
I would like to do concept indexing (a.k.a category indexing) with Solr. For
example, I want to be able to index all cities and be able to search for all
of them using a special keyword, say 'CONCEPTcity', where 'CONCEPTcity' will
match anything that IS-A city, as specified in the index_synonyms.txt file. I
believe the best way to do this is via the SynonymFilterFactory and do
index-time synonym expansion. Or is there a better alternative?
I would still like to keep the original city names and do not want to
replace them with 'CONCEPTcity', so if someone searches for 'Lake', the city
name 'Salt Lake City' still matches. Also, obviously, I do not want two
different city names to be synonyms of each other.
Is the correct way to specify the index_synonyms.txt file like this?
-
CONCEPTcity, Salt Lake City
CONCEPTcity, New York
CONCEPTcity, San Jose
.
.
.
-
and then keep
expand=true
for SynonymFilterFactory?
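For illustration only (this is not something confirmed elsewhere in the thread): the Solr synonyms file format also accepts explicit one-way mappings written with `=>`. A line such as `Salt Lake City => Salt Lake City, CONCEPTcity` keeps the original name and adds the concept token, without relying on `expand` and without making two city names equivalent to each other. The sketch below generates such lines from a list of names. The function name `concept_synonym_lines` and the sample list are hypothetical, and names that themselves contain commas are not handled.

```python
def concept_synonym_lines(concept, names):
    """Yield one explicit mapping per name: 'name => name, concept'.

    Hypothetical helper for illustration. Blank entries and exact
    duplicates are skipped; names containing commas are NOT escaped.
    """
    seen = set()
    for raw in names:
        name = raw.strip()
        if not name or name in seen:
            continue
        seen.add(name)
        # Left side: the text to match. Right side: what gets indexed,
        # i.e. the original name plus the concept token.
        yield f"{name} => {name}, {concept}"


if __name__ == "__main__":
    # Sample data taken from the example lines in the question.
    cities = ["Salt Lake City", "New York", "San Jose"]
    for line in concept_synonym_lines("CONCEPTcity", cities):
        print(line)
```

Each generated line would go into index_synonyms.txt as-is. Whether this loads faster or matches multi-word names as intended depends on the Solr version and analyzer chain, so it is worth testing on a small file first.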
I tried to load a synonym file with 10K entries like this, and Solr/Jetty
took a few seconds to start, but if I try to load a synonym file with 1M+
entries, then it is taking a long time. What is the best way to do this?
Thanks,
Arun.
--
lucidimagination.com