It shouldn't matter.

Btw, try a URL instead of a file path. I think the underlying loading mechanism uses Java File, so it could work.

On May 4, 2015 2:07 AM, "Zheng Lin Edwin Yeo" <edwinye...@gmail.com> wrote:
> Would like to check, will this method of splitting the synonyms into
> multiple files use up a lot of memory?
>
> I'm trying it with about 10 files and that collection is not able to be
> loaded due to insufficient memory.
>
> Although currently my machine only has 4GB of memory, I only have
> 500,000 records indexed, so not sure if there's a significant impact in the
> future (even with larger memory) when my index grows and other things like
> faceting, highlighting, and carrot tools are implemented.
>
> Regards,
> Edwin
>
> On 1 May 2015 at 11:08, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>
> > Thank you for the info. Yup this works. I found out that we can't load
> > files that are more than 1MB into zookeeper, as it happens to any file
> > that's larger than 1MB in size, not just the synonyms files.
> > But I'm not sure if there will be an impact to the system, as the number
> > of synonym text files can potentially grow up to more than 20 since my
> > sample synonym file size is more than 20MB.
> >
> > Currently I only have less than 500,000 records indexed in Solr, so not
> > sure if there will be a significant impact as compared to one which has
> > millions of records.
> > Will try to get more records indexed and will update here again.
> >
> > Regards,
> > Edwin
> >
> > On 1 May 2015 at 08:17, Philippe Soares <soa...@genomequest.com> wrote:
> >
> >> Split your synonyms into multiple files and set the SynonymFilterFactory
> >> with a comma-separated list of files, e.g.:
> >> synonyms="syn1.txt,syn2.txt,syn3.txt"
> >>
> >> On Thu, Apr 30, 2015 at 8:07 PM, Zheng Lin Edwin Yeo
> >> <edwinye...@gmail.com> wrote:
> >>
> >> > Just to populate it with the general synonym words. I've managed to
> >> > populate it with some source online, but is there a limit to what it
> >> > can contain?
> >> >
> >> > I can't load the configuration into zookeeper if the synonyms.txt file
> >> > contains more than 2100 lines.
> >> >
> >> > Regards,
> >> > Edwin
> >> >
> >> > On 1 May 2015 05:44, "Chris Hostetter" <hossman_luc...@fucit.org> wrote:
> >> >
> >> > > : There is a possible solution here:
> >> > > : https://issues.apache.org/jira/browse/LUCENE-2347 (Dump WordNet to
> >> > > : SOLR Synonym format).
> >> > >
> >> > > If you have WordNet synonyms you don't need any special code/tools to
> >> > > convert them -- the current solr.SynonymFilterFactory supports wordnet
> >> > > files (just specify format="wordnet")
> >> > >
> >> > > : > > Does anyone know any faster method of populating the
> >> > > : > > synonyms.txt file instead of manually typing in the words into
> >> > > : > > the file, which there could be thousands of synonyms around?
> >> > >
> >> > > populate from what? what is the source of your data?
> >> > >
> >> > > the default solr synonym file format is about as simple as it could
> >> > > possibly be -- pretty trivial to generate it from scripts -- the hard
> >> > > part is usually selecting the synonym data you want to use and parsing
> >> > > whatever format it is already in.
> >> > >
> >> > > -Hoss
> >> > > http://www.lucidworks.com/
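
For reference, the splitting step suggested in the quoted thread (several synonym files, each below the 1MB size at which ZooKeeper refused the upload, listed comma-separated in the synonyms attribute) could be scripted roughly as below. This is only a sketch under assumptions: the function names, the 900,000-byte safety margin, and the output directory are made up for illustration, and the syn1.txt, syn2.txt naming just follows the example given in the thread. It keeps every synonym line whole and does nothing about the memory question raised above.

```python
from pathlib import Path


def split_synonyms(lines, max_bytes=900_000):
    """Group synonym lines into chunks whose UTF-8 size stays within max_bytes.

    max_bytes is an assumed safety margin below the 1MB limit mentioned in
    the thread. A single line larger than max_bytes still gets its own chunk.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        line = line.rstrip("\n")
        n = len(line.encode("utf-8")) + 1  # +1 for the newline written back
        if current and size + n > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += n
    if current:
        chunks.append(current)
    return chunks


def write_chunks(chunks, out_dir, prefix="syn"):
    """Write each chunk to <prefix><n>.txt and return the comma-separated list."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for i, chunk in enumerate(chunks, start=1):
        name = f"{prefix}{i}.txt"
        (out_dir / name).write_text("\n".join(chunk) + "\n", encoding="utf-8")
        names.append(name)
    return ",".join(names)


if __name__ == "__main__":
    # Hypothetical paths: a large synonyms.txt in the current directory.
    source = Path("synonyms.txt")
    if source.exists():
        with source.open(encoding="utf-8") as fh:
            value = write_chunks(split_synonyms(fh), "synonyms_split")
        # The printed value is what would go into synonyms="..."
        print(f'synonyms="{value}"')
```

The returned string is meant to be pasted into the synonyms attribute of the SynonymFilterFactory; the generated files still have to be uploaded to ZooKeeper together with the rest of the configuration.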