Re: Using Chinese / How to ?
1. Modify your schema.xml, e.g.:

   <fieldType name="text_cn" class="solr.TextField">
     <analyzer class="ChineseAnalyzer"/>
   </fieldType>

2. Add your field:

   <field name="yourfield" type="text_cn" indexed="true" stored="true"/>

3. Add your analyzer jar to {solr_dir}\lib
4. Rebuild Solr and you will find the new build in {solr_dir}\dist
5. Follow the tutorial to set up Solr
6. Open the Solr admin page in your browser and use the analysis page; it will show you how a word is analyzed and which analyzer is used -- regards j.L ( I live in Shanghai, China)
Re: Dismax handler phrase matching question
On Wed, Jun 3, 2009 at 1:59 AM, anuvenk anuvenkat...@hotmail.com wrote: I have to search over multiple fields, so passing everything in the 'q' might not be neat. Can something be done with facet.query to accomplish this? I'm using the facet parameters. I'm not familiar with Java, so I'm not sure if a function query could be used to accomplish this. Any other thoughts? I don't think facet.query and function queries have anything to do with this. Using the dismax params seems to be the right way. -- Regards, Shalin Shekhar Mangar.
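As a sketch of what Shalin suggests: with dismax, the list of fields lives in the qf parameter of the handler configuration, so q only carries the user's terms. The handler name, field names, and boosts below are made up for illustration, not taken from the original mail:

```xml
<!-- solrconfig.xml: hypothetical dismax handler searching several fields.
     Field names and boosts here are illustrative. -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- search these fields, weighting title matches higher -->
    <str name="qf">title^2.0 description keywords^0.5</str>
  </lst>
</requestHandler>
```

A query then becomes q=ipod+battery with no field prefixes; dismax expands it across all the qf fields.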
Re: Phrase query search returns no result
Yes, Erick, I did. Actually the course of events was as follows. I started with the example config files (solrconfig.xml, schema.xml) and added my own fields. In my search I have 2 clauses: one for a phrase and one for a set of keywords. And from the very beginning it worked fine. Until, on the second day, one phrase (It was as long as a tree) gave me back the wrong response. Trying to find the reason, I started changing different parameters one by one (field types - from text to string and back, copyFields, analyzers, etc.). The result: I came to the situation where all the queries returned only wrong responses. During my research I deleted all indexed xml files several times, which, in theory, should have cleaned up the index itself (as I understand it). And then I decided to start all over again. The only two differences from the very beginning were that I turned the StopWordsFilter off (although I did it several times while playing with params; besides, the phrase that initially caused troubles doesn't consist only of stop words) and also that I commented out the copyField declarations for my own fields. I'm still wondering what happened. Thank you, Sergey

Erick Erickson wrote: Did you by any chance change your schema? Rename a field? Change your analyzers? etc? between the time you originally generated your index and blowing it away? I'm wondering if blowing away your index and regenerating just caused any changes in how you index/search to get picked up... Best Erick

On Tue, Jun 2, 2009 at 3:28 PM, SergeyG sgoldb...@mail.ru wrote: Hmmm... It looks a bit like magic. After 3 days of experimenting with various parameters and getting only wrong results, I deleted all the indexed data and left the minimum set of parameters: qs=default (I omitted it), StopWords=off (StopWordsFilter was commented out), no copyFields, requestHandler=standard. And guess what - it started producing the expected results! :) So for me the question remains: what was the cause of all the previous trouble? 
Anyway, thanks for the discussion. SergeyG wrote: Actually, my "phrase here"~0 (for an exact match) didn't work. I tried, just to experiment, putting qs=100. Otis Gospodnetic wrote: And your "phrase here"~100 works? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: SergeyG sgoldb...@mail.ru To: solr-user@lucene.apache.org Sent: Tuesday, June 2, 2009 11:17:23 AM Subject: Re: Phrase query search returns no result Thanks, Otis. Checking for the stop words was the first thing I did after getting the empty result. Not all of those words are in the stopwords.txt file. Then, just for experimenting purposes, I commented out the StopWordsAnalyser during indexing and reindexed. But the phrase was not found again. Sergey

Otis Gospodnetic wrote: Your stopwords were removed during indexing, so if all those terms were stopwords, and they likely were, none of them exist in the index now. You can double-check that with Luke. You need to remove stopwords from the index-time analyzer, too, and then reindex. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message From: SergeyG To: solr-user@lucene.apache.org Sent: Tuesday, June 2, 2009 9:57:17 AM Subject: Phrase query search returns no result Hi, I'm trying to implement a full-text search but can't get the right result with a phrase query search. The field I search through was indexed as a text field. The phrase was It was as long as a tree. During both indexing and searching the StopWordsFilter was on. For a search I used these settings: dismax explicit title author category content id,title,author,isbn,category,content,score 100 content But the returned docs list was empty. Using the Solr Admin console for debugging showed that parsedquery=+() (). Switching the StopwordsFilter off during searching didn't help either. Am I missing something? 
Thanks, Sergey -- View this message in context: http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html Sent from the Solr - User mailing list archive at Nabble.com.
How contrib for solr memcache query cache
Hi all: I want to contribute a memcached implementation of the Solr cache (only the query result cache is tested) as a patch for Solr 1.3. http://code.google.com/p/solr-side/issues/detail?id=1&can=1 solr-memcache.zip http://solr-side.googlecode.com/files/solr-memcache.zip

=readme.txt=
MemcachedCache replaces the Solr queryResultCache (default LRUCache). Configure solrconfig.xml to use solr-memcache: add newSearcher and firstSearcher listeners, such as:

  <listener event="newSearcher" class="solr.MemcachedCache"/>
  <listener event="firstSearcher" class="solr.MemcachedCache"/>

The listener is used only to get the index version, which goes into the memcached key. indexVersion is a static long field of MemcachedCache.java. The memcached key is built as keyPrefix + indexVersion + "-" + originalKey.hashCode(), where originalKey is a QueryResultKey.

  <!-- MemcachedCache params:
       memcachedHosts (required), ","-split.
       name (optional), no default.
       expTime (optional), default 1800 s (= 30 minutes).
       defaultPort (optional), default 11211.
       keyPrefix (optional), default "". -->
  <queryResultCache class="solr.MemcachedCache"
      memcachedHosts="192.168.0.100,192.168.0.101:1234,192.168.0.103"
      expTime="21600" defaultPort="11511" keyPrefix="shard-1-"/>

Dependency jars: memcached-2.2.jar, spy-2.4.jar.

To apply solr-memcache.patch to Solr 1.3: download and unzip Solr to d:/apache-solr-1.3.0, copy patch-build.xml and solr-memcache.patch there, then run:

  D:\apache-solr-1.3.0>ant -f patch-build.xml -Dpatch.file=solr-memcache.patch
  Buildfile: patch-build.xml
  apply-patch:
  [patch] patching file src/java/org/apache/solr/search/DocSet.java
  BUILD SUCCESSFUL
  Total time: 0 seconds

Check whether d:/apache-solr-1.3.0/contrib/solr-memcache exists; if it does not, unzip solr-memcache.zip to that dir. Then build the dist:

  D:\apache-solr-1.3.0>ant dist
  ...

and look for D:\apache-solr-1.3.0\dist\apache-solr-memcache-1.3.0.jar.
___
Fun greeting cards are waiting for you to send; Yahoo Mail greeting cards are newly online! http://card.mail.cn.yahoo.com/
Re: How contrib for solr memcache query cache
Please raise this as an issue in Jira https://issues.apache.org/jira/browse/SOLR and let us see what others think about this.

On Wed, Jun 3, 2009 at 1:14 PM, chenl...@yahoo.com.cn wrote: Hi all: I want to contrib memcache implement solr cache (only test query result cache) patch for solr 1.3 ...

-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: fq vs. q
It's definitely not proper documentation but maybe it can give you a hand: http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ Martin Davidsson-2 wrote: I've tried to read up on how to decide, when writing a query, what criteria goes in the q parameter and what goes in the fq parameter, to achieve optimal performance. Is there some documentation that describes how each field is treated internally, or even better, some kind of rule of thumb to help me decide how to split things up when querying against one or more fields. In most cases, I'm looking for exact matches but sometimes an occasional wildcard query shows up too. Thank you! -- Martin -- View this message in context: http://www.nabble.com/fq-vs.-q-tp23845282p23847845.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: fq vs. q
wow! that was a good read!!! On Wed, Jun 3, 2009 at 2:23 PM, Marc Sturlese marc.sturl...@gmail.comwrote: It's definitely not proper documentation but maybe can give you a hand: http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ Martin Davidsson-2 wrote: I've tried to read up on how to decide, when writing a query, what criteria goes in the q parameter and what goes in the fq parameter, to achieve optimal performance. Is there some documentation that describes how each field is treated internally, or even better, some kind of rule of thumb to help me decide how to split things up when querying against one or more fields. In most cases, I'm looking for exact matches but sometimes an occasional wildcard query shows up too. Thank you! -- Martin -- View this message in context: http://www.nabble.com/fq-vs.-q-tp23845282p23847845.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How contrib for solr memcache query cache
https://issues.apache.org/jira/browse/SOLR-1197

On Wed, Jun 3, 2009, chenl...@yahoo.com.cn wrote: Subject: How contrib for solr memcache query cache To: solr-user@lucene.apache.org Hi all: I want to contrib memcache implement solr cache (only test query result cache) patch for solr 1.3 ...
indexing/crawling HTML + solr
Hi! To be short, where to start with the subject? Any pointers to some [semi-]functional solutions that crawl the web like a normal crawler, take care of HTML parsing, etc., and feed the crawled stuff to Solr as documents via <add>? regards!
Alphabetical index for faceting
Hello, My goal is to get an index for alphabetical faceting of titles. For this I'm trying to define a fieldType meant to index the first letter of the text, with stopwords removed. My problem is that without WordDelimiterFilterFactory stopwords are not removed, and with it I end up with several tokens (and I'd like to keep just the first one). For example, the string The Curse of Monkey Island should be indexed as c. Here is my field type definition as of now:

<fieldType name="alphabetical" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_fr.txt"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([0-9a-z]).*" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>

With my example it gives 3 tokens: c, m, i. I have not been able to find any documentation related to what I want to do (wrong keywords in google?). At this point I'm beginning to think that I will have to write a custom filter that would replace the PatternReplaceFilterFactory: it would keep the first character of the first token and discard everything else. Unfortunately I have not programmed in java for years, so I try to avoid that solution if possible. And since I don't see my need as something uncommon, I am wondering what I am missing. Any idea? -- Bertrand Mathieu
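One way to avoid writing a custom filter, sketched here as an untested assumption rather than a verified config: tokenize on whitespace first so StopFilterFactory can see individual words, then keep only the first surviving token before reducing it to its first letter. This relies on solr.LimitTokenCountFilterFactory, which ships with later Solr releases and may not be available in the version discussed here:

```xml
<!-- Sketch (untested): "The Curse of Monkey Island" -> whitespace tokens ->
     lowercased -> stopwords removed -> first remaining token "curse" kept ->
     reduced to its first letter "c". -->
<fieldType name="alphabetical" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_fr.txt"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="1"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([0-9a-z]).*" replacement="$1" replace="all"/>
  </analyzer>
</fieldType>
```

Note the StopFilter runs after LowerCaseFilter so that a leading "The" matches a lowercase stopword list.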
Re: indexing/crawling HTML + solr
Gena, Besides Droids (simpler, smaller components you can put together) there is also Nutch, a bigger beast for large-scale crawling that indexes crawled pages into Solr - http://lucene.apache.org/nutch . Otis - Original Message From: Gena Batsyan gbat...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 6:09:36 AM Subject: indexing/crawling HTML + solr Hi! to be short, where to start with the subject? Any pointers to some [semi-]functional solutions that crawl the web as a normal crawler, take care about html parsing, etc, and feed the crawled stuff as solr-documents via <add>? regards!
Re: How to avoid space on facet field
Anshuman, thanks for your input. I will try that; I can understand what you are trying. Marcus, I did not understand how your KeywordTokenizer approach works. Is it that I have to define a separate field type, like what we have in the example schema, and call that field? This is what I came up with:

<fieldType name="facet_tex" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The TrimFilter removes any leading or trailing whitespace -->
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all"/>
  </analyzer>
</fieldType>

Thanks Boney

From: Marc Sturlese marc.sturl...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 3:45:49 AM Subject: Re: How to avoid space on facet field You can configure a facet_text instead of the normal text type. There you use KeywordTokenizer instead of StandardTokenizer. One of the advantages of using it instead of string is that it will allow you to use synonyms, stopwords and filters and all the properties of an analyzer.

Anshuman Manur wrote: Hey, From what you have written I'm guessing that in your schema.xml file, you have defined the field manu to be of type text, which is good for keyword searches, as the text type tokenizes on whitespace, i.e. Dell Inc. is indexed as dell, inc., so keyword searches match either dell or inc. But when you want to facet on a particular field, you want exact matches regardless of whitespace in between. In such cases it's a good idea to use the string type. 
Let me illustrate with an example based on my settings. Here are my fields:

<!-- Core Fields -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="name" type="text" indexed="true" stored="true"/>
<field name="manu" type="text" indexed="true" stored="true"/>
<field name="sport" type="text" indexed="true" stored="true"/>
<field name="type" type="text" indexed="true" stored="true"/>
<field name="desc" type="text" indexed="true" stored="true"/>
<field name="ldesc" type="text" indexed="true" stored="true"/>

<!-- default text field for searching -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<!-- exact string fields for faceting -->
<field name="sport_exact" type="string" indexed="true" stored="false"/>
<field name="manu_exact" type="string" indexed="true" stored="false"/>
<field name="type_exact" type="string" indexed="true" stored="false"/>

<copyField source="manu" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="sport" dest="text"/>
<copyField source="desc" dest="text"/>
<copyField source="ldesc" dest="text"/>
<copyField source="type" dest="text"/>
<copyField source="manu" dest="manu_exact"/>
<copyField source="sport" dest="sport_exact"/>
<copyField source="type" dest="type_exact"/>

So, when doing keyword searches I use the field named "text" to search in all the fields, as I copyField all the fields onto the field named text. But for faceting I use the exact fields, which are of type string and don't split on whitespace. Anshu

On Wed, Jun 3, 2009 at 1:50 AM, Bny Jo bny...@yahoo.com wrote: Hello, I am wondering why Solr is returning a manufacturer name field (Dell, Inc) as Dell in one result and Inc in another result. Is there a way to facet a field which has spaces or delimiters in it? 
query.addFacetField("manu");
query.setFacetMinCount(1);
query.setIncludeScore(true);
List<FacetField> facetFieldList = qr.getFacetFields();
for (FacetField facetField : facetFieldList) {
    System.out.println(facetField.toString() + " Manufactures");
}

And it returns: [manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1), viewson (1), vizo (1)]] -- View this message in context: http://www.nabble.com/How-to-avoid-space-on-facet-field-tp23840037p23847742.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to avoid space on facet field
Yeah, that's the point. Once you have this, you can use copyField as was written above with the string example.

Bny Jo wrote: Anshuman, thanks for your input. I will try that, I can understand what you are trying. Marcus, I did not understand how your KeywordTokenizer work. ...

-- View this message in context: http://www.nabble.com/How-to-avoid-space-on-facet-field-tp23840037p23850245.html Sent from the Solr - User mailing list archive at Nabble.com.
Strange behaviour with copyField
I've been hitting my head against a wall all morning trying to figure this out and haven't managed to get anywhere, and wondered if anybody here can help. I have defined a field type:

<fieldType name="text_au" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

I have two fields:

<field name="au" type="text_au" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="author" type="text_au" indexed="true" stored="false" multiValued="true"/>

and a copyField line:

<copyField source="au" dest="author"/>

The idea is to allow searching for authors, so a search for author:(Hobbs A.U.) will match the au field value Hobbs A. U. (notice the space). However the query au:(Hobbs A.U.) matches and the query author:(Hobbs A.U.) does not. Any ideas? I'm using a Solr 1.4 snapshot Regards James
Re: Keyword Density
So, is there an ability to perform filtering as I described? On Mon, Jun 1, 2009 at 22:24, Alex Shevchenko caeza...@gmail.com wrote: But I don't need to sort using this value. I need to cut results where this value (for the particular term of the query!) is not in some range. On Mon, Jun 1, 2009 at 22:20, Walter Underwood wunderw...@netflix.comwrote: That is the normal relevance scoring formula in Solr and Lucene. It is a bit fancier than that, but you don't have to do anything special to get that behavior. Solr also uses the inverse document frequency (rarity) of each word for weighting. Look up tf.idf for more info. wunder On 6/1/09 11:46 AM, Alex Shevchenko caeza...@gmail.com wrote: Something like that. Just not 'appears N times' but 'number of times foo appears / total number of words' compared against some value. On Mon, Jun 1, 2009 at 21:00, Otis Gospodnetic otis_gospodne...@yahoo.comwrote: Hi Alex, Could you please provide an example of this? Are you looking to do something like find all docs that match name:foo and where foo appears N times (in the name field) in the matching document? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Alex Shevchenko caeza...@gmail.com To: solr-user@lucene.apache.org Sent: Monday, June 1, 2009 1:32:49 PM Subject: Re: Keyword Density Hi All, Is there a way to perform filtering based on keyword density? Thanks -- Alex Shevchenko -- Alex Shevchenko -- Alex Shevchenko
Re: NPE in dataimport.DebugLogger.peekStack (DIH Development Console)
This is fixed in trunk. The next nightly build will have this fix. Thanks! On Tue, Jun 2, 2009 at 9:49 PM, Steffen B. s.baumg...@fhtw-berlin.dewrote: Glad to hear that it's not a problem with my setup. Thanks for taking care of it! :) Shalin Shekhar Mangar wrote: On Tue, Jun 2, 2009 at 8:06 PM, Steffen B. s.baumg...@fhtw-berlin.dewrote: I'm trying to debug my DI config on my Solr server and it constantly fails with a NullPointerException: Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter doFullImport SEVERE: Full Import failed java.lang.NullPointerException at org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78) at org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98) at org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248) at... Running a normal full-import works just fine, but whenever I try to run the debugger, it gives me this error. I'm using the most recent Solr nightly build (2009-06-01) and the method in question is: private DebugInfo peekStack() { return debugStack.isEmpty() ? null : debugStack.peek(); } I'm using a DI config that has been working fine for several previous builds, so that shouldn't be the problem... any ideas what the problem could be? A previous commit to change the EntityProcessor API broke this functionality. I'll open an issue and give a patch. -- Regards, Shalin Shekhar Mangar. -- View this message in context: http://www.nabble.com/NPE-in-dataimport.DebugLogger.peekStack-%28DIH-Development-Console%29-tp23833878p23835897.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
Re: Strange behaviour with copyField
James, I don't see the error, but this is exactly what Solr Admin's analysis page will quickly help you with! :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Grant james.gr...@semantico.com To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 8:09:10 AM Subject: Strange behaviour with copyField I've been hitting my head against a wall all morning trying to figure this out and haven't managed to get anywhere and wondered if anybody here can help. I have defined a field type, two fields, and a copyField line. The idea is to allow searching for authors so a search for author:(Hobbs A.U.) will match the au field value Hobbs A. U. (notice the space). However the query au:(Hobbs A.U.) matches and the query author:(Hobbs A.U.) does not. Any ideas? I'm using a Solr 1.4 snapshot Regards James
Re: Keyword Density
I don't think this is possible without changing Solr. Or maybe it's possible with a custom Search Component that looks at all hits and checks the df (document frequency) for a term in each document? Sounds like a very costly operation... Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Alex Shevchenko caeza...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 9:57:29 AM Subject: Re: Keyword Density So, is there an ability to perform filtering as I described? On Mon, Jun 1, 2009 at 22:24, Alex Shevchenko wrote: But I don't need to sort using this value. I need to cut results where this value (for the particular term of the query!) is not in some range. On Mon, Jun 1, 2009 at 22:20, Walter Underwood wrote: That is the normal relevance scoring formula in Solr and Lucene. It is a bit fancier than that, but you don't have to do anything special to get that behavior. Solr also uses the inverse document frequency (rarity) of each word for weighting. Look up tf.idf for more info. wunder On 6/1/09 11:46 AM, Alex Shevchenko wrote: Something like that. Just not 'appears N times' but 'number of times foo appears / total number of words' compared against some value. On Mon, Jun 1, 2009 at 21:00, Otis Gospodnetic wrote: Hi Alex, Could you please provide an example of this? Are you looking to do something like find all docs that match name:foo and where foo appears N times (in the name field) in the matching document? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Alex Shevchenko To: solr-user@lucene.apache.org Sent: Monday, June 1, 2009 1:32:49 PM Subject: Re: Keyword Density Hi All, Is there a way to perform filtering based on keyword density? Thanks -- Alex Shevchenko -- Alex Shevchenko -- Alex Shevchenko
filter on millions of IDs from external query
I am working with an index of ~10 million documents. The index does not change often. I need to perform some external search criteria that will return some number of results -- this search could take up to 5 mins and return anywhere from 0-10M docs. I would like to use the output of this long-running query as a filter in Solr. Any suggestions on how to wire this all together? My initial ideas (I have not implemented anything yet -- just want to check with you all before starting down the wrong path) are to: * assume the index will always be optimized; in this case every id maps to a Lucene int id. * store the results of the expensive query as a bitset. * use the stored bitset in the Lucene query. I'm sure I can get this to work, but it seems kinda ugly (and brittle). Any better thoughts on how to do this? If we had some sort of external tagging interface, each document could just get tagged with what query it matches. thanks ryan
MoreLikeThis query
Hi, I'm adding the MoreLikeThis functionality to my search. 1. Do I understand it right that the query q=id:1&mlt=true&mlt.fl=content will bring back documents in which the most important terms of the content field are partly the same as those of the content field of the doc with id=1? 2. Also, the full request url for the above mentioned query would be: solr_base_url/select?q=id:1&mlt=true&mlt.fl=content which is equivalent to the query: solr_base_url/mlt?q=id:1&mlt.fl=content But while the former query would be handled by the StandardRequestHandler and executed by calling server.query(query), the latter query is handled by the MoreLikeThisRequestHandler and there is no specific method to execute it. Is this right? And if this is the case, how can the latter query be triggered? Thanks, Sergey -- View this message in context: http://www.nabble.com/MoreLikeThis-query-tp23856526p23856526.html Sent from the Solr - User mailing list archive at Nabble.com.
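On the question of how the /mlt URL gets handled at all: the MoreLikeThisHandler has to be registered in solrconfig.xml under that path before requests to it can be triggered. A minimal sketch follows; the defaults shown are illustrative assumptions, not taken from Sergey's setup:

```xml
<!-- solrconfig.xml: registers the MLT handler at /mlt.
     The mlt.fl, mlt.mintf, and mlt.mindf values here are illustrative. -->
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">content</str>   <!-- field mined for interesting terms -->
    <str name="mlt.mintf">1</str>      <!-- min term frequency in the source doc -->
    <str name="mlt.mindf">1</str>      <!-- min document frequency across the index -->
  </lst>
</requestHandler>
```

With this in place, solr_base_url/mlt?q=id:1&mlt.fl=content is served by the registered handler just like any other request; no special client-side method is needed.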
Solr search by segment
Hi, I have an index in which I am always indexing the same documents (re-indexing). So I need to search for them by their segment number. When I ask solrj for the documents by their segment [for example: solrj.query("segment:20090603142546");], it doesn't return anything. I checked the schema.xml and the field segment is stored and indexed. What should I do? I am looking forward to your help. Thanks. -- View this message in context: http://www.nabble.com/Solr-search-by-segment-tp23856569p23856569.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Seattle / PNW Hadoop + Lucene User Group?
Hey everyone! I just wanted to give a BIG THANKS to everyone who came. We had over a dozen people, and a few got lost at UW :) [I would have sent this update earlier, but I flew to Florida the day after the meeting]. If you didn't come, you missed quite a bit of learning and topics, such as:
- Building a Social Media Analysis company on the Apache Cloud Stack
- Cancer detection in images using Hadoop
- Real-time OLAP
- Scalable Lucene using Katta and Hadoop
- Video and Network Flow
- Custom Ranking in Lucene
I'm going to update our wiki with the topics, a few questions raised, and the lessons we've learned. The next meetup will be June 24th. Be there, or be... boring :) Cheers, Bradford On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens bradfordsteph...@gmail.com wrote: Greetings, Would anybody be willing to join a PNW Hadoop and/or Lucene User Group with me in the Seattle area? I can donate some facilities, etc. -- I also always have topics to speak about :) Cheers, Bradford
Re: fq vs. q
On Wed, Jun 3, 2009 at 1:53 AM, Marc Sturlese marc.sturl...@gmail.com wrote: It's definitely not proper documentation but maybe it can give you a hand: http://www.derivante.com/2009/04/27/100x-increase-in-solr-performance-and-throughput/ Martin Davidsson-2 wrote: I've tried to read up on how to decide, when writing a query, what criteria go in the q parameter and what go in the fq parameter, to achieve optimal performance. Is there some documentation that describes how each field is treated internally, or even better, some kind of rule of thumb to help me decide how to split things up when querying against one or more fields? In most cases I'm looking for exact matches, but sometimes an occasional wildcard query shows up too. Thank you! -- Martin -- View this message in context: http://www.nabble.com/fq-vs.-q-tp23845282p23847845.html Sent from the Solr - User mailing list archive at Nabble.com. Thanks, I'd seen that article too. I totally agree that it's worth understanding how things are treated under the hood. That's the kind of literature I'm looking for, I guess. Given that article, I wasn't sure what the query would look like if I need to query against multiple fields. Let's say I have a name field and a brand field and I want to find the Apple iPod. Using only the 'q' param the query would look like select?q=brand:Apple AND name:iPod Is there a better query format that utilizes the fq field? Thanks again -- Martin
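For the multi-field case above, one common split (a sketch, not from the article; field names as in the question) keeps the relevance-scored part in q and moves the exact-match constraint into fq, where it is cached in the filterCache and excluded from scoring:

```
# everything in q -- both clauses are scored, nothing is cached as a filter:
/select?q=brand:Apple AND name:iPod

# exact-match clause moved to fq -- cached independently, reusable across queries:
/select?q=name:iPod&fq=brand:Apple
```

The second form tends to pay off when the same fq constraint (here brand:Apple) recurs across many different q values.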
Re: Solr vs Sphinx
Hi, Could you please start a new thread? Thanks, Otis - Original Message From: sunnyfr johanna...@gmail.com To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 10:20:06 AM Subject: Re: Solr vs Sphinx Hi guys, I have worked for several months on Solr now, and really you provide quick answers ... and you're very nice to work with. But I've got a huge issue that I couldn't fix despite a lot of posts. My indexing takes one to two days to complete, for 8G of data indexed and 1.5M docs (OK, I have plenty of links in my table, but it takes such a long time). Second, I have to do an update every 20 minutes, but every update represents maybe 20,000 docs, and when I use replication I must replicate the whole new optimized index folder because I have too much updated data and too many segments would need to be generated and merged. So I lose my cache and my CPU goes mad. And I can't get more than 20 requests/sec. Fergus McMenemie-2 wrote: Something that would be interesting is to share Solr configs for various types of indexing tasks. From a Solr configuration aimed at indexing web pages to one doing large amounts of text to one that indexes specific structured data. I could see those being posted on the wiki and helping folks who say I want to do X, is there an example?. I think most folks start with the example Solr install and tweak from there, which probably isn't the best path... Eric Yep, a Solr cookbook with lots of different example recipes. However, these would need to be very actively maintained to ensure they always represented best practice. While using Cocoon I made extensive use of the examples section of the Cocoon website. However, most of the (massive number of) examples represent obsolete Cocoon practice. Or there were four or five examples doing the same thing in different ways with no text explaining the pros/cons of the different approaches. This held me back as a newcomer and gave a bad impression of Cocoon. I was wondering about a performance hints page.
I was caught by an issue indexing CSV content where the use of overwrite=false made an almost 3x difference to my indexing speed. I still do not really know why! On May 15, 2009, at 8:09 AM, Mark Miller wrote: In the spirit of good defaults: I think we should change the Solr highlighter to highlight phrase queries by default, as well as prefix, range, and wildcard constant-score queries. It's awkward to have to tell people you have to turn those on. I'd certainly prefer to have to turn them off if I have some limitation, rather than on. Yep, I agree, all whizzy new features should ideally be on by default unless there is a significant performance penalty. It is not enough to ship a default solrconfig.xml with the feature on; it has to be on by default inside the code. - Mark - Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal Fergus -- View this message in context: http://www.nabble.com/Solr-vs-Sphinx-tp23524676p23852364.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Token filter on multivalue field
Hello, It's ugly, but the first thing that came to mind was ThreadLocal. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: David Giffin da...@giffin.org To: solr-user@lucene.apache.org Sent: Wednesday, June 3, 2009 1:57:42 PM Subject: Token filter on multivalue field Hi There, I'm working on a unique token filter to eliminate duplicates on a multivalue field. My filter works properly for a single-value field. It seems that a new TokenFilter is created for each value in the multivalue field. I need to maintain a set of used tokens across all of the values in the multivalue field. Is there a good way to do this? Here is my current code:

public class UniqueTokenFilter extends TokenFilter {
    private final Set<String> words = new HashSet<String>();

    public UniqueTokenFilter(TokenStream input) {
        super(input);
    }

    @Override
    public final Token next(Token in) throws IOException {
        // Set.add() returns false for duplicates, so each term is emitted only once.
        for (Token token = input.next(in); token != null; token = input.next(in)) {
            if (words.add(token.term())) {
                return token;
            }
        }
        return null;
    }
}

Thanks, David
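The root cause is that each value of a multivalued field is analyzed by its own filter instance, so per-instance state resets between values. Whatever carries the shared state (the ThreadLocal suggested above, or a collection handed in by the filter's factory), the dedup logic itself is small; a stdlib-only sketch with the per-value token streams modeled as plain lists:

```java
import java.util.*;

public class CrossValueDedup {
    // Each field value is analyzed by its own "stream"; sharing one Set
    // across them deduplicates tokens over the whole multivalued field.
    static List<String> dedupAcrossValues(List<List<String>> values) {
        Set<String> seen = new HashSet<>();   // shared across all values
        List<String> out = new ArrayList<>();
        for (List<String> tokens : values) {  // one "TokenStream" per value
            for (String t : tokens) {
                if (seen.add(t)) out.add(t);  // emit only the first occurrence
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> field = Arrays.asList(
            Arrays.asList("red", "green"),
            Arrays.asList("green", "blue"));
        System.out.println(dedupAcrossValues(field)); // [red, green, blue]
    }
}
```

In a real filter factory the shared Set would also need to be cleared at the start of each document, which is where the ThreadLocal approach gets ugly.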
Re: Strange behaviour with copyField
On Jun 3, 2009, at 5:09 AM, James Grant wrote: I've been hitting my head against a wall all morning trying to figure this out and haven't managed to get anywhere and wondered if anybody here can help. I have defined a field type

<fieldType name="text_au" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
  </analyzer>
</fieldType>

I have two fields

<field name="au" type="text_au" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="author" type="text_au" indexed="true" stored="false" multiValued="true"/>

I don't see the difference, as they are the same FieldType for each field, text_au. Is this a typo or am I missing something? and a copyField line

<copyField source="au" dest="author"/>

The idea is to allow searching for authors so a search for author:(Hobbs A.U.) will match the au field value Hobbs A. U. (notice the space). What would lower casing do for handling the space? However the query au:(Hobbs A.U.) matches and the query author:(Hobbs A.U.) does not. Any ideas? How are you indexing? -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Re: Solr search by segment
I should point out that I am running the nutch-solr integration, and the schema.xml is the same in Nutch and in Solr. -- View this message in context: http://www.nabble.com/Solr-search-by-segment-tp23856569p23859728.html Sent from the Solr - User mailing list archive at Nabble.com.
Which caches should use the solr.FastLRUCache
Hey there, Does anyone have any advice on which caches (filterCache, queryResultCache, documentCache, fieldValueCache) should be implemented using the solr.FastLRUCache in Solr 1.4, and what the pros/cons are vs. the solr.LRUCache? Thanks, Robert. -- View this message in context: http://www.nabble.com/Which-caches-should-use-the-solr.FastLRUCache-tp23860182p23860182.html Sent from the Solr - User mailing list archive at Nabble.com.
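For context, the cache implementation is selected per cache in solrconfig.xml via the class attribute; a hedged sketch (the sizes are illustrative, not recommendations). The usual guidance is that solr.FastLRUCache has cheaper, lock-free gets but costlier puts and evictions, so it tends to pay off on caches with high hit ratios such as the filterCache, while solr.LRUCache stays a safe default where evictions are frequent:

```xml
<!-- illustrative sizes; class is the attribute that switches implementations -->
<filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="32"/>
<documentCache    class="solr.LRUCache"     size="512" initialSize="512"/>
```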
Re: synonyms
I happened to revisit this post that I had started a long time back. I'm still using the same query-time synonyms. Now I want to be able to map cities to states in the synonyms, and I'm continuing to have this issue with the multi-word synonyms. Could you please explain what you've done to overcome this issue again? I didn't quite understand what HIER_FAMILIY_01, SYN_FAMILY_01 are. Thanks. lorenzo zhak wrote: Hi, I had to work with this kind of side effect regarding multiword synonyms. We installed Solr on our project that extensively uses synonyms, a big list that sometimes could bring out a wrong match like the one noticed by Anuvenk, for instance dui => drunk driving defense or dui,drunk driving defense,drunk driving law: a query for dui matches both dui => drunk driving defense and dui,drunk driving defense,drunk driving law. In order to prevent this kind of behavior I gave every synonyms family (meaning a single line in the file) a unique identifier, so the list looks like: dui = HIER_FAMILIY_01 drunk driving defense = HIER_FAMILIY_01 SYN_FAMILY_01, dui,drunk driving defense,drunk driving law. I also set the synonyms filter at index time with expand=false, and at query time with expand=false, so the matched synonyms (multi-word or single-word) in documents are replaced with their family identifier, and not with all the possibilities. Indexing with expand=true would add words to documents that could be matched alone, ignoring the fact that they belong to a multiword expression, and this could end up with a wrong match (a synonym mix) at query time. So in this way a query for dui will be changed by the synonym filter at query time into HIER_FAMILIY_01 or SYN_FAMILY_01, and documents that contain only single words like drunk, driving or law will not be matched, since only a document with the phrase drunk driving law would have been indexed with SYN_FAMILY_01.
The approach worked pretty well on our project and we did not notice any side effects on the searches; it only removes matched documents that were considered noise from the synonym-mix issue. I think it could be useful to add this kind of approach to the Solr synonyms filter section of the wiki. Cheers Laurent On Dec 2, 2007 3:41 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: Hi (changing to solr-user list) Yes it is, especially if the terms left of => are multi-spaced. Check out the Wiki; one page there explains this nicely. Otis - Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: anuvenk anuvenkat...@hotmail.com To: solr-...@lucene.apache.org Sent: Saturday, December 1, 2007 1:21:49 AM Subject: Re: synonyms Ideally, would it be a good idea to pass the index data through the synonyms filter while indexing? Also, say I have this mapping dui => drunk driving defense or dui,drunk driving defense,drunk driving law. Will matches for dui also bring up matches for drunk driving law (the whole phrase), or does it also bring up all matches for 'drunk', 'driving', 'law'? Yonik Seeley wrote: On Nov 30, 2007 5:39 PM, anuvenk anuvenkat...@hotmail.com wrote: Should data be re-indexed every time synonyms like word1,word2 or word1 => word2 are added to synonyms.txt Yes, if it changes the index (if it's used in the index analyzer as opposed to just the query analyzer). -Yonik -- View this message in context: http://www.nabble.com/synonyms-tf4925232.html#a14100346 Sent from the Solr - Dev mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/Re%3A-synonyms-tp14116132p23860862.html Sent from the Solr - User mailing list archive at Nabble.com.
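The family-identifier scheme described above can be written out as a synonyms.txt fragment (a reconstruction from the description; the identifier names are illustrative). With expand=false on both the index-time and query-time filters, every surface form collapses to the same opaque token, so partial matches on drunk, driving or law alone cannot fire:

```
# each family of equivalent phrases maps to one opaque identifier
dui => SYN_FAMILY_01
drunk driving defense => SYN_FAMILY_01
drunk driving law => SYN_FAMILY_01
```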
Re: Is there Downside to a huge synonyms file?
I tried adding some city-to-state mappings in the synonyms file. I'm using the dismax handler for phrase matching. As I add more and more city-to-state mappings, I end up with zero results for state-based searches. E.g.:

ca,california,los angeles
ca,california,san diego
ca,california,san francisco
ca,california,burbank
...and so on

Now a city-based search returns a few other California results, but a state-based search like dui california is returning zero results. I checked the parsedquery_toString and I see no 'OR', although the default operator is 'OR' in the schema. It looks like it's trying to find matches for all those cities as they are mapped to 'california' and hence returns zero results. How to force dismax to use 'OR' and not 'AND' even though the schema has 'OR'? Or is this how dismax works? Can someone explain how to overcome this problem? Here is my custom request handler that extends dismax:

<requestHandler name="qfacet" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">name^2.0 text^0.8</str>
    <!-- until 3 all should match; 4 - 3 shld match; 5 - 4 shld match; 6 - 5 shld match; above 6 - 90% match -->
    <str name="mm">3&lt;-1 4&lt;-1 5&lt;-1 6&lt;90%</str>
    <str name="pf">text^0.8 name^2.0</str>
    <int name="qs">4</int>
    <int name="ps">4</int>
    <str name="fl">*,score</str>
  </lst>
  <lst name="invariants">
    <!-- <str name="facet.field">resourceType</str>
         <str name="facet.field">category</str>
         <str name="facet.field">stateName</str> -->
    <str name="facet.sort">false</str>
    <int name="facet.mincount">1</int>
  </lst>
</requestHandler>

Thanks. Otis Gospodnetic wrote: Hello, 300K is a pretty small index.
I have more fields in my index though; this is just a sample of data from my index. Yes, we don't have millions of documents. It could be around 300,000 and might increase in the future. The reason I'm using query-time synonyms is the nature of my data: I can't re-index the data every time I add or remove a synonym. But for this particular requirement, is it best to have index-time synonyms because of the multi-word synonym nature? Again, if I add more cities to the synonym file, I can't be re-indexing all the data over and over again. anuvenk wrote: In my index I have legal faqs, forms, legal videos etc. with a state field for each resource. Now if I search for real estate san diego, I want to be able to return other 'california' results, i.e. results from san francisco. I have the following fields in the index:

title | state | description
real estate san diego example 1 | california | some description
real estate carlsbad example 2 | california | some desc

So when I search for real estate san francisco, since there is no match, I want to be able to return the other real estate results in california instead of returning none, because sometimes they might be searching for a real estate form and the city probably doesn't matter. I have two things in mind.
One is adding a synonym mapping (which probably isn't the best way):

san diego, california
carlsbad, california
san francisco, california

hoping that a search for san francisco real estate would map san francisco to california and hence return the other two california results. OR adding the mapping of city to state in the index itself, like:

title | state | city | description
real estate san diego eg 1 | california | carlsbad, san francisco, san diego | some description
real estate carlsbad eg 2 | california | carlsbad, san francisco, san diego | some desc

Which of the above two is better? Does a huge synonym file affect performance? Or is there an even better way? I'm sure there is, but I can't put my finger on it yet. I'm not familiar with Java either. -- View this message in context: http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23844761.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23861631.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is there Downside to a huge synonyms file?
A small addition to my earlier post. I wonder if it's because of the 'mm' param, which requires that for up to 3 words in the search phrase, all the words should be matched. If I alter this now, I'd get irrelevant results for a lot of popular 1-, 2-, and 3-word search terms. How to solve this? anuvenk wrote: I tried adding some city-to-state mappings in the synonyms file. I'm using the dismax handler for phrase matching. As I add more and more city-to-state mappings, I end up with zero results for state-based searches. E.g.:

ca,california,los angeles
ca,california,san diego
ca,california,san francisco
ca,california,burbank
...and so on

Now a city-based search returns a few other California results, but a state-based search like dui california is returning zero results. I checked the parsedquery_toString and I see no 'OR', although the default operator is 'OR' in the schema. It looks like it's trying to find matches for all those cities as they are mapped to 'california' and hence returns zero results. How to force dismax to use 'OR' and not 'AND' even though the schema has 'OR'? Or is this how dismax works? Can someone explain how to overcome this problem? Here is my custom request handler that extends dismax:

<requestHandler name="qfacet" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">name^2.0 text^0.8</str>
    <!-- until 3 all should match; 4 - 3 shld match; 5 - 4 shld match; 6 - 5 shld match; above 6 - 90% match -->
    <str name="mm">3&lt;-1 4&lt;-1 5&lt;-1 6&lt;90%</str>
    <str name="pf">text^0.8 name^2.0</str>
    <int name="qs">4</int>
    <int name="ps">4</int>
    <str name="fl">*,score</str>
  </lst>
  <lst name="invariants">
    <!-- <str name="facet.field">resourceType</str>
         <str name="facet.field">category</str>
         <str name="facet.field">stateName</str> -->
    <str name="facet.sort">false</str>
    <int name="facet.mincount">1</int>
  </lst>
</requestHandler>

Thanks. Otis Gospodnetic wrote: Hello, 300K is a pretty small index.
I wouldn't worry about the number of synonyms unless you are turning a single term into dozens of ORed terms. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: anuvenk anuvenkat...@hotmail.com To: solr-user@lucene.apache.org Sent: Tuesday, June 2, 2009 11:28:43 PM Subject: Re: Is there Downside to a huge synonyms file? I'm using query-time synonyms. I have more fields in my index though; this is just a sample of data from my index. Yes, we don't have millions of documents. It could be around 300,000 and might increase in the future. The reason I'm using query-time synonyms is the nature of my data: I can't re-index the data every time I add or remove a synonym. But for this particular requirement, is it best to have index-time synonyms because of the multi-word synonym nature? Again, if I add more cities to the synonym file, I can't be re-indexing all the data over and over again. anuvenk wrote: In my index I have legal faqs, forms, legal videos etc. with a state field for each resource. Now if I search for real estate san diego, I want to be able to return other 'california' results, i.e. results from san francisco. I have the following fields in the index:

title | state | description
real estate san diego example 1 | california | some description
real estate carlsbad example 2 | california | some desc

So when I search for real estate san francisco, since there is no match, I want to be able to return the other real estate results in california instead of returning none, because sometimes they might be searching for a real estate form and the city probably doesn't matter. I have two things in mind.
One is adding a synonym mapping (which probably isn't the best way):

san diego, california
carlsbad, california
san francisco, california

hoping that a search for san francisco real estate would map san francisco to california and hence return the other two california results. OR adding the mapping of city to state in the index itself, like:

title | state | city | description
real estate san diego eg 1 | california | carlsbad, san francisco, san diego | some description
real estate carlsbad eg 2 | california | carlsbad, san francisco, san diego | some desc

Which of the above two is better? Does a huge synonym file affect performance? Or is there an even better way? I'm sure there is, but I can't put my finger on it yet. I'm not familiar with Java either. -- View this message in context:
OPI: Article on Sunspot
Sunspot: A Solr-Powered Search Engine for Ruby http://www.linux-mag.com/id/7341 glen http://zzzoot.blogspot.com/ -- -
where to find solr help/consultant
I am implementing Solr on a CentOS server. It involves handling multiple languages. Where is the best place to look for developers experienced in Solr who may be interested in a little consulting work? Mostly to give some guidance, etc. IRC is rather quiet. Thank you :)