RE: does solr handle hierarchical facets?
I handle this through the interface. I use dynamic fields (path_0, path_1, ...) to make it easier.

Florent

-----Original Message-----
From: Sean Laval [EMAIL PROTECTED]
Sent: Monday, December 10, 2007 14:54
To: solr-user@lucene.apache.org
Subject: does solr handle hierarchical facets?

e.g. category/subcategory/subsubcategory? Such that if you search for category, you get all those documents that have been tagged with the category AND any subcategories. If this is possible, I think I'll investigate using Solr in place of some existing code we have that deals with indexing and searching of such data.

Regards,
Sean
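A minimal sketch of the path_0, path_1, ... idea. The reply does not say what each field contains, so this assumes that path_N holds the cumulative path down to level N and that a matching dynamic field (for example `path_*`) is declared in schema.xml. The function name and the separator are illustrative only.

```python
def path_fields(path, sep="/", prefix="path_"):
    """Return one field per hierarchy level, each holding the cumulative path.

    Assumption: the index declares a dynamic field matching prefix + "*".
    """
    parts = [p for p in path.split(sep) if p]
    return {
        f"{prefix}{depth}": sep.join(parts[: depth + 1])
        for depth in range(len(parts))
    }


# Fields to add to a document tagged with a three-level category.
fields = path_fields("category/subcategory/subsubcategory")
```

With this layout, filtering on path_0:category would match the document at any depth below "category", and faceting on path_1 would give per-subcategory counts. The interface then only needs to know which level the user is currently browsing.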
RE: How do I search in all fields without index by solr
You have to read the example solrconfig.xml bundled with a fresh install of Solr. You'll find everything about the dismax request handler there.

-----Original Message-----
From: Laxmilal Menaria [EMAIL PROTECTED]
Sent: Friday, December 7, 2007 09:12
To: solr-user@lucene.apache.org
Subject: Re: How do I search in all fields without index by solr

I have tried that:

?q=laxmilal&qt=dismax&fl=FriendID,Title,Address,PhoneNo,Comments
?q=video&qt=dismax&qf=FriendID,Title,Address,PhoneNo,Comments

But neither returns search results. Is there any configuration needed in the config for that? My configuration is:

  <fieldType name="string_ch" class="solr.StrField">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </fieldType>

LM

On 12/7/07, SDIS M. Beauchamp [EMAIL PROTECTED] wrote:

You can also use the DisMaxRequestHandler to search across multiple fields.

-----Original Message-----
From: Laxmilal Menaria [EMAIL PROTECTED]
Sent: Friday, December 7, 2007 08:25
To: solr-user@lucene.apache.org
Subject: Re: How do I search in all fields without index by solr

OK, thanks. I have tried it and it is working. But if I use it and the XXX or YYY value is too long, I think many servers don't support long URLs, so it may give us a problem. Is there any configuration in the config file to handle this in future?

LM

On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:

You should be able to search any field:

?q=field1:XXX field2:YYY

You can register fieldTypes directly to an analyzer using:

  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </fieldType>

ryan

Laxmilal Menaria wrote:

Thanks for the fast reply. I have dumped my index in the Solr data folder and am able to search in a single field only, but I want to search in all fields. Also, how can I configure StandardAnalyzer in the Solr config XML?

LM

On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:

Solr should be able to read any Lucene index, even if it did not create it. The hitch is that you need to make sure the analyzers and fieldTypes match what is in your index; otherwise the results are unlikely to be what you expect. To get Solr to use your manually created index files, just dump them in the data/index directory.

ryan

Laxmilal Menaria wrote:

I don't want to use Solr for indexing a database. I want to use Solr for searching an existing index that I created with my sample application.

LM

On 12/7/07, Venkatraman S [EMAIL PROTECTED] wrote:

On Dec 7, 2007 10:17 AM, Laxmilal Menaria [EMAIL PROTECTED] wrote:

Hello everyone, I have created a simple Java application which indexes database tables. Now I want to configure Solr on the index I created. My index has 5 fields: FriendID, Title, Address, PhoneNo and Comments.

Why do you want to use Solr for indexing databases? RTFM!

--
Venkat
Blog @ http://blizzardzblogs.blogspot.com

--
Thanks,
Laxmilal menaria
http://www.chambal.com/
http://www.minalyzer.com/
http://www.bucketexplorer.com/
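A hedged sketch of how the dismax request from this thread could be assembled. The DisMax documentation describes qf as a whitespace-separated list of fields with optional ^boost suffixes, so the comma-separated value tried above may be one reason nothing matched. The helper name and the boost on Title are illustrative, and the field list is the one given in the thread.

```python
from urllib.parse import urlencode


def dismax_query(text, fields, boosts=None):
    """Build a URL-encoded dismax query string.

    qf is joined with spaces; a field listed in `boosts` gets a ^boost suffix.
    """
    boosts = boosts or {}
    qf = " ".join(f"{f}^{boosts[f]}" if f in boosts else f for f in fields)
    return urlencode({"q": text, "qt": "dismax", "qf": qf})


# The five fields named in the thread; the boost value is an assumption.
FIELDS = ["FriendID", "Title", "Address", "PhoneNo", "Comments"]
query_string = dismax_query("video", FIELDS, boosts={"Title": 2})
```

The resulting string can be appended to the select URL after a "?". For the long-URL concern raised in the thread, the same encoded string could be sent as a form-encoded POST body instead, provided the Solr version and servlet container in use accept POST on the select handler. That is worth verifying before relying on it.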
RE: How do I search in all fields without index by solr
You can also use the DisMaxRequestHandler to search across multiple fields.

-----Original Message-----
From: Laxmilal Menaria [EMAIL PROTECTED]
Sent: Friday, December 7, 2007 08:25
To: solr-user@lucene.apache.org
Subject: Re: How do I search in all fields without index by solr

OK, thanks. I have tried it and it is working. But if I use it and the XXX or YYY value is too long, I think many servers don't support long URLs, so it may give us a problem. Is there any configuration in the config file to handle this in future?

LM

On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:

You should be able to search any field:

?q=field1:XXX field2:YYY

You can register fieldTypes directly to an analyzer using:

  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </fieldType>

ryan

Laxmilal Menaria wrote:

Thanks for the fast reply. I have dumped my index in the Solr data folder and am able to search in a single field only, but I want to search in all fields. Also, how can I configure StandardAnalyzer in the Solr config XML?

LM

On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:

Solr should be able to read any Lucene index, even if it did not create it. The hitch is that you need to make sure the analyzers and fieldTypes match what is in your index; otherwise the results are unlikely to be what you expect. To get Solr to use your manually created index files, just dump them in the data/index directory.

ryan

Laxmilal Menaria wrote:

I don't want to use Solr for indexing a database. I want to use Solr for searching an existing index that I created with my sample application.

LM

On 12/7/07, Venkatraman S [EMAIL PROTECTED] wrote:

On Dec 7, 2007 10:17 AM, Laxmilal Menaria [EMAIL PROTECTED] wrote:

Hello everyone, I have created a simple Java application which indexes database tables. Now I want to configure Solr on the index I created. My index has 5 fields: FriendID, Title, Address, PhoneNo and Comments.

Why do you want to use Solr for indexing databases? RTFM!

--
Venkat
Blog @ http://blizzardzblogs.blogspot.com

--
Thanks,
Laxmilal menaria
http://www.chambal.com/
http://www.minalyzer.com/
http://www.bucketexplorer.com/
RE: I18N with SOLR?
You can have only one default search field. But you can use the dismax request handler to search across several fields:

http://wiki.apache.org/solr/DisMaxRequestHandler

Then you can use query field boosting to make one field more significant:

Exact_text^3 text_fr^2 text_en^2 stemmed_text^1.5

-----Original Message-----
From: Dilip.TS [EMAIL PROTECTED]
Sent: Monday, November 19, 2007 07:09
To: solr-user@lucene.apache.org
Subject: RE: I18N with SOLR?

Hello,

Also, can we have something like this, i.e. multiple defaultSearchField entries in the schema.xml, when searching for a keyword which combines more than one language?

  <defaultSearchField>text</defaultSearchField>
  <defaultSearchField>text_french</defaultSearchField>
  ...

-----Original Message-----
From: Dilip.TS [EMAIL PROTECTED]
Sent: Monday, November 19, 2007 11:29 AM
To: solr-user@lucene.apache.org
Subject: RE: I18N with SOLR?

Hello,

Does Solr support searching for a keyword which combines more than one language within the same search page?

-----Original Message-----
From: Guglielmo Celata [EMAIL PROTECTED]
Sent: Thursday, November 15, 2007 7:39 PM
To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: I18N with SOLR?

Hi Dilip,

I don't know if this helps, but I have set up a textIt field type in the config/schema.xml file in order to index Italian text. It works pretty well with non-ASCII characters (we do have some accented vowels, even if not as many as the French). It also works with stopwords (and I assume with protwords as well, though I didn't try). I created an italian-stopwords.txt file in the config/ path. I think the SnowballPorterFilterFactory is a default usable class in Solr, although I remember having read it's a bit slower than other libraries. But I am no expert.

  <fieldtype name="textIt" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ISOLatin1AccentFilterFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" words="italian-stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
    </analyzer>
  </fieldtype>

On 15/11/2007, Dilip.TS [EMAIL PROTECTED] wrote:

Hi Ed,

Thanks for the help, but I have some questions. I understand that we need to have stopwords_french.txt and protwords_french.txt files, say for French, in the solr/conf directory. Do we need to write classes like FrenchStopFilterFactory and FrenchPorterFilterFactory for each language, or are these classes built into Solr? I didn't find them in the Solr/Lucene APIs. I found some classes like org.apache.lucene.analysis.fr.FrenchAnalyzer etc. in lucene-analyzers.jar. Any idea what this class is used for?

Thanks in advance,
Regards
Dilip

-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Ed Summers
Sent: Monday, November 12, 2007 7:00 PM
To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: I18N with SOLR?

I'd say yes. Solr supports Unicode and ships with language-specific analyzers, and allows you to provide your own custom analyzers if you need them. This allows you to create different fieldType definitions for the languages you want to support. For example, here is a field type for French text which uses a French stopword list and French stemming.

  <fieldType name="text_french" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.FrenchStopFilterFactory" ignoreCase="true" words="stopwords_french.txt"/>
      <filter class="solr.FrenchPorterFilterFactory" protected="protwords_french.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

Then you can create dynamicField definitions that allow you to index and query your documents using the correct field type:

  <dynamicField name="*_french" type="text_french" indexed="true" stored="true"/>

This means that when you index you need to know what language your data is in, so that you know what field names to use in your document (e.g. title_french). And at search time you need to know
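The naming convention described in this thread (title_french and so on) can be sketched as follows. This is only an illustration: the language-code table, the *_italian suffix, the "id" field and the helper names are assumptions, and the schema must declare a matching dynamicField for every suffix used.

```python
# Hypothetical mapping from language code to field-name suffix.
LANGUAGE_SUFFIXES = {"fr": "french", "it": "italian"}


def localized_field(base, lang):
    """Return the language-specific field name, e.g. title + fr -> title_french."""
    try:
        return f"{base}_{LANGUAGE_SUFFIXES[lang]}"
    except KeyError:
        raise ValueError(f"no field type configured for language {lang!r}") from None


def localized_doc(doc_id, lang, **fields):
    """Build a document whose text fields carry the suffix for `lang`."""
    doc = {"id": doc_id}
    for base, value in fields.items():
        doc[localized_field(base, lang)] = value
    return doc


# Index time: the document language is known, so the field name is too.
doc = localized_doc("42", "fr", title="Bonjour le monde")

# Search time: one qf entry per language when the query language is unknown.
qf = " ".join(localized_field("title", lang) for lang in ("fr", "it"))
```

The qf string built at the end is the kind of value the dismax handler mentioned in the reply at the top of this thread would take, in place of multiple defaultSearchField entries.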
RE: Solr PHP client
I use the php and php serialized response writers to query Solr from PHP. They are very easy to use. But it's not so easy to update Solr from PHP (that's why my crawlers are not written in PHP).

Florent BEAUCHAMP

-----Original Message-----
From: Jonathan Ariel [EMAIL PROTECTED]
Sent: Tuesday, November 20, 2007 02:49
To: solr-user@lucene.apache.org
Subject: Solr PHP client

Hi! I'm wondering if someone is using a PHP client for Solr. Actually I'm not sure if there is one out there. Would you be interested in having a SolrJ port for PHP?

Thanks,
Jonathan Leibiusky
RE: solr - other document formats
The commit can't be false. It is either done or not. If it is not, your users won't be able to search through the uncommitted documents. If it's done, users can search through all documents successfully sent to Solr. You can use the autocommit feature (in solrconfig.xml) to avoid the explicit usage of commit: you just have to send documents to Solr.

Florent BEAUCHAMP

-----Original Message-----
From: Dwarak R [EMAIL PROTECTED]
Sent: Wednesday, November 14, 2007 13:38
To: solr-user@lucene.apache.org
Subject: Re: solr - other document formats

Many thanks Florent.

Hey All,

My docs are parsed and the indexes are updated (using the UpdateRichDocuments patch). But tell me one thing: what will happen if I don't commit? If commit is false, where are the docs stored?

Regards
Dwarak R

----- Original Message -----
From: SDIS M. Beauchamp [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, November 14, 2007 1:13 PM
Subject: RE: solr - other document formats

You should take a look at http://wiki.apache.org/solr/UpdateRichDocuments?highlight=%28richdocument%29

It gives you a starting point to make the extractor you need.

Regards
Florent

-----Original Message-----
From: Dwarak R [EMAIL PROTECTED]
Sent: Wednesday, November 14, 2007 05:17
To: solr-user@lucene.apache.org
Subject: solr - other document formats

Hey All,

I read an article at http://www.xml.com/lpt/a/1668. It states:

"As we've seen, the XML format used by Solr for indexing is quite simple. Extracting the relevant metadata to create these XML documents from the many formats floating around, however, is another story. Fortunately, Lucene users have the same problem and have been working on it for quite a while; the Lucene FAQ lists a number of references to parsers and filters which can be used to extract content and metadata from many common document formats.

Solr won't index spreadsheets or other formats out of the box, but that is not its role: you should see Solr as the search engine component of a broader search system, where extraction of content and metadata is handled by other components. This will help to keep your search system maintainable and testable, and it helps the Solr team focus on doing one thing well."

So parsing documents like PDF, MS Word and Excel into XML has to be done by another component? Somebody please advise.

Regards
Dwarak R

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise private information. If you have received it in error, please notify the sender [EMAIL PROTECTED] immediately and delete the original. Any other use of the email by you is prohibited.
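To make the add/commit distinction in this thread concrete, here is a small sketch of the two XML update messages an external extractor would send. It uses Solr's XML update format (add/doc/field and commit elements). The field names "id" and "text" are assumptions about the schema, and the HTTP POST to the update URL is deliberately left out.

```python
import xml.etree.ElementTree as ET


def add_message(doc):
    """Serialize one document as a Solr <add> update message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def commit_message():
    """Serialize the <commit/> message that makes added documents searchable."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


# Text extracted from a PDF or Word file by some other component.
add_xml = add_message({"id": "doc-1", "text": "extracted text"})
commit_xml = commit_message()
```

Documents sent with the add message are accepted but not searchable until a commit message follows, or until autocommit in solrconfig.xml triggers one, which matches the explanation given in the reply at the top of this thread.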
no segments* file found
I'm using Solr to index our file servers (480K files).

If I don't optimize, I get a "too many open files" error at about 450K files and a 3 GB index.

If I optimize, I get this stack trace during the commit of all the following updates:

<result status="1">java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/root/trunk/example/solr/data/index: files: _7xr.tis _7xt.fdt _7o1.tii _7xq.tis _7xn.nrm _7ws.fdt _7xt.prx _7xp.nrm _7ws.nrm _7xo.nrm _7ws.tis _7xs.fdt _7vc.fnm _7u6.tis _7vx.fnm _7vx.frq _7xs.nrm _7xn.tis _7xq.frq _7xs.tis _7xq.prx _7vx.fdx _7ur.tii _7ur.frq _7xq.fnm _7xr.nrm _7vc.fdt _7xt.frq _7xp.fdx _7ws.prx _7xs.frq _7xo.prx _7xq.nrm _7vx.tii _7vx.prx _7xq.tii _7xs.fnm _7xs.tii _7ws.tii _7xt.fdx _7vc.nrm _7vc.prx _7vc.tis _7xq.fdt _7ur.prx _7xn.fdx _7xp.frq _7vx.nrm _7ur.fdt _7xr.fnm _7ws.fdx _7u6.tii _7xr.tii _7vc.frq _7vx.tis _7xp.fdt _7xr.frq _7ur.tis _7xp.prx _7xr.fdx _7xt.fnm _7xn.tii _7vc.fdx _7xo.fdt _7u6.fnm _7xn.frq _7xp.tis _7o1.frq _7xn.prx _7ur.fdx _7ur.fnm _7o1.fdx _7xs.fdx _7xn.fdt _7xt.tis _7xp.fnm _7xo.fnm _7xn.fnm _7u6.prx _7xq.fdx _7xo.tii _7ws.fnm _7vc.tii _7o1.prx _7xr.fdt _7o1.fdt _7ur.nrm _7ws.frq _7u6.nrm _7o1.nrm _7vx.fdt _7xt.tii _7u6.fdx _7xo.frq _7u6.frq _7xs.prx _7xr.prx _7o1.tis _7xt.nrm _7xp.tii _7xo.tis _7u6.fdt _7xo.fdx _7o1.fnm segments.gen
  at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:516)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:243)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:616)
  at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:410)
  at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:97)
  at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:121)
  at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:189)
  at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:267)
  at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
  at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
  at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:386)
  at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:57)
</result>

If I restart Solr, I get a NullPointerException in DispatchFilter.

Tested with Solr 1.2 and 1.3; the behaviour is the same.

Regards
Florent BEAUCHAMP
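A rough diagnostic sketch for a listing like the one in this stack trace. It assumes, as a reading of the Lucene file layout of that era rather than anything stated in the post, that a commit point is recorded in a file named segments_N and that segments.gen alone does not count as one. The regular expression and function name are illustrative.

```python
import re
from collections import defaultdict

# Assumed naming of commit-point files: "segments_" plus a generation suffix.
SEGMENTS_N = re.compile(r"^segments_[0-9a-z]+$")


def summarize_index(filenames):
    """Group per-segment files by segment name and collect commit-point files."""
    per_segment = defaultdict(set)
    commit_points = []
    for name in filenames:
        if SEGMENTS_N.match(name):
            commit_points.append(name)
        elif name.startswith("_") and "." in name:
            segment, ext = name.rsplit(".", 1)
            per_segment[segment].add(ext)
    return dict(per_segment), commit_points


# A shortened sample taken from the listing in the error message.
listing = "_7xr.tis _7xr.tii _7xr.frq _7o1.fnm segments.gen".split()
segments, commits = summarize_index(listing)
```

Run against the full listing from the error message, `commits` would come back empty, since only segments.gen is present. That is consistent with the "no segments* file found" message. The per-segment grouping also shows how many separate files each segment keeps open, which is relevant to the "too many open files" symptom described without optimize.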
RE: no segments* file found
No, I'm using a custom indexer, written in C#, which submits content using POST requests. I let Lucene manage the index on its own.

Florent BEAUCHAMP

-----Original Message-----
From: Venkatraman S [EMAIL PROTECTED]
Sent: Monday, November 12, 2007 10:19
To: solr-user@lucene.apache.org
Subject: Re: no segments* file found

Are you using embedded Solr? I had stumbled on a similar error:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg06085.html

-V

On Nov 12, 2007 2:16 PM, SDIS M. Beauchamp [EMAIL PROTECTED] wrote:

I'm using Solr to index our file servers (480K files).

If I don't optimize, I get a "too many open files" error at about 450K files and a 3 GB index.

If I optimize, I get this stack trace during the commit of all the following updates:

<result status="1">java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/root/trunk/example/solr/data/index: files: _7xr.tis _7xt.fdt _7o1.tii _7xq.tis _7xn.nrm _7ws.fdt _7xt.prx _7xp.nrm _7ws.nrm _7xo.nrm _7ws.tis _7xs.fdt _7vc.fnm _7u6.tis _7vx.fnm _7vx.frq _7xs.nrm _7xn.tis _7xq.frq _7xs.tis _7xq.prx _7vx.fdx _7ur.tii _7ur.frq _7xq.fnm _7xr.nrm _7vc.fdt _7xt.frq _7xp.fdx _7ws.prx _7xs.frq _7xo.prx _7xq.nrm _7vx.tii _7vx.prx _7xq.tii _7xs.fnm _7xs.tii _7ws.tii _7xt.fdx _7vc.nrm _7vc.prx _7vc.tis _7xq.fdt _7ur.prx _7xn.fdx _7xp.frq _7vx.nrm _7ur.fdt _7xr.fnm _7ws.fdx _7u6.tii _7xr.tii _7vc.frq _7vx.tis _7xp.fdt _7xr.frq _7ur.tis _7xp.prx _7xr.fdx _7xt.fnm _7xn.tii _7vc.fdx _7xo.fdt _7u6.fnm _7xn.frq _7xp.tis _7o1.frq _7xn.prx _7ur.fdx _7ur.fnm _7o1.fdx _7xs.fdx _7xn.fdt _7xt.tis _7xp.fnm _7xo.fnm _7xn.fnm _7u6.prx _7xq.fdx _7xo.tii _7ws.fnm _7vc.tii _7o1.prx _7xr.fdt _7o1.fdt _7ur.nrm _7ws.frq _7u6.nrm _7o1.nrm _7vx.fdt _7xt.tii _7u6.fdx _7xo.frq _7u6.frq _7xs.prx _7xr.prx _7o1.tis _7xt.nrm _7xp.tii _7xo.tis _7u6.fdt _7xo.fdx _7o1.fnm segments.gen
  at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:516)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:243)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:616)
  at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:410)
  at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:97)
  at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:121)
  at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:189)
  at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:267)
  at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
  at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
  at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:386)
  at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:57)
</result>

If I restart Solr, I get a NullPointerException in DispatchFilter.

Tested with Solr 1.2 and 1.3; the behaviour is the same.

Regards
Florent BEAUCHAMP