underscores are parsed only as spaces
Hi, I don't get why and how to change this: underscores are parsed only as spaces, meaning that a search for user ejekt_festival will return zero results, while ejekt festival will return the user ejekt_festival. Thanks for your help,
Re: underscores are parsed only as spaces
On Mon, Apr 20, 2009 at 1:46 PM, sunnyfr johanna...@gmail.com wrote:
> Hi, I don't get why and how to change this: underscores are parsed only as spaces, meaning that a search for user ejekt_festival will return zero results, while ejekt festival will return the user ejekt_festival.

I think the field type does not have the same analyzers at query time and at index time. Using analysis.jsp on the Solr admin page, you can see how an input string is analyzed during query time and index time.

--
Regards,
Shalin Shekhar Mangar.
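One way to act on the advice above is to give the field a single analyzer that is used at both index time and query time, and that does not split on underscores. The fragment below is only a sketch under assumptions: the type name `text_username` is hypothetical, and it presumes the original field used a type containing `solr.WordDelimiterFilterFactory`, which treats `_` as a word delimiter. The poster's real schema was not shown.

```xml
<!-- Hypothetical field type for user names (the name is an assumption) -->
<fieldType name="text_username" class="solr.TextField" positionIncrementGap="100">
  <!-- An <analyzer> without a type attribute is applied to both indexing and querying -->
  <analyzer>
    <!-- Split on whitespace only, so "ejekt_festival" stays a single token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Make matching case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a type like this, a search for ejekt_festival should match the user, while ejekt festival no longer would, because the query is split into two tokens that were never indexed. Documents need to be reindexed after the field type changes.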
Big Problem with special characters
Hello,

first some details about my Solr installation:

schema.xml:

    <fieldType name="text_test" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LengthFilterFactory" min="2" max="50"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="german"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="german"/>
      </analyzer>
    </fieldType>

Search:

    qf=name^2.0+name2^1.5+name3
    wt=phps
    rows=30
    start=0
    sort=score+desc
    fl=*,score
    q=speed
    qt=dismax

When I have a string like (speed) in name3 or name2, Solr doesn't find it at all :-( If I search for (speed), everything is fine!

Greets,
-Ralf-
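A likely explanation, offered as an assumption rather than a confirmed diagnosis: `solr.WhitespaceTokenizerFactory` splits on whitespace only, so the parentheses stay attached and the indexed token is "(speed)", which a query for speed cannot match. If the goal is for q=speed to find such values, one option is a tokenizer that discards punctuation. The sketch below reuses the filters from the post in a single shared analyzer; the capitalized `German` follows the casing used in the Solr example schema and should be checked against the Solr version in use.

```xml
<fieldType name="text_test" class="solr.TextField" positionIncrementGap="100">
  <!-- One analyzer for both index and query time, so both sides produce the same tokens -->
  <analyzer>
    <!-- StandardTokenizer drops punctuation such as parentheses: "(speed)" becomes "speed" -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that the posted index analyzer has a `LengthFilterFactory` that the query analyzer lacks; a shared analyzer removes that kind of mismatch as well. The analysis page in the Solr admin UI shows the tokens each side produces, and a reindex is required after the change.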
RE: OutofMemory on Highlightling
Anybody facing the same issue? Following is my configuration:

    ...
    <field name="content" type="text" indexed="true" stored="false" multiValued="true"/>
    <field name="teaser" type="text" indexed="false" stored="true"/>
    <copyField source="content" dest="teaser" maxChars="100"/>
    ...

    ...
    <requestHandler name="standard" class="solr.SearchHandler" default="true">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">500</int>
        <str name="hl">true</str>
        <str name="fl">id,score</str>
        <str name="hl.fl">teaser</str>
        <str name="hl.alternateField">teaser</str>
        <int name="hl.fragsize">200</int>
        <int name="hl.maxAlternateFieldLength">200</int>
        <int name="hl.maxAnalyzedChars">500</int>
      </lst>
    </requestHandler>
    ...

Search works fine if I disable highlighting, and it brings back 500 results. But if I enable highlighting and set the number of rows to just 20, I get an OOME.

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Friday, April 17, 2009 11:32 AM
To: solr-user@lucene.apache.org
Subject: RE: OutofMemory on Highlightling

I tried hl.maxAnalyzedChars=500 but still the same issue. I get OOM for row size 20 only.

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Thursday, April 16, 2009 9:56 PM
To: solr-user@lucene.apache.org
Subject: Re: OutofMemory on Highlightling

Hi,

Have you tried:
http://wiki.apache.org/solr/HighlightingParameters#head-2ca22f63cb8d1b2ba3ff0cfc05e85b94898c59cf

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Gargate, Siddharth sgarg...@ptc.com
To: solr-user@lucene.apache.org
Sent: Thursday, April 16, 2009 6:33:46 AM
Subject: OutofMemory on Highlightling

Hi,

I am analyzing the memory usage for my Solr setup. I am testing with 500 text documents of 2 MB each. I have defined a field for displaying the teasers and am storing 1 MB of text in it. I am testing with just 128 MB maxHeap (I know I should be increasing it, but I am just testing the worst-case scenario).

If I search for all 500 documents with a row size of 500 and highlighting disabled, it works fine. But if I enable highlighting I get an OutOfMemoryError. It looks like the stored fields for all the matched results are read into memory. How can I avoid this memory consumption?

Thanks,
Siddharth
Re: Search on all fields and know in which field was the match
On Tue, Apr 14, 2009 at 9:54 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
> one option is to index each attachment as its own document *in addition* to indexing each email with all of the attachment text in a single attachments field. that way you can search for all emails where Bob is mentioned in an attachment -- but if you want to know which specific attachments mention bob you can do that search as well.

Thank you for your reply.

Another possible solution is to set those dynamic fields to stored=true. I will need the highlighting feature on those fields anyway, so I think storage isn't a problem for now.

Rui Carneiro
Re: Customizing solr with my lucene
Hi,

Here's the schema.xml I am using:

    <?xml version="1.0" encoding="UTF-8" ?>
    <schema name="myschema" version="1.1">
      <types>
        <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
        <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
      </types>
      <fields>
        <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="true"/>
        <field name="name" type="text" indexed="true" stored="true" multiValued="true"/>
        <field name="value" type="text" indexed="true" stored="true" multiValued="true"/>
        <dynamicField name="*" type="text" indexed="true" stored="true"/>
      </fields>
      <uniqueKey>id</uniqueKey>
      <defaultSearchField>value</defaultSearchField>
      <solrQueryParser defaultOperator="OR"/>
      <similarity class="org.apache.lucene.search.DefaultSimilarity"/>
    </schema>

I can't figure out the error. Do you see any problems with the current schema?

The schema is defined this way because I have implemented my own analyzer and tokenizer in the Lucene index writer, so if Solr calls the index writer there should be no problem. Similarly, I have made my changes to the Lucene query parser, so I want the default Lucene query handling. Basically, I intend to use Solr just to call my Lucene directly. Is that possible, or would I have to change the Solr code?

One more problem I faced: I want to use a dynamic field with name=* but it doesn't seem to work. When I remove the field name=name from the above list, no content error is reported (as my input file has name fields and the dynamic field doesn't seem to catch them).

Any ideas on this? Thanks.

Allahbaksh Asadullah-2 wrote:
> Just check your schema.xml
>
> Regards,
> Allahbaksh
>
> On Fri, Apr 17, 2009 at 7:56 PM, mirage1987 mirage1...@gmail.com wrote:
>> Hey Erik,
>>
>> I also checked the index using Luke, and the index shows that the terms are indexed as they should have been. So that implies that something is wrong with the querying only, and the results are not getting retrieved. (As I said earlier, even the parsed query is the way it should be according to the changes I have made to Lucene.)
>>
>> Any ideas you have on this? Why could this be happening?
>>
>> One more thing: I tried to query the Solr index using Luke, but still no results. Maybe the index is not stored correctly. Could it be changes in the Lucene API? Should I revert to an older version of Solr?
>
> --
> Allahbaksh Mohammedali Asadullah,
> Software Engineering Technology Labs,
> Infosys Technolgies Limited, Electronic City,
> Hosur Road, Bangalore 560 100, India.
> (Board: 91-80-28520261 | Extn: 73927 | Direct: 41173927.
> Fax: 91-80-28520362 | Mobile: 91-9845505322.
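Solr builds its own analysis chain from schema.xml rather than reusing whatever analyzer an application passed to a Lucene IndexWriter, so a custom analyzer normally has to be named in the schema. The fragment below is a hedged sketch, not a verified fix for this thread: `com.example.MyAnalyzer` is a hypothetical class name standing in for the poster's analyzer, which is assumed to extend Lucene's `Analyzer`, to have a no-argument constructor, and to be packaged in a jar that Solr can load. It also makes the `id` field single-valued, which is the usual expectation for a uniqueKey field.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="myschema" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <!-- Hypothetical class: replace with the fully qualified name of your own Analyzer -->
      <analyzer class="com.example.MyAnalyzer"/>
    </fieldType>
  </types>
  <fields>
    <!-- uniqueKey field kept single-valued (no multiValued attribute) -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="name" type="text" indexed="true" stored="true" multiValued="true"/>
    <field name="value" type="text" indexed="true" stored="true" multiValued="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>value</defaultSearchField>
  <solrQueryParser defaultOperator="OR"/>
</schema>
```

Because the same analyzer is then applied to queries on that field, the analysis page in the Solr admin UI can be used to confirm that indexed terms and query terms line up.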
ExtractingRequestHandler and SolrRequestHandler issue
Hi all,

I am unsuccessfully attempting to use the ExtractingRequestHandler (indexing documents via Tika, Solr Cell).

I start Solr from the example app (start.jar), but point to my own Solr conf, where I have:

    <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
      <lst name="defaults">
        <str name="ext.map.Last-Modified">last_modified</str>
        <bool name="ext.ignore.und.fl">true</bool>
      </lst>
    </requestHandler>

I am using the nightly builds (2009-04-17). I followed "Getting Started with the Solr Example" at http://wiki.apache.org/solr/ExtractingRequestHandler, but I got plenty of missing classes. So I had to copy all jars over from example/solr/lib to example/lib.

And now, when I fire up Jetty (start.jar) I am getting:

    SEVERE: java.lang.ClassCastException: org.apache.solr.handler.extraction.ExtractingRequestHandler cannot be cast to org.apache.solr.request.SolrRequestHandler

I first thought that could be an incompatibility with the solr.war under webapps. I made sure that war was the nightly war, but I still get the exception.

Does anyone know what I am doing wrong here? Or how am I supposed to make Solr Cell work?

Thanks,
Francisco
Re: OutofMemory on Highlightling
Gargate, Siddharth wrote:
> Anybody facing the same issue? Following is my configuration:
>
>     ...
>     <field name="content" type="text" indexed="true" stored="false" multiValued="true"/>
>     <field name="teaser" type="text" indexed="false" stored="true"/>
>     <copyField source="content" dest="teaser" maxChars="100"/>
>     ...
>
>     ...
>     <requestHandler name="standard" class="solr.SearchHandler" default="true">
>       <lst name="defaults">
>         <str name="echoParams">explicit</str>
>         <int name="rows">500</int>
>         <str name="hl">true</str>
>         <str name="fl">id,score</str>
>         <str name="hl.fl">teaser</str>
>         <str name="hl.alternateField">teaser</str>
>         <int name="hl.fragsize">200</int>
>         <int name="hl.maxAlternateFieldLength">200</int>
>         <int name="hl.maxAnalyzedChars">500</int>
>       </lst>
>     </requestHandler>
>     ...
>
> Search works fine if I disable highlighting, and it brings back 500 results. But if I enable highlighting and set the number of rows to just 20, I get an OOME.

How about switching documentCache off?

Koji
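For readers unsure what switching the documentCache off looks like: the cache is declared inside the `<query>` section of solrconfig.xml, and leaving the element out (or commenting it out) disables it. The fragment below is a sketch only; the attribute values are the ones I recall from the stock example configuration and are an assumption, not taken from the poster's setup.

```xml
<query>
  <!-- documentCache holds the stored fields of recently fetched documents.
       With large stored fields and a small heap, cached documents can use
       a lot of memory. Commenting the element out disables the cache. -->
  <!--
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  -->
</query>
```

Whether this is enough to avoid the OutOfMemoryError in this thread is not confirmed by the messages shown; it only removes one place where large stored fields can stay on the heap.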
Using Solr to index a database
Hello,

I've never used Solr before, but I believe that it will suit my current needs for indexing information from a database.

I downloaded and extracted Solr 1.3 to play around with it. I've been looking at the following tutorials:

http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
http://wiki.apache.org/solr/DataImportHandler

There are a few things I don't understand. For example, the IBM article sometimes refers to directories that aren't there, or that are a little different from what I have in my extracted copy of Solr (i.e. solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can, but as soon as I put the following in solrconfig.xml, the whole thing breaks:

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">rss-data-config.xml</str>
      </lst>
    </requestHandler>

Obviously I replace it with my own info. One thing I don't quite get is the data-config.xml file. What exactly is it? I've seen examples of what it contains, but since I don't know enough, I couldn't really adjust it. In any case, this is the error I get, which may be because of a misconfigured data-config.xml:
    org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context
        at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
        at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:99)
        at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:571)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
        at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
        at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831)
        at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720)
        at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490)
        at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150)
        at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
        at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
        at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
        at org.apache.catalina.core.StandardService.start(StandardService.java:448)
        at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)
    Caused by: org.xml.sax.SAXParseException: The element type "document" must be terminated by the matching end-tag "</document>".
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)

It's unclear to me what I need to be using, as in what directories/files I need to implement this. Can someone please point me in the right direction?

BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't work for me. It shows that it started in the command line, but it hangs, and doesn't actually work when I try to hit the Solr admin page (page not found type error). Jetty itself does start, but the project doesn't seem to deploy.

I apologize for the long post, and if I didn't provide as much information as I should have. Let me know if you need clarification on anything I said.

Thank you very much.
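The "Caused by" entry in the trace above points at the data-config file itself: the XML parser found a `<document>` element that is opened but never closed, so the file is not well-formed. The data-config file tells DataImportHandler where the data lives and how result columns map to Solr fields. A minimal well-formed sketch follows; the JDBC driver, URL, credentials, table, and column names are all placeholders (assumptions), not values from the post.

```xml
<dataConfig>
  <!-- Placeholder connection details: substitute your own driver, URL and credentials -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/exampledb"
              user="db_user"
              password="db_password"/>
  <!-- Every <document> needs its matching </document> end tag -->
  <document>
    <!-- One entity per query; each row returned becomes one Solr document -->
    <entity name="item" query="SELECT id, title FROM item">
      <!-- column = name in the SQL result, name = field declared in schema.xml -->
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>
```

The file goes in the same conf directory as solrconfig.xml, under whatever name the `config` parameter of the `/dataimport` handler refers to, and the JDBC driver jar has to be on Solr's classpath. Once Solr starts cleanly, an import can be triggered by requesting the handler with `command=full-import`.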
Re: ExtractingRequestHandler and SolrRequestHandler issue
Can you give the full stack trace?

On Apr 20, 2009, at 6:49 AM, francisco treacy wrote:
> Hi all,
>
> I am unsuccessfully attempting to use the ExtractingRequestHandler (indexing documents via Tika, Solr Cell).
>
> I start Solr from the example app (start.jar), but point to my own Solr conf, where I have:
>
>     <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
>       <lst name="defaults">
>         <str name="ext.map.Last-Modified">last_modified</str>
>         <bool name="ext.ignore.und.fl">true</bool>
>       </lst>
>     </requestHandler>
>
> I am using the nightly builds (2009-04-17). I followed "Getting Started with the Solr Example" at http://wiki.apache.org/solr/ExtractingRequestHandler, but I got plenty of missing classes. So I had to copy all jars over from example/solr/lib to example/lib.
>
> And now, when I fire up Jetty (start.jar) I am getting:
>
>     SEVERE: java.lang.ClassCastException: org.apache.solr.handler.extraction.ExtractingRequestHandler cannot be cast to org.apache.solr.request.SolrRequestHandler
>
> I first thought that could be an incompatibility with the solr.war under webapps. I made sure that war was the nightly war, but I still get the exception.
>
> Does anyone know what I am doing wrong here? Or how am I supposed to make Solr Cell work?
>
> Thanks,
> Francisco

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Using Solr to index a database
You have not indicated how you wish to use the index (inside Solr or not). It is possible that LuSql might be a preferable alternative to Solr/DataImportHandler, depending on your requirements.

LuSql: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

Disclaimer: I am the author of LuSql.

-glen

2009/4/20 ahammad ahmed.ham...@gmail.com:
> Hello,
>
> I've never used Solr before, but I believe that it will suit my current needs for indexing information from a database.
>
> I downloaded and extracted Solr 1.3 to play around with it. I've been looking at the following tutorials:
>
> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
> http://wiki.apache.org/solr/DataImportHandler
>
> There are a few things I don't understand. For example, the IBM article sometimes refers to directories that aren't there, or that are a little different from what I have in my extracted copy of Solr (i.e. solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can, but as soon as I put the following in solrconfig.xml, the whole thing breaks:
>
>     <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
>       <lst name="defaults">
>         <str name="config">rss-data-config.xml</str>
>       </lst>
>     </requestHandler>
>
> Obviously I replace it with my own info. One thing I don't quite get is the data-config.xml file. What exactly is it? I've seen examples of what it contains, but since I don't know enough, I couldn't really adjust it. In any case, this is the error I get, which may be because of a misconfigured data-config.xml...
Re: CollapseFilter with the latest Solr in trunk
What are the current issues holding this back? It seems to be working with some minor bug fixes.

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

From: Otis Gospodnetic otis_gospodne...@yahoo.com
Reply-To: solr-user@lucene.apache.org
Date: Sun, 19 Apr 2009 20:30:22 -0700 (PDT)
To: solr-user@lucene.apache.org
Subject: Re: CollapseFilter with the latest Solr in trunk

Once somebody really makes it work, I'm sure it will be released!

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Antonio Eggberg antonio_eggb...@yahoo.se
To: solr-user@lucene.apache.org
Sent: Sunday, April 19, 2009 9:21:20 PM
Subject: Re: CollapseFilter with the latest Solr in trunk

I wish it would be planned for 1.4 :))

--- On Sun 2009-04-19, Otis Gospodnetic wrote:
From: Otis Gospodnetic
Subject: Re: CollapseFilter with the latest Solr in trunk
To: solr-user@lucene.apache.org
Date: Sunday, 19 April 2009 15:06

Thanks for sharing! It would be good if you (or Jeff from Zappos, or anyone making changes to this) could put up a new patch for this most-voted JIRA issue.

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: climbingrose
To: solr-user@lucene.apache.org
Sent: Sunday, April 19, 2009 8:12:11 AM
Subject: Re: CollapseFilter with the latest Solr in trunk

Ok, here is how I fixed this problem:

    public DocListAndSet getDocListAndSet(Query query, List<Query> filterList, DocSet docSet,
                                          Sort lsort, int offset, int len, int flags) throws IOException {
      //DocListAndSet ret = new DocListAndSet();
      //getDocListC(ret,query,filterList,docSet,lsort,offset,len, flags |= GET_DOCSET);
      DocSet theFilt = getDocSet(filterList);
      if (docSet != null) theFilt = (theFilt != null) ? theFilt.intersection(docSet) : docSet;
      QueryCommand qc = new QueryCommand();
      qc.setQuery(query).setFilter(theFilt);
      qc.setSort(lsort).setOffset(offset).setLen(len).setFlags(flags |= GET_DOCSET);
      QueryResult result = new QueryResult();
      getDocListC(result,qc);
      return result.getDocListAndSet();
    }

There is also an off-by-one error in CollapseFilter, for which you can find the solution on Jira.

Cheers,
Cuong

On Sat, Apr 18, 2009 at 4:41 AM, Jeff Newburn wrote:

We are currently trying to do the same thing. With the patch unaltered we can use fq as long as collapsing is turned on. If we just send a normal document-level query with an fq parameter, it blows up.

Additionally, it does not appear that the collapse.facet option works at all.

--
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562

From: climbingrose
Reply-To:
Date: Fri, 17 Apr 2009 16:53:00 +1000
To: solr-user
Subject: CollapseFilter with the latest Solr in trunk

Hi all,

Has anyone tried to use CollapseFilter with the latest version of Solr in trunk? However, it looks like Solr 1.4 doesn't allow calling setFilterList() and setFilter() on one instance of the QueryCommand. I modified the code in QueryCommand to allow this:

    public QueryCommand setFilterList(Query f) {
      // if( filter != null ) {
      //   throw new IllegalArgumentException( "Either filter or filterList may be set in the QueryCommand, but not both." );
      // }
      filterList = null;
      if (f != null) {
        filterList = new ArrayList<Query>(2);
        filterList.add(f);
      }
      return this;
    }

However, I still have a problem which prevents query filters from working when used in conjunction with CollapseFilter. In other words, query filters don't seem to have any effect on the result set when CollapseFilter is used.

The other problem is related to OpenBitSet:

    java.lang.ArrayIndexOutOfBoundsException: 2183
        at org.apache.lucene.util.OpenBitSet.fastSet(OpenBitSet.java:242)
        at org.apache.solr.search.CollapseFilter.addDoc(CollapseFilter.java:202)
        at org.apache.solr.search.CollapseFilter.adjacentCollapse(CollapseFilter.java:161)
        at org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:141)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:217)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChai
Re: Using Solr to index a database
For now it's unclear, as this is sort of an experiment to see how much we can do with it. I am inclined to use the index within Solr though, simply for the very powerful querying (the stuff I've seen at least). I am not exactly sure how much of the querying capabilities I'll require though.

I'll take a look at LuSql and see if it can be used for my purposes. I want to get Solr working though, because I know that later down the road I'm going to need it for another project...

Glen Newton wrote:
> You have not indicated how you wish to use the index (inside Solr or not). It is possible that LuSql might be a preferable alternative to Solr/DataImportHandler, depending on your requirements.
>
> LuSql: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
>
> Disclaimer: I am the author of LuSql.
>
> -glen
>
> 2009/4/20 ahammad ahmed.ham...@gmail.com:
>> Hello,
>>
>> I've never used Solr before, but I believe that it will suit my current needs for indexing information from a database.
>>
>> I downloaded and extracted Solr 1.3 to play around with it. I've been looking at the following tutorials:
>>
>> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
>> http://wiki.apache.org/solr/DataImportHandler
>>
>> There are a few things I don't understand. For example, the IBM article sometimes refers to directories that aren't there, or that are a little different from what I have in my extracted copy of Solr (i.e. solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can, but as soon as I put the following in solrconfig.xml, the whole thing breaks:
>>
>>     <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
>>       <lst name="defaults">
>>         <str name="config">rss-data-config.xml</str>
>>       </lst>
>>     </requestHandler>
>>
>> Obviously I replace it with my own info. One thing I don't quite get is the data-config.xml file. What exactly is it?
Re: ExtractingRequestHandler and SolrRequestHandler issue
Hi Grant, Here is the full stacktrace: 20-Apr-2009 12:36:39 org.apache.solr.common.SolrException log SEVERE: java.lang.ClassCastException: org.apache.solr.handler.extraction.ExtractingRequestHandler cannot be cast to org.apache.solr.request.SolrRequestHandler at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:154) at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:163) at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:171) at org.apache.solr.core.SolrCore.init(SolrCore.java:535) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) at org.mortbay.jetty.Server.doStart(Server.java:210) at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.mortbay.start.Main.invokeMain(Main.java:183) at org.mortbay.start.Main.start(Main.java:497) at org.mortbay.start.Main.main(Main.java:115) Thanks Francisco 2009/4/20 Grant Ingersoll gsing...@apache.org: Can you give the full stack trace? On Apr 20, 2009, at 6:49 AM, francisco treacy wrote: Hi all, I am unsuccessfully attempting to use the ExtractingRequestHandler (indexing documents via Tika, Solr cell). I start Solr from the example app (start.jar), but point to my own Solr conf, where I have <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler"> <lst name="defaults"> <str name="ext.map.Last-Modified">last_modified</str> <bool name="ext.ignore.und.fl">true</bool> </lst> </requestHandler> Using the nightly builds (2009-04-17). I followed "Getting Started with the Solr Example" at http://wiki.apache.org/solr/ExtractingRequestHandler, but I got plenty of missing classes. So I had to copy all jars over from example/solr/lib to example/lib. And now, when I fire up Jetty (start.jar) I am getting: SEVERE: java.lang.ClassCastException: org.apache.solr.handler.extraction.ExtractingRequestHandler cannot be cast to org.apache.solr.request.SolrRequestHandler I first thought that could be an incompatibility with the solr.war under webapps. I made sure that war was the nightly war. But I still get the exception. Does anyone know what I am doing wrong here? Or how am I supposed to make Solr cell work? 
Thanks, Francisco -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
Solr webinar
(excuse the cross-post) I'm presenting a webinar on Solr. Registration is limited, so sign up soon. Looking forward to seeing some of you there! Thanks, Erik Got data? You can build your own Solr-powered Search Engine! Erik Hatcher, Lucene/Solr Committer and author, will show you how to use Solr to build an Enterprise Search engine that indexes a variety of data sources, all in a matter of minutes! Thursday, April 30, 2009 11:00AM - 12:00PM PDT / 2:00PM - 3:00PM EDT Sign up for this free webinar today at http://www2.eventsvc.com/lucidimagination/?trk=E1
Re: CollapseFilter with the latest Solr in trunk
I have not looked at this in a while, but I think the biggest thing it is missing right now is a champion -- someone to get the patches (and bug fixes) to a state where it can easily be committed. Minor bug fixes are roadblocks to getting things integrated. ryan On Apr 20, 2009, at 10:16 AM, Jeff Newburn wrote: What are the current issues holding this back? Seems to be working with some minor bug fixes. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Otis Gospodnetic otis_gospodne...@yahoo.com Reply-To: solr-user@lucene.apache.org Date: Sun, 19 Apr 2009 20:30:22 -0700 (PDT) To: solr-user@lucene.apache.org Subject: Re: CollapseFilter with the latest Solr in trunk Once somebody really makes it work, I'm sure it will be released! Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Antonio Eggberg antonio_eggb...@yahoo.se To: solr-user@lucene.apache.org Sent: Sunday, April 19, 2009 9:21:20 PM Subject: Re: CollapseFilter with the latest Solr in trunk I wish it would be planned for 1.4 :)) --- On Sun 2009-04-19, Otis Gospodnetic wrote: From: Otis Gospodnetic Subject: Re: CollapseFilter with the latest Solr in trunk To: solr-user@lucene.apache.org Date: Sunday, 19 April 2009 15.06 Thanks for sharing! It would be good if you (or Jeff from Zappos or anyone making changes to this) could put up a new patch for this most-voted-JIRA-issue. 
Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: climbingrose To: solr-user@lucene.apache.org Sent: Sunday, April 19, 2009 8:12:11 AM Subject: Re: CollapseFilter with the latest Solr in trunk Ok, here is how I fixed this problem:
    public DocListAndSet getDocListAndSet(Query query, List<Query> filterList, DocSet docSet, Sort lsort, int offset, int len, int flags) throws IOException {
      //DocListAndSet ret = new DocListAndSet();
      //getDocListC(ret,query,filterList,docSet,lsort,offset,len, flags |= GET_DOCSET);
      DocSet theFilt = getDocSet(filterList);
      if (docSet != null) theFilt = (theFilt != null) ? theFilt.intersection(docSet) : docSet;
      QueryCommand qc = new QueryCommand();
      qc.setQuery(query).setFilter(theFilt);
      qc.setSort(lsort).setOffset(offset).setLen(len).setFlags(flags |= GET_DOCSET);
      QueryResult result = new QueryResult();
      getDocListC(result,qc);
      return result.getDocListAndSet();
    }
There is also an off-by-one error in CollapseFilter; you can find the solution on Jira. Cheers, Cuong On Sat, Apr 18, 2009 at 4:41 AM, Jeff Newburn wrote: We are currently trying to do the same thing. With the patch unaltered we can use fq as long as collapsing is turned on. If we just send a normal document level query with an fq parameter it blows up. Additionally, it does not appear that the collapse.facet option works at all. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: climbingrose Reply-To: Date: Fri, 17 Apr 2009 16:53:00 +1000 To: solr-user Subject: CollapseFilter with the latest Solr in trunk Hi all, Has anyone tried to use CollapseFilter with the latest version of Solr in trunk? However, it looks like Solr 1.4 doesn't allow calling setFilterList() and setFilter() on one instance of the QueryCommand. 
I modified the code in QueryCommand to allow this:
    public QueryCommand setFilterList(Query f) {
      // if( filter != null ) {
      //   throw new IllegalArgumentException( "Either filter or filterList may be set in the QueryCommand, but not both." );
      // }
      filterList = null;
      if (f != null) {
        filterList = new ArrayList<Query>(2);
        filterList.add(f);
      }
      return this;
    }
However, I still have a problem which prevents query filters from working when used in conjunction with CollapseFilter. In other words, query filters don't seem to have any effect on the result set when CollapseFilter is used. The other problem is related to OpenBitSet:
    java.lang.ArrayIndexOutOfBoundsException: 2183
      at org.apache.lucene.util.OpenBitSet.fastSet(OpenBitSet.java:242)
      at org.apache.solr.search.CollapseFilter.addDoc(CollapseFilter.java:202)
      at org.apache.solr.search.CollapseFilter.adjacentCollapse(CollapseFilter.java:161)
      at org.apache.solr.search.CollapseFilter.<init>(CollapseFilter.java:141)
      at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:217)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
Re: Solr webinar
Hello Erik, I'm interested in attending the Webinar. I just have some questions to verify whether or not I am fit to attend... 1) How will it be carried out? What software or application would I need? 2) Do I have to have any experience or can I attend for the purpose of learning about Solr? Thanks for taking time to do this. Regards Erik Hatcher wrote: (excuse the cross-post) I'm presenting a webinar on Solr. Registration is limited, so sign up soon. Looking forward to seeing some of you there! Thanks, Erik Got data? You can build your own Solr-powered Search Engine! Erik Hatcher, Lucene/Solr Committer and author, will show you how to use Solr to build an Enterprise Search engine that indexes a variety of data sources, all in a matter of minutes! Thursday, April 30, 2009 11:00AM - 12:00PM PDT / 2:00PM - 3:00PM EDT Sign up for this free webinar today at http://www2.eventsvc.com/lucidimagination/?trk=E1 -- View this message in context: http://www.nabble.com/Solr-webinar-tp23138157p23138451.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr webinar
Thanks Erik! Looking forward to it. Matt On Mon, Apr 20, 2009 at 11:00 AM, ahammad ahmed.ham...@gmail.com wrote: Hello Erik, I'm interested in attending the Webinar. I just have some questions to verify whether or not I am fit to attend... 1) How will it be carried out? What software or application would I need? 2) Do I have to have any experience or can I attend for the purpose of learning about Solr? Thanks for taking time to do this. Regards Erik Hatcher wrote: (excuse the cross-post) I'm presenting a webinar on Solr. Registration is limited, so sign up soon. Looking forward to seeing some of you there! Thanks, Erik Got data? You can build your own Solr-powered Search Engine! Erik Hatcher, Lucene/Solr Committer and author, will show you how to use Solr to build an Enterprise Search engine that indexes a variety of data sources, all in a matter of minutes! Thursday, April 30, 2009 11:00AM - 12:00PM PDT / 2:00PM - 3:00PM EDT Sign up for this free webinar today at http://www2.eventsvc.com/lucidimagination/?trk=E1 -- View this message in context: http://www.nabble.com/Solr-webinar-tp23138157p23138451.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr webinar
I replied to this off-list, and will do so for future questions about the webinar. Please direct them to me personally rather than the list. But in short, no Solr experience is necessary, and its purpose is to educate about Solr. If you're already developing with Solr you're likely overqualified for the webinar, as it will be relatively high-level. Erik On Apr 20, 2009, at 11:00 AM, ahammad wrote: Hello Erik, I'm interested in attending the Webinar. I just have some questions to verify whether or not I am fit to attend... 1) How will it be carried out? What software or application would I need? 2) Do I have to have any experience or can I attend for the purpose of learning about Solr? Thanks for taking time to do this. Regards Erik Hatcher wrote: (excuse the cross-post) I'm presenting a webinar on Solr. Registration is limited, so sign up soon. Looking forward to seeing some of you there! Thanks, Erik Got data? You can build your own Solr-powered Search Engine! Erik Hatcher, Lucene/Solr Committer and author, will show you how to use Solr to build an Enterprise Search engine that indexes a variety of data sources, all in a matter of minutes! Thursday, April 30, 2009 11:00AM - 12:00PM PDT / 2:00PM - 3:00PM EDT Sign up for this free webinar today at http://www2.eventsvc.com/lucidimagination/?trk=E1 -- View this message in context: http://www.nabble.com/Solr-webinar-tp23138157p23138451.html Sent from the Solr - User mailing list archive at Nabble.com.
Customization of solr
Hi, I have some years of experience with Lucene and am now getting to know Solr. I see that many processes are encapsulated in the API. My question is about how far Solr can be customized. Is it possible to build my own search units in Solr that: 1- Send a seed to randomize my sort criteria 2- Implement my own suggest algorithm. I think I would need to implement my own requestHandler and searchComponent. Thanks
Re: Customization of solr
On Mon, Apr 20, 2009 at 11:46 AM, HPN 75 haroldo.s...@hotmail.com wrote: I have some years of experience with Lucene and am now getting to know Solr. I see that many processes are encapsulated in the API. My question is about how far Solr can be customized. Is it possible to build my own search units in Solr that: 1- Send a seed to randomize my sort criteria What's already in Solr may work for you out-of-the-box... see the random* field in the example schema and the comments with its associated type. 2- Implement my own suggest algorithm. I think I would need to implement my own requestHandler and searchComponent. If you want suggest only, implement your own request handler. If you want your suggest output to come back with standard search results, implement it as a search component. -Yonik
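To make the random-sort suggestion above concrete, here is a minimal sketch of building a seeded sort clause. It assumes the schema keeps the example schema's `random_*` dynamic field (backed by `solr.RandomSortField`), where the text after `random_` acts as the seed; the class and method names are hypothetical and not part of Solr.

```java
// Sketch only: assumes schema.xml still defines the example schema's
// dynamic field "random_*" of a type based on solr.RandomSortField.
// RandomSortParam and forSeed are illustrative names, not Solr APIs.
final class RandomSortParam {
    private RandomSortParam() {}

    /** Builds a sort clause such as "random_1234 desc" for the given seed. */
    static String forSeed(long seed, boolean descending) {
        if (seed < 0) {
            // A restriction of this sketch, chosen to keep field names simple.
            throw new IllegalArgumentException("seed must be non-negative");
        }
        return "random_" + seed + (descending ? " desc" : " asc");
    }

    public static void main(String[] args) {
        // The same seed should give the same ordering against an unchanged
        // index; a new seed gives a new ordering.
        System.out.println("sort=" + forSeed(1234L, true));
    }
}
```

Sending the same seed on every page request keeps paging consistent for one user, while a different seed per session reshuffles the results.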
Re: ebook resources - including lucene in action
Lest you think silence equals acceptance... This is not appropriate use of these lists. -Grant On Apr 19, 2009, at 11:58 PM, wu fuheng wrote: welcome to download http://www.ultraie.com/admin/flist.php
Re: Big Problem with special characters
Try debugQuery=true and see if the resulting query string makes sense. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Kraus, Ralf | pixelhouse GmbH r...@pixelhouse.de To: solr-user@lucene.apache.org Sent: Monday, April 20, 2009 4:34:36 AM Subject: Big Problem with special characters Hello, first some details about my SOLR installation: schema.xml words=stopwords.txt/ words=stopwords.txt/ Search: qf=name^2.0+name2^1.5+name3 wt=phps rows=30 start=0 sort=score+desc fl=*,score q=speed qt=dismax When I have a string like (speed) in name3 or name2 SOLR doesn't find it at all :-( If I search for (speed) everything is fine! Greets -Ralf-
Does solr directly call underlying lucene functions
Hi, I have made some changes to the Lucene code: I changed the index writer and the query parser and added some new classes. Would this affect the working of Solr in any way? Would I have to make any changes apart from replacing the Lucene jar in the war file? I want Solr to just use my lucene.jar without any other add-ons (I mean the analyzers and tokenizers defined in schema.xml). Thanks. -- View this message in context: http://www.nabble.com/Does-solr-directly-call-underlying-lucene-functions-tp23140094p23140094.html Sent from the Solr - User mailing list archive at Nabble.com.
Using a function in a filter query
I want to filter my result set before I search. I know the correct way to do this is by using the filter query (fq) parameter. However, I want to filter based on the output of a function performed on a field. I have a field 'rating' which is an integer in the range of 1 to ~75000. The upper limit may change. I want to filter to the top 500 items with the highest 'rating'. In SQL this would be something like: ... ORDER BY rating DESC LIMIT 500 I think I can get the documents in solr ranked by rating descending by using the function rord(rating), so basically I would like to do: fq=rord(rating):[0 TO 500] But that does not seem possible. Does anyone know what else I could do? -- Pete Smith Senior Developer No.9 | 6 Portal Way | London | W3 6RU | T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111 LOVEFiLM.com
Re: ExtractingRequestHandler and SolrRequestHandler issue
Additionally, here's what I've got in example/lib: apache-solr-cell-nightly.jar bcmail-jdk14-132.jar commons-lang-2.1.jar icu4j-3.8.jar log4j-1.2.14.jar poi-3.5-beta5.jar slf4j-api-1.5.5.jar xml-apis-1.0.b2.jar apache-solr-core-nightly.jar bcprov-jdk14-132.jar commons-logging-1.0.4.jar jetty-6.1.3.jar nekohtml-1.9.9.jar poi-ooxml-3.5-beta5.jar slf4j-jdk14-1.5.5.jar xmlbeans-2.3.0.jar apache-solr-solrj-nightly.jar commons-codec-1.3.jar dom4j-1.6.1.jar jetty-util-6.1.3.jar ooxml-schemas-1.0.jar poi-scratchpad-3.5-beta5.jar tika-0.3.jar asm-3.1.jar commons-io-1.4.jar fontbox-0.1.0-dev.jar jsp-2.1 pdfbox-0.7.3.jar servlet-api-2.5-6.1.3.jar xercesImpl-2.8.1.jar Actually I wasn't very accurate. Following the wiki didn't suffice. I had to add other jars, in order to avoid ClassNotFoundExceptions at startup. These are apache-solr-core-nightly.jar apache-solr-solrj-nightly.jar slf4j-api-1.5.5.jar slf4j-jdk14-1.5.5.jar even while using solr nightly war (in example/webapps). Perhaps something wrong with jar versions? 
Francisco 2009/4/20 francisco treacy francisco.tre...@gmail.com: Hi Grant, Here is the full stacktrace: 20-Apr-2009 12:36:39 org.apache.solr.common.SolrException log SEVERE: java.lang.ClassCastException: org.apache.solr.handler.extraction.ExtractingRequestHandler cannot be cast to org.apache.solr.request.SolrRequestHandler at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:154) at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:163) at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:171) at org.apache.solr.core.SolrCore.init(SolrCore.java:535) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594) at org.mortbay.jetty.servlet.Context.startContext(Context.java:139) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117) at 
org.mortbay.jetty.Server.doStart(Server.java:210) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40) at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.mortbay.start.Main.invokeMain(Main.java:183) at org.mortbay.start.Main.start(Main.java:497) at org.mortbay.start.Main.main(Main.java:115) Thanks Francisco 2009/4/20 Grant Ingersoll gsing...@apache.org: Can you give the full stack trace? On Apr 20, 2009, at 6:49 AM, francisco treacy wrote: Hi all, I am unsuccessfully attempting to use the ExtractingRequestHandler (indexing documents via Tika, Solr cell). I start Solr from the example app (start.jar), but point to my own Solr conf, where I have requestHandler name=/update/extract class=org.apache.solr.handler.extraction.ExtractingRequestHandler lst name=defaults str name=ext.map.Last-Modifiedlast_modified/str bool name=ext.ignore.und.fltrue/bool /lst /requestHandler Using the nightly builds (2009-04-17). I followed Getting Started with the Solr Example at http://wiki.apache.org/solr/ExtractingRequestHandler, but I got plenty of missing classes. So I had to copy all jars over from example/solr/lib to example/lib. And now, when I fire Jetty (start.jar) I am getting: SEVERE:
maxBooleanClauses implications of a high number ?
I am configuring Solr locally for our apps, and for some of them we need to raise maxBooleanClauses in the Solr configuration. Right now we have set it to 8K (as opposed to the default of 1K). Our dataset document size is about 500K. We have about 6G of RAM in total, so after allowing for the app server and the free space required for swapping, I would put the number at around 4G for the Solr JVM instance. Given these constraints I am trying to figure out how high we can set maxBooleanClauses, since the boolean queries we compose are sometimes that long (a huge list of terms to be OR-ed). * What are the space implications in terms of memory (and then possibly disk usage)? * What are the time implications in terms of performance? One solution I had thought of is to split the long boolean query into sub-queries and send multiple queries. Again, if we were to take that route, what would the time / space considerations be?
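The splitting idea at the end of the message above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name, the field name and the chunk size are hypothetical, terms are assumed to be already escaped for the query parser, and merging the results of the sub-queries (whose scores are computed separately) is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: splits a long list of OR-ed terms into several query strings,
// each with at most maxClauses terms. BooleanQueryChunker is a hypothetical
// helper, not part of Solr or Lucene.
final class BooleanQueryChunker {
    private BooleanQueryChunker() {}

    static List<String> chunk(String field, List<String> terms, int maxClauses) {
        if (maxClauses < 1) {
            throw new IllegalArgumentException("maxClauses must be at least 1");
        }
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < terms.size(); start += maxClauses) {
            int end = Math.min(start + maxClauses, terms.size());
            // Each sub-query stays within the clause limit on its own.
            queries.add(field + ":(" + String.join(" OR ", terms.subList(start, end)) + ")");
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("a", "b", "c", "d", "e");
        for (String q : chunk("id", terms, 2)) {
            System.out.println(q);
        }
    }
}
```

The trade-off is more round trips and client-side merging in exchange for smaller individual queries; whether that is cheaper than one large query is exactly the open question of the post.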
case insensitive sentence matches in text field
If I have a field that is the default type text (from the sample schema) with the lowercase filter and so forth, is it possible to also do sentence matches in a case insensitive way? I can see the word roots are indexed in lowercase, but when I then try to match on the entire sentence, it will only work if the case is the same. Do I need to create two fields? One for text and the other for case insensitive sentence matching? -- Regards, Ian Connor
Re: Using a function in a filter query
On Mon, Apr 20, 2009 at 12:40 PM, Pete Smith pete.sm...@lovefilm.com wrote: fq=rord(rating):[0 TO 500] Solr 1.4 can now do range queries on arbitrary functions: http://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html Note that ord() and rord() won't work properly in Solr 1.4 trunk. Lucene has changed to searching per-segment in a MultiReader and hence you will currently get the ord() or rord() in that segment, not in the whole index. -Yonik http://www.lucidimagination.com
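For reference, the function range parser mentioned above is addressed through local params of the form `{!frange l=... u=...}function`. The sketch below only assembles and URL-encodes such an `fq` value; the helper names are hypothetical, and it reuses the `rord(rating)` example from the question even though, per the caveat above, ord() and rord() may be computed per segment on 1.4 trunk.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds an fq value for Solr's function range query parser.
// FunctionRangeFilter is an illustrative helper, not a Solr class.
final class FunctionRangeFilter {
    private FunctionRangeFilter() {}

    /** Returns e.g. "{!frange l=0 u=500}rord(rating)". */
    static String frange(String function, int lower, int upper) {
        return "{!frange l=" + lower + " u=" + upper + "}" + function;
    }

    /** URL-encodes the filter so it can be appended to a request URL. */
    static String asUrlParam(String fq) {
        try {
            return "fq=" + URLEncoder.encode(fq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(asUrlParam(frange("rord(rating)", 0, 500)));
    }
}
```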
Re: case insensitive sentence matches in text field
On Mon, Apr 20, 2009 at 1:22 PM, Ian Connor ian.con...@gmail.com wrote: If I have a field that is the default type text (from the sample schema) with the lowercase filter and so forth, is it possible to also do sentence matches in a case insensitive way? This should already work... can you add debugQuery=true to the request and show us the resulting debug output? -Yonik http://www.lucidimagination.com
Re: case insensitive sentence matches in text field
Hi, Thanks for the tip - it is in fact working. It is just that the word PubMed trips it up. It splits it into pub and med, but if you leave it lowercase, it removes the 'ed' and leaves the root pubm. That is tricky and not what I expected - I will need to be more careful with these filters - thanks. On Mon, Apr 20, 2009 at 1:30 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Apr 20, 2009 at 1:22 PM, Ian Connor ian.con...@gmail.com wrote: If I have a field that is the default type text (from the sample schema) with the lowercase filter and so forth, is it possible to also do sentence matches in a case insensitive way? This should already work... can you add debugQuery=true to the request and show us the resulting debug output? -Yonik http://www.lucidimagination.com -- Regards, Ian Connor 1 Leighton St #723 Cambridge, MA 02141 Call Center Phone: +1 (714) 239 3875 (24 hrs) Fax: +1(770) 818 5697 Skype: ian.connor
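The PubMed behaviour described above is what one would expect if the field's analyzer splits tokens on a lower-to-upper case change before lowercasing and stemming. The sketch below imitates only that splitting step in plain Java; it is not Solr's filter code, and it assumes the sample "text" type is configured to split on case changes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: imitates splitting a token where a lowercase letter is
// followed by an uppercase one. CaseChangeSplitter is a hypothetical
// stand-in, not the actual Solr word delimiter filter.
final class CaseChangeSplitter {
    private CaseChangeSplitter() {}

    static List<String> split(String word) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < word.length(); i++) {
            boolean boundary = Character.isLowerCase(word.charAt(i - 1))
                    && Character.isUpperCase(word.charAt(i));
            if (boundary) {
                parts.add(word.substring(start, i));
                start = i;
            }
        }
        if (start < word.length()) {
            parts.add(word.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("PubMed")); // mixed case: two parts
        System.out.println(split("pubmed")); // all lowercase: one part
    }
}
```

With mixed case the token becomes two parts, while the all-lowercase form stays whole and is then handed to the stemmer, which is why the two spellings end up as different indexed terms.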
Re: CollapseFilter with the latest Solr in trunk
We would love to help debug the issues but we have limited knowledge of the source code. I have looked through the patch information but I do not understand how the component is supposed to interact with the others or where it should sit. An example of this is our struggle with collapse.facet. It does not appear to do anything. We have walked through the component with no success in trying to find where it alters the facets or tells the system to change the counts. We have also moved the collapse component from last to first just to see what it is doing. Pointing us in the right direction on how this works would help us understand more closely what the components are doing to each other.
    <requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <float name="tie">0.01</float>
        <str name="qf"> productId^10.0 personality^15.0 subCategory^20.0 category^10.0 productType^8.0 brandName^10.0 realBrandName^9.5 productNameSearch^20 size^1.2 width^1.0 heelHeight^1.0 productDescription^5.0 color^6.0 price^1.0 expandedGender^0.5 </str>
        <str name="pf"> brandName^5.0 productNameSearch^5.0 productDescription^5.0 personality^10.0 subCategory^20.0 category^10.0 productType^8.0 </str>
        <str name="fl"> productId, productName, price, originalPrice, brandNameFacet, productRating, imageUrl, productUrl, isNew, onSale </str>
        <str name="bf">rord(popularity)^1</str>
        <str name="mm">100%</str>
        <int name="ps">1</int>
        <int name="qs">5</int>
        <str name="q.alt">*:*</str>
        <!-- More like this search parameters -->
        <str name="mlt.fl">brandNameFacet,productTypeFacet,productName,categoryFacet,subCategoryFacet,personalityFacet,colorFacet,heelHeight,expandedGender</str>
        <int name="mlt.mindf">1</int>
        <int name="mlt.mintf">1</int>
      </lst>
      <arr name="last-components">
        <str>collapse</str>
        <str>spellcheck</str>
      </arr>
    </requestHandler>
-- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Ryan McKinley ryan...@gmail.com Reply-To: solr-user@lucene.apache.org Date: Mon, 20 Apr 2009 10:48:19 -0400 To: solr-user@lucene.apache.org Subject: 
Re: CollapseFilter with the latest Solr in trunk I have not looked at this in a while, but I think the biggest thing it is missing right now is a champion -- someone to get the patches (and bug fixes) to a state where it can easily be committed. Minor bug fixes are roadblocks to getting things integrated. ryan On Apr 20, 2009, at 10:16 AM, Jeff Newburn wrote: What are the current issues holding this back? Seems to be working with some minor bug fixes. -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Otis Gospodnetic otis_gospodne...@yahoo.com Reply-To: solr-user@lucene.apache.org Date: Sun, 19 Apr 2009 20:30:22 -0700 (PDT) To: solr-user@lucene.apache.org Subject: Re: CollapseFilter with the latest Solr in trunk Once somebody really makes it work, I'm sure it will be released! Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Antonio Eggberg antonio_eggb...@yahoo.se To: solr-user@lucene.apache.org Sent: Sunday, April 19, 2009 9:21:20 PM Subject: Re: CollapseFilter with the latest Solr in trunk I wish it would be planned for 1.4 :)) --- On Sun 2009-04-19, Otis Gospodnetic wrote: From: Otis Gospodnetic Subject: Re: CollapseFilter with the latest Solr in trunk To: solr-user@lucene.apache.org Date: Sunday, 19 April 2009 15.06 Thanks for sharing! It would be good if you (or Jeff from Zappos or anyone making changes to this) could put up a new patch for this most-voted-JIRA-issue. 
Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: climbingrose To: solr-user@lucene.apache.org Sent: Sunday, April 19, 2009 8:12:11 AM Subject: Re: CollapseFilter with the latest Solr in trunk Ok, here is how I fixed this problem:
    public DocListAndSet getDocListAndSet(Query query, List<Query> filterList, DocSet docSet, Sort lsort, int offset, int len, int flags) throws IOException {
      //DocListAndSet ret = new DocListAndSet();
      //getDocListC(ret,query,filterList,docSet,lsort,offset,len, flags |= GET_DOCSET);
      DocSet theFilt = getDocSet(filterList);
      if (docSet != null) theFilt = (theFilt != null) ? theFilt.intersection(docSet) : docSet;
      QueryCommand qc = new QueryCommand();
      qc.setQuery(query).setFilter(theFilt);
Re: Solr Search Error
: HTTP Status 500 - 13724 java.lang.ArrayIndexOutOfBoundsException: : 13724 at org.apache.lucene.search.TermScorer.score(TermScorer.java:74) An ArrayIndexOutOfBoundsException from TermScorer is a pretty serious error -- and probably indicates an internal problem of some kind, not a config issue or user error. If I remember correctly, the only times I've ever seen errors like this were either: 1) corrupt index ... either the machine or the disk or Solr had a hard crash (OOM in the case of Solr) ... using the CheckIndex tool would probably point out the problem. 2) hardware glitch ... bad RAM or bad disk 3) bug in Lucene ... which is possible if you're using trunk; there have been a lot of little performance tweaks recently. If you can reproduce this type of problem regularly, sharing your configs and info about how you build your index can probably help people track down the problem. -Hoss
Re: ebook resources - including lucene in action
It is not legal to share purchased e-books in this manner. Please purchase copies of the books you read, otherwise authors have very little incentive to dedicate months (14 months in the case of Lucene in Action, first edition) of their lives to writing this content. Erik On Apr 20, 2009, at 1:58 AM, Saurabh Bhutyani wrote: Check out this site: www.downloadsearchengine.com It allows to search and download pdf ebooks, ppts, doc, mp3, torrents, rapidshare links etc. Original message From: wu fuheng wufuh...@gmail.com Date: 20 Apr 09 09:28:56 Subject: ebook resources including lucene in action To: nutchu...@lucene.apache.org welcome to download
query on part number not matching
I have a manufacturer part number: CISCO7204VXR-CH. The indexer produces: 12 3 4 cisco7204vxrch vxrch cisco7204vxrch If I query on CISCO7204VXR-CH, I get: 12 3 4 cisco7204vxrch Everything matches. But if I query on CISCO7204VXRCH, I get 12 3 cisco7204vxrch This does not match on term 3. So, the match fails in this case and returns no results. It seems like it is demanding that every term in the index matches, which doesn't make a whole lot of sense. Should just be the query, right?
Re: Seattle / PNW Hadoop + Lucene User Group?
Thanks for the responses, everyone. Where shall we host? My company can offer space in our building in Factoria, but it's not exactly a 'cool' or 'fun' place. I can also reserve a room at a local library. I can bring some beer and light refreshments. On Mon, Apr 20, 2009 at 7:22 AM, Matthew Hall mh...@informatics.jax.org wrote: Same here, sadly there isn't much call for Lucene user groups in Maine. It would be nice though ^^ Matt Amin Mohammed-Coleman wrote: I would love to come but I'm afraid I'm stuck in rainy old England :( Amin On 18 Apr 2009, at 01:08, Bradford Stephens bradfordsteph...@gmail.com wrote: OK, we've got 3 people... that's enough for a party? :) Surely there must be dozens more of you guys out there... c'mon, accelerate your knowledge! Join us in Seattle! On Thu, Apr 16, 2009 at 3:27 PM, Bradford Stephens bradfordsteph...@gmail.com wrote: Greetings, Would anybody be willing to join a PNW Hadoop and/or Lucene User Group with me in the Seattle area? I can donate some facilities, etc. -- I also always have topics to speak about :) Cheers, Bradford - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: query on part number not matching
Looks like the format didn't come through in the email. ch, vxrch, and cisco7204vxrch are all in position 4. But, your suggestion of turning off catenateAll may work out. I'll have to do some testing to make sure that it doesn't have any unintended consequences. Specifically, I am worried about a case like XYZ123-3 and the customer searching on XYZ1233. Ideally, that would produce a match. From: Yonik Seeley yo...@lucidimagination.com To: solr-user@lucene.apache.org Sent: Monday, April 20, 2009 5:14:32 PM Subject: Re: query on part number not matching On Mon, Apr 20, 2009 at 6:59 PM, Kevin Osborn osbo...@yahoo.com wrote: I have a manufacturer part number: CISCO7204VXR-CH. The indexer produces: 12 3 4 cisco7204vxrch vxrch cisco7204vxrch It looks like you're using catenateAll, which doesn't do any good if the query analyzer splits on alpha-numeric transitions. Turn that off to save yourself some space. If I query on CISCO7204VXR-CH, I get: 12 3 4 cisco7204vxrch Everything matches. But if I query on CISCO7204VXRCH, I get 12 3 cisco7204vxrch This does not match on term 3. But it does... The index has vxr, vxrch, and cisco7204vxrch all at position 3. So, the match fails in this case and returns no results. It seems like it is demanding that every term in the index matches, which doesn't make a whole lot of sense. Should just be the query, right? Right. Lucene doesn't really have phrase queries with optional terms in it though. -Yonik
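To see why the hyphenated and unhyphenated part numbers line up differently, it can help to imitate the splitting by hand. The sketch below is plain Java written for this purpose, not Solr's word delimiter filter: it splits on non-alphanumeric characters and on letter/digit transitions, lowercases the parts, and joins them in the way a catenate-all option would. All names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a hand-rolled imitation of splitting on delimiters and on
// letter/digit transitions. PartNumberSplitter is not a Solr class.
final class PartNumberSplitter {
    private PartNumberSplitter() {}

    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                flush(current, parts); // delimiter such as '-' ends a part
                continue;
            }
            if (current.length() > 0) {
                char prev = current.charAt(current.length() - 1);
                if (Character.isLetter(prev) != Character.isLetter(c)) {
                    flush(current, parts); // letter/digit transition ends a part
                }
            }
            current.append(Character.toLowerCase(c));
        }
        flush(current, parts);
        return parts;
    }

    /** Joins all parts into one token, like a catenate-all option would. */
    static String catenateAll(String input) {
        return String.join("", split(input));
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("CISCO7204VXR-CH"));
        System.out.println(split("CISCO7204VXRCH"));
        System.out.println(catenateAll("CISCO7204VXR-CH"));
    }
}
```

Under these rules the hyphenated form yields four parts and the unhyphenated form three, so the two inputs produce phrase queries of different lengths even though their fully catenated forms are identical.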
Re: query on part number not matching
On Mon, Apr 20, 2009 at 8:50 PM, Kevin Osborn osbo...@yahoo.com wrote: Looks like the format didn't come through in the email. ch, vxrch, and cisco7204vxrch are all in position 4.
Ah... the traditional way to handle that case is to use a little slop with the phrase query.
-Yonik
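As a rough illustration of the slop suggestion: the Lucene query parser syntax allows a quoted phrase to carry a slop value, written as "..."~N. The Python sketch below only builds such a query string and a request URL. The field name part_number, the base URL, and both helper functions are assumptions made for the example, not details from the thread, and whether a slop of 1 is enough depends on how the field is analyzed.

```python
from urllib.parse import urlencode

def sloppy_phrase(field, text, slop):
    """Build a query-parser clause of the form field:"text"~slop."""
    # Escape backslashes first, then double quotes, so the phrase stays intact
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"~{slop}'

def select_url(base, query, rows=10):
    """Build a select URL; `base` is a hypothetical Solr location."""
    return base + "/select?" + urlencode({"q": query, "rows": rows})

# Hypothetical field name and host, for illustration only
clause = sloppy_phrase("part_number", "CISCO7204VXRCH", 1)
print(clause)
print(select_url("http://localhost:8983/solr", clause))
```

Quoting the part number means the analyzed sub-tokens are treated as one phrase, and the slop lets them sit a position apart from where an exact phrase would require them.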
Re: Using Solr to index a database
On Mon, Apr 20, 2009 at 7:15 PM, ahammad ahmed.ham...@gmail.com wrote: Hello, I've never used Solr before, but I believe that it will suit my current needs with indexing information from a database. I downloaded and extracted Solr 1.3 to play around with it. I've been looking at the following tutorials: http://www.ibm.com/developerworks/java/library/j-solr-update/index.html http://wiki.apache.org/solr/DataImportHandler There are a few things I don't understand. For example, the IBM article sometimes refers to directories that aren't there, or are a little different from what I have in my extracted copy of Solr (i.e. solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can, but as soon as I put the following in solrconfig.xml, the whole thing breaks: <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">rss-data-config.xml</str> </lst> </requestHandler> Obviously I replace with my own info... One thing I don't quite get is the data-config.xml file. What exactly is it? I've seen examples of what it contains but since I don't know enough, I couldn't really adjust it. In any case, this is the error I get, which may be because of a misconfigured data-config.xml...
The data-config.xml describes how to fetch data from various data sources and index them into Solr. The stacktrace says that your XML is invalid. The best bet is to take one of the sample dataconfig xml files and make changes.
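Since the error quoted in this message comes down to XML that does not parse, one way to catch it before restarting Solr is a quick well-formedness check. The Python sketch below uses the standard library parser. The data-config.xml fragment in it is hypothetical and deliberately minimal (it is not the poster's file, and it omits the dataSource a real configuration would need), and check_well_formed is a name made up for the example.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

# Hypothetical fragment with a missing </document> end tag
broken = """<dataConfig>
  <document>
    <entity name="item" query="select * from item"/>
</dataConfig>"""

# Same fragment with the end tag restored
fixed = broken.replace("</dataConfig>", "  </document>\n</dataConfig>")

print(check_well_formed(broken))
print(check_well_formed(fixed))
```

This only tests that the file is well-formed XML. It says nothing about whether the entities, queries, or data source settings are right for DataImportHandler.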
http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165) at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:99) at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388) at org.apache.solr.core.SolrCore.init(SolrCore.java:571) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311) at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022) at org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:448) at org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at org.apache.catalina.startup.Catalina.start(Catalina.java:552) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Caused by: org.xml.sax.SAXParseException: The element type "document" must be terminated by the matching end-tag "</document>". at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source) at