Re: Replication snapshot, tar says file changed as we read it
Sorry to re-open an old thread, but this just happened to me again, even with a 30-second sleep between taking the snapshot and starting to tar it up. Then, even more strangely, the snapshot was removed again before tar completed.

    Archiving snapshot.20110320113401 into /var/www/mesh/backups/weekly.snapshot.20110320113401.tar.bz2
    tar: snapshot.20110320113401/_neqv.fdt: file changed as we read it
    tar: snapshot.20110320113401/_neqv.prx: File removed before we read it
    tar: snapshot.20110320113401/_neqv.fnm: File removed before we read it
    tar: snapshot.20110320113401: Cannot stat: No such file or directory
    tar: Exiting with failure status due to previous errors

Has anybody seen this before, or been able to replicate it themselves? (No pun intended.) Or, is anyone else using replication snapshots for backup? Have I misunderstood them? I thought the point of a snapshot was that once taken it was immutable.

If it's important, this is on a machine configured as a replication master, but with no slaves attached to it (it's basically a failover and backup machine):

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">startup</str>
        <str name="replicateAfter">commit</str>
        <str name="confFiles">admin-extra.html,elevate.xml,protwords.txt,schema.xml,scripts.conf,solrconfig_slave.xml:solrconfig.xml,stopwords.txt,synonyms.txt</str>
        <str name="commitReserveDuration">00:00:10</str>
      </lst>
    </requestHandler>

Thanks, Andrew.

On 16 January 2011 12:55, Andrew Clegg andrew.cl...@gmail.com wrote:

> PS one other point I didn't mention is that this server has a very fast autocommit limit (2 seconds max time). But I don't know if this is relevant -- I thought the files in the snapshot wouldn't be committed to again. Please correct me if this is a huge misunderstanding.

On 16 January 2011 12:30, Andrew Clegg andrew.cl...@gmail.com wrote:

> (Many apologies if this appears twice, I tried to send it via Nabble first but it seems to have got stuck, and is fairly urgent/serious.)
Hi,

I'm trying to use the replication handler to take snapshots, then archive them and ship them off-site. Just now I got a message from tar that worried me:

    tar: snapshot.20110115035710/_70b.tis: file changed as we read it
    tar: snapshot.20110115035710: file changed as we read it

The relevant bit of script that does it looks like this (error checking removed):

    curl 'http://localhost:8983/solr/core1/replication?command=backup'

    PREFIX=''
    if [[ $START_TIME =~ 'Sun' ]]
    then
        PREFIX='weekly.'
    fi

    cd $SOLR_DATA_DIR
    for snapshot in `ls -d -1 snapshot.*`
    do
        TARGET=${LOCAL_BACKUP_DIR}/${PREFIX}${snapshot}.tar.bz2
        echo Archiving ${snapshot} into $TARGET
        tar jcf $TARGET $snapshot
        echo Deleting ${snapshot}
        rm -rf $snapshot
    done

I was under the impression that files in the snapshot were guaranteed never to change, right? Otherwise what's the point of the replication backup command? I tried putting in a 30-second sleep after the snapshot and before the tar, but the error occurred again anyway.

There was a message from Lance N. with a similar error in it, years ago:

http://www.mail-archive.com/solr-user@lucene.apache.org/msg06104.html

but that would be pre-replication anyway, right?

This is on Ubuntu 10.10 using Java 1.6.0_22 and Solr 1.4.0.

Thanks, Andrew.

--
:: http://biotext.org.uk/ :: http://twitter.com/andrew_clegg/ ::
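(Not a fix for the underlying question, but a defensive workaround: refuse to tar a snapshot until two consecutive scans of its contents look identical. A rough sketch of that stabilization check, in Python rather than shell for clarity; the interval and attempt counts are arbitrary:)

```python
import os
import time

def snapshot_fingerprint(path):
    """Collect (name, size, mtime) for every file under the snapshot dir."""
    entries = []
    for root, _, files in os.walk(path):
        for name in sorted(files):
            full = os.path.join(root, name)
            st = os.stat(full)
            entries.append((full, st.st_size, st.st_mtime))
    return entries

def wait_until_stable(path, interval=5, attempts=12):
    """Return True once two consecutive scans see identical contents."""
    previous = snapshot_fingerprint(path)
    for _ in range(attempts):
        time.sleep(interval)
        current = snapshot_fingerprint(path)
        if current == previous:
            return True
        previous = current
    return False
```

(The shell script would call this, or an equivalent loop, between the backup command and the tar, and skip the snapshot if it never stabilizes.)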
Re: which German stemmer to use?
In our ActiveMath project, we have had positive feedback in Lucene with SnowBallAnalyzer(Version.LUCENE_29, "German"), which is probably one of the two Snowball variants below.

Note that you may want to be careful to use one field with exact matching (e.g. whitespace analyzer and lowercase filter) and one field with stemmed matches. That's two fields in the index and a query-expansion mechanism such as dismax with qf=text-de^2.0 text-de.stemmed^1.2 (add a phonetic field too, if you like).

One of the biggest issues that our testers raised is that compound words should be split. I believe this issue is also very present in technology texts. Thus far only the compound-words analyzer can do such a split, and the compounds need to be input manually. Maybe that's doable?

paul

On 24 March 2011 at 00:14, Christopher Bottaro wrote:

> The wiki lists 5 available, but doesn't do a good job of explaining or recommending one:
>
> - GermanStemFilterFactory
> - SnowballPorterFilterFactory (German)
> - SnowballPorterFilterFactory (German2)
> - GermanLightStemFilterFactory
> - GermanMinimalStemFilterFactory
>
> Which is the best one to use in general? Which is the best to use when the content being indexed is German technology articles? Thanks for the help.
Re: Problem with field collapsing of patched Solr 1.4
Afroz Ahmad wrote:

> Have you enabled the collapse component in solrconfig.xml?
>
>     <searchComponent name="query" class="org.apache.solr.handler.component.CollapseComponent"/>

No, it seems that I missed that completely. Thank you, Afroz. It works fine now.

Kai

--
View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-field-collapsing-of-patched-Solr-1-4-tp2678850p2724321.html
Sent from the Solr - User mailing list archive at Nabble.com.
boosting with standard search handler
Hi, is it possible to boost fields like the bf parameter of dismax in the standard request handler? With or without functions? Thanks

--
Gastone Penzo
www.solr-italia.it
The first Italian blog about Apache Solr
Re: boosting with standard search handler
Hi Gastone,

I used to do that in the standard search handler using the following parameters:

    q={!boost b=query($qq,0.7)}text:something title:other
    qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8

which enables custom recency-based boosting.

My 2 cents,
Tommaso

2011/3/24 Gastone Penzo gastone.pe...@gmail.com:

> Hi, is it possible to boost fields like the bf parameter of dismax in the standard request handler? With or without functions? Thanks
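(To make the effect concrete, here is a toy model, plain Python with illustrative numbers, not Lucene's actual normalized scores, of how that multiplicative boost behaves. The 0.7 in query($qq,0.7) is the default value used when a document matches neither date range:)

```python
def recency_multiplier(age_days):
    """Mimic qq = date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8.

    Each range clause contributes its boost when it matches; documents
    inside the 15-day window match both clauses.  The 0.7 default applies
    when neither range matches.  Numbers are illustrative only.
    """
    score = 0.0
    if age_days <= 60:
        score += 5.0
    if age_days <= 15:
        score += 8.0
    return score if score > 0 else 0.7

def boosted_score(base_score, age_days):
    # {!boost} multiplies the main query score by the boost query score
    return base_score * recency_multiplier(age_days)

# fresher documents end up ranked above older ones with the same base score
assert boosted_score(1.0, 10) > boosted_score(1.0, 30) > boosted_score(1.0, 90)
```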
Re: boosting with standard search handler
Thank you Tommaso, your solution works. I read there's another method, using the _val_ parameter.

Thanks,
Gastone

2011/3/24 Tommaso Teofili tommaso.teof...@gmail.com:

> I used to do that in the standard search handler using the following parameters:
>
>     q={!boost b=query($qq,0.7)}text:something title:other
>     qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8
>
> which enables custom recency-based boosting.

--
Gastone Penzo
Re: Why boost query not working?
--- On Thu, 3/24/11, cyang2010 ysxsu...@hotmail.com wrote:

> This solr query failed:
>
> 1. get every title regardless of what the title_name is
> 2. within the result, boost the ones with genre id = 56 (bq=genres:56^100)
>
> http://localhost:8983/solr/titles/select?indent=on&version=2.2&start=0&rows=10&fl=*%2Cscore&wt=standard&defType=dismax&qf=title_name_en_US&q=*%3A*&bq=genres%3A56^100&debugQuery=on
>
> But from the debug output I can tell it treats the boost query parameter as part of the query string:
>
>     <lst name="debug">
>       <str name="rawquerystring">*:*</str>
>       <str name="querystring">*:*</str>
>       <str name="parsedquery">+() () genres:56^100.0</str>
>       <str name="parsedquery_toString">+() () genres:56^100.0</str>
>       <lst name="explain"/>
>       <str name="QParser">DisMaxQParser</str>
>       <null name="altquerystring"/>
>       <arr name="boost_queries">
>         <str>genres:56^100</str>
>       </arr>
>     </lst>

With dismax, you cannot use semicolon or field queries. Instead of q=*:* you can try q.alt=*:* (do not use the q parameter at all).
invert terms in search with exact match
Hi, is it possible with the standard query search (not dismax) to have exact matches that allow any term order?

For example, if I search for "my love" I would like Solr to give me docs with:

- my love
- love my

That's easy: q=title:(my AND love). The problem is it also returns docs with "my love is my dog". I don't want this. I want only docs whose title is formed by exactly these two terms, my and love. Is it possible? Thanks

--
Gastone Penzo
www.solr-italia.it
The first Italian blog about Apache Solr
Re: invert terms in search with exact match
--- On Thu, 3/24/11, Gastone Penzo gastone.pe...@gmail.com wrote:

> From: Gastone Penzo gastone.pe...@gmail.com
> Subject: invert terms in search with exact match
> To: solr-user@lucene.apache.org
> Date: Thursday, March 24, 2011, 3:58 PM
>
> is it possible with standard query search (not dismax) to have exact matches that allow any terms order? [...] i want only docs with title formed by these 2 terms: my and love.

PhraseQuery has an interesting property: if you don't use a slop value (meaning zero), it is an ordered phrase query. However, starting from 1, it is unordered. "my love"~1 will somewhat satisfy you. If you really want "my love" to be unordered you can try SOLR-1604.
Re: how to run boost query for non-dismax query parser
> I need to code some boosting logic for when some field equals some value. I was able to get it to work using the dismax query parser. However, since the query will need to handle prefix or fuzzy queries, the dismax query parser is not really my choice. Therefore, I want to use the standard query parser, but still have dismax's boost-query logic.
>
> For example, this query should return all the titles regardless of value, but boost the score of those with genres=5237:
>
> http://localhost:8983/solr/titles/select?indent=on&start=0&rows=10&fl=*%2Cscore&wt=standard&explainOther=&hl.fl=&qt=standard&q={!boost%20b=genres:5237^2.2}*%3A*&debugQuery=on
>
> Here is the exception I get:
>
>     HTTP ERROR: 400
>     org.apache.lucene.queryParser.ParseException: Expected ',' at position 6 in 'genres:5237^2.2'

BoostQParserPlugin takes a FunctionQuery; in your case it is a Lucene/Solr query. If you want to boost by a Solr/Lucene query, you can add that clause as an optional clause. That's all:

    q=+*:* genres:5237^2.2&q.op=OR

will do the trick. Just make sure that you are using OR as the default operator.
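(The reason the optional-clause trick works can be seen with a toy additive scoring model, illustrative numbers only, not Lucene's actual similarity: every document satisfies the required *:* clause, and the optional genres clause simply adds its boost when it matches:)

```python
def score(doc_genres, base=1.0, boost_genre="5237", boost=2.2):
    """Model q = +*:* genres:5237^2.2 with q.op=OR.

    The required match-all clause gives every document the base score;
    the optional clause adds its boost only for matching documents,
    so non-matching documents are still returned, just ranked lower.
    """
    return base + (boost if boost_genre in doc_genres else 0.0)

# a matching document outranks a non-matching one, but both are returned
assert score(["5237", "12"]) > score(["12"]) > 0
```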
Re: invert terms in search with exact match
Hi Gastone,

I think you should use a proximity search as described in the Lucene query syntax page [1]. So searching for "my love"~2 should work for your use case.

Cheers,
Tommaso

[1] http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches

2011/3/24 Gastone Penzo gastone.pe...@gmail.com:

> is it possible with standard query search (not dismax) to have exact matches that allow any terms order? [...] i want only docs with title formed by these 2 terms: my and love. is it possible?? thanx
Question about http://wiki.apache.org/solr/Deduplication
Hi,

The use case I am trying to figure out is about preserving IDs without re-indexing on duplicates, instead adding each new ID to a list of document ID aliases.

Example input collection:

    id:1, text:"dummy text 1", signature:A
    id:2, text:"dummy text 1", signature:A

I add the first document to an empty index; the text is indexed, the ID is 1, so far so good. Now the question: if I add the second document with id == 2, instead of deleting/re-indexing this new document, I would like to store id == 2 in a multivalued field id. At the end, I would have one less document indexed and both IDs would be searchable (and stored as well).

Is it possible in Solr to have a multivalued id? Or do I need to make my own mv_ID for this? Any ideas how to achieve this efficiently? My target is not to add new documents if the signature matches, but to have both IDs indexed and stored.

Thanks,
eks
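(For comparison, the aliasing described above, one indexed document per signature, accumulating every incoming ID in a multivalued field, can be sketched like this. This is plain Python, not Solr's SignatureUpdateProcessor, and the field names are just illustrative:)

```python
def add_document(index, doc):
    """index maps signature -> stored document with a multivalued 'ids' field.

    On a signature collision the new ID is appended instead of the document
    being deleted and re-indexed, so both IDs stay searchable and stored.
    """
    sig = doc["signature"]
    existing = index.get(sig)
    if existing is None:
        index[sig] = {"ids": [doc["id"]], "text": doc["text"], "signature": sig}
    elif doc["id"] not in existing["ids"]:
        existing["ids"].append(doc["id"])
    return index

index = {}
add_document(index, {"id": "1", "text": "dummy text 1", "signature": "A"})
add_document(index, {"id": "2", "text": "dummy text 1", "signature": "A"})
# one document indexed, both IDs preserved under signature "A"
```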
Detecting an empty index during start-up
Hi,

In our Solr deployment we have a cluster of replicated Solr cores, with the small change that we have dynamic master look-up using ZooKeeper. The problem I am trying to solve is to make sure that when a new Solr core joins the cluster it isn't made available to any search services until it has been filled with data.

I am not familiar with Solr internals, so the approach I wanted to take was to basically check the numDocs property of the index during start-up and set a READABLE state in the ZooKeeper node if it's greater than 0. I also planned to create a commit hook for replication and updating which controlled the READABLE property based on numDocs as well.

This just leaves the problem of finding out the number of documents during start-up. I planned to have something like:

    int numDocs = 0;
    RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
    try {
        numDocs = searcher.get().getIndexReader().numDocs();
    } finally {
        searcher.decref();
    }

but getSearcher's documentation specifically says not to use it from the inform method. I missed this at first, and of course I got a deadlock (although only when I had more than one core on the same Solr instance).

Is there a simpler way to do what I want? Or will I just need to have a thread which waits until the Searcher is available before setting the state?

Thanks,
David
Re: invert terms in search with exact match
Hi Tommaso,

Thank you for the answer, but the problem with your solution is that Solr also returns docs with other words, for example "my love is the world". I want to exclude the other words: it must give me only docs with "my love" or "love my", and that's it.

Thank you

2011/3/24 Tommaso Teofili tommaso.teof...@gmail.com:

> I think you should use a proximity search as described in the Lucene query syntax page [1]. So searching for "my love"~2 should work for your use case.
>
> [1] http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches

--
Gastone Penzo
www.solr-italia.it
The first Italian blog about Apache Solr
Re: invert terms in search with exact match
Yes, create a qt with dismax and qf on a field that has query stopwords for the words you want to ignore.

Bill Bell
Sent from mobile

On Mar 24, 2011, at 7:58 AM, Gastone Penzo gastone.pe...@gmail.com wrote:

> is it possible with standard query search (not dismax) to have exact matches that allow any terms order? [...] i want only docs with title formed by these 2 terms: my and love. is it possible?? thanx
Re: invert terms in search with exact match
No, because I don't know the words I want to ignore, and I don't want to use dismax; I have to use the standard handler.

The problem is very simple: I want to receive only documents that have in the title field ONLY the words I search, in any order. If I search "my love darling", I want Solr to return these possible titles:

    title1: my love darling
    title2: my darling love
    title3: darling my love
    title4: love my darling
    ...

i.e. all the combinations of these 3 words. Other words have to be ignored. Thanks

2011/3/24 Bill Bell billnb...@gmail.com:

> Yes, create a qt with dismax and qf on a field that has query stopwords for the words you want to ignore.

--
Gastone Penzo
www.solr-italia.it
The first Italian blog about Apache Solr
Re: dismax parser, parens, what do they do exactly
Thanks Hoss, this is very helpful. Okay, so dismax is not intended to do anything with parens for semantics; they're just like any other char, handled by the analyzers. I think you're right that I cut and pasted the wrong query before. Just for the record, on 1.4.1:

    qf=text
    pf=
    q=book (dog +(cat -frog))

    <str name="parsedquery">
      +((DisjunctionMaxQuery((text:book)~0.01) DisjunctionMaxQuery((text:dog)~0.01)
         DisjunctionMaxQuery((text:cat)~0.01) -DisjunctionMaxQuery((text:frog)~0.01))~3) ()
    </str>
    <str name="parsedquery_toString">
      +(((text:book)~0.01 (text:dog)~0.01 (text:cat)~0.01 -(text:frog)~0.01)~3) ()
    </str>
Re: invert terms in search with exact match
You can use query slop, as others have said, to find documents with "my" and "love" right next to each other, in any order. And I think query slop can probably work for three or more words too. But it won't find documents with ONLY those words in them. For instance, "my love"~2 will still match:

    love my something else
    something my love else
    other love my

etc.

Solr isn't so good at doing exact matches in general, although there are some techniques to set up your index and queries to do actual exact (entire-field) matches -- mostly putting fake tokens like $BEGIN and $END at the beginning and end of your indexed values, and then doing a phrase search which puts those tokens at the beginning and end too. But I'm not sure if you can extend that technique to find exactly the words in _any_ order, instead of just the exact phrase. Maybe somehow using phrase slop? It gets confusing to think about; I'm not sure.

On 3/24/2011 10:52 AM, Gastone Penzo wrote:

> No, because I don't know the words I want to ignore, and I don't want to use dismax; I have to use the standard handler. The problem is very simple: I want to receive only documents that have in the title field ONLY the words I search, in any order. If I search "my love darling", I want Solr to return all the combinations of these 3 words; other words have to be ignored.
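(To illustrate the boundary-token technique mentioned above: if the index analyzer wraps every value in sentinel tokens, an exact whole-field match reduces to a phrase match that includes both sentinels. A toy model in Python; the $BEGIN/$END names and whitespace tokenization are just for illustration, not Solr analyzer code:)

```python
def analyze(text):
    """Lowercase, split on whitespace, and add boundary sentinels."""
    return ["$BEGIN"] + text.lower().split() + ["$END"]

def exact_field_match(query, field_value):
    """True only when the query phrase, sentinels included, covers the
    entire field -- extra words break the match at a sentinel."""
    return analyze(query) == analyze(field_value)

assert exact_field_match("my love", "My Love")
assert not exact_field_match("my love", "my love is my dog")
# phrase order still matters here -- the any-order case remains the
# open question from the thread
assert not exact_field_match("love my", "my love")
```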
Multiple Cores with Solr Cell for indexing documents
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores, with Solr Cell working on each core. The only items being indexed are PDF, DOC, and TXT files (with the possibility of expanding this list, but for now, just assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core. In fact, I just ran the default installation in example/ and worked from that. However, trying to migrate to multi-core has been a never-ending list of problems. Any time I try to add a document to the index (using the same curl command as I did to add to the single core, of course adding the core name to the request URL -- host/solr/corename/update/extract...), I get HTTP 500 errors due to classes not being found and/or lazy-loading errors. I've copied the exact example/lib directory into the cores, and that doesn't work either.

Frankly the only libraries I want are those relevant to indexing files. The less bloat, the better, after all. However, I cannot figure out where to put which files, and why the example installation works perfectly with a single core but not with multiple cores.
Here is an example of the errors I'm receiving:

    $ curl "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F myfile=@test2.txt

    HTTP ERROR: 500

    org/apache/tika/exception/TikaException
    java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
        at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
        at org.mortbay.jetty.Server.handle(Server.java:285)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
    Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        ... 27 more

    RequestURI=/solr/core0/update/extract

Any assistance you could provide, or installation guides/tutorials/etc. that you could link me to, would be greatly appreciated. Thank you all for your time!

~Brandon Waterloo
Re: Multiple Cores with Solr Cell for indexing documents
Sounds like the Tika jar is not on the classpath. Add it to a directory where Solr is looking for libs.

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:

> I've been trying for several hours now to set up Solr with multiple cores, with Solr Cell working on each core. [...] Any time I try to add a document to the index I get HTTP 500 errors due to classes not being found and/or lazy-loading errors. I've copied the exact example/lib directory into the cores, and that doesn't work either. [...]
>
>     java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
>     ...
>     Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: invert terms in search with exact match
On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote:

> title1: my love darling
> title2: my darling love
> title3: darling my love
> title4: love my darling

Sorry, but simply search for:

    title:(my OR love OR darling)

If your default operator is OR you don't need to put OR in the query.

Best regards,

Dario Rigolin
Comperio srl (Italy)
Re: invert terms in search with exact match
Yes, sorry, I made a mistake:

    title:(my AND love AND darling)

All three words have to match. The problem is still that I don't want results with other words.

2011/3/24 Dario Rigolin dario.rigo...@comperio.it:

> Sorry, but simply search for: title:(my OR love OR darling). If your default operator is OR you don't need to put OR in the query.

--
Gastone Penzo
www.solr-italia.it
The first Italian blog about Apache Solr
Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.
Hello Solrists,

As it says in the subject line, I'm looking for a Java component that, given an ISO 639-1 code or some equivalent, would return a Lucene Analyzer ready to gobble documents in the corresponding language. Solr looks like it has to contain one, only I've not been able to locate it so far; can you point out the spot?

I've found org.apache.solr.analysis, and things like org.apache.lucene.analysis.bg etc. in lucene/modules, with many classes which I'm sure are related; however, the factory itself still eludes me. I mean the Java class/method that decides, on request, what to do with all these packages to bring the requisite object into existence once the language is specified. Where should I look? Or was I mistaken, and Solr has nothing of the kind, at least in Java?

Thanks in advance for your help. Best regards,

François Jurain.
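(Whatever Solr itself provides, the dispatch being asked for is easy to build on top of the per-language packages: a registry keyed by ISO 639-1 code with a fallback default. A sketch of the shape in Python; the analyzer class here is a stand-in, not a real Lucene class:)

```python
class SimpleAnalyzer:
    """Stand-in for a per-language Lucene Analyzer."""
    def __init__(self, language):
        self.language = language

    def tokenize(self, text):
        return text.lower().split()

# Registry keyed by ISO 639-1 code; unknown codes fall back to a default.
_ANALYZERS = {
    "de": lambda: SimpleAnalyzer("German"),
    "fr": lambda: SimpleAnalyzer("French"),
    "bg": lambda: SimpleAnalyzer("Bulgarian"),
}

def analyzer_for(iso_code):
    """Return a fresh analyzer for the given language code."""
    factory = _ANALYZERS.get(iso_code, lambda: SimpleAnalyzer("default"))
    return factory()
```

(In Java the same shape would be a Map from language code to an Analyzer supplier, populated with the concrete classes from the per-language analysis packages.)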
Solr throwing exception when evicting from filterCache
I have a recent build of Solr (4.0.0.2011.02.25.13.06.24). I am seeing this error when making a request (with fq's), right at the point where the eviction count goes from 0 up:

    SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry

If you then make another request, Solr responds with the expected result. Is this a bug? Has anyone seen this before? Any tips/help/feedback/questions would be much appreciated!

Thanks,
Matt
Re: Detecting an empty index during start-up
: I am not familiar with Solr internals, so the approach I wanted to take was
: to basically check the numDocs property of the index during start-up and set
: a READABLE state in the ZooKeeper node if it's greater than 0. I also
: planned to create a commit hook for replication and updating which
: controlled the READABLE property based on numDocs also.
:
: This just leaves the problem of finding out the number of documents during
: start-up. I planned to have something like:

Most of the ZK stuff you mentioned is over my head, but I get the general gist of what you want:

* a hook on startup that checks numDocs
* if not empty, trigger some logic

My suggestion would be to implement this as a firstSearcher SolrEventListener. When that runs, you'll have easy access to a SolrIndexSearcher (and you won't even have to refcount it), and you can fire whatever logic you want based on what you find when looking at it.

-Hoss
Re: how to run boost query for non-dismax query parser
Hi iorixxx, Thanks for your reply. Yeah, an additional query with the boost value will work. However, I just wonder where you got the information that BoostQParserPlugin only handles function queries? I looked up the javadoc and still can't see that. This is the javadoc: Create a boosted query from the input value. The main value is the query to be boosted. Other parameters: b, the function query to use as the boost. This just says that if a b value is specified, it is a function query. I just don't understand why dismaxParser has both bf and bq, but for BoostQParserPlugin there is only a bf equivalent. Another question: by specifying localParams like that in the query, does it mean Solr uses the default LuceneQParserPlugin primarily and only uses BoostQParserPlugin for the content within the {}? Thanks, looking forward to your reply, cy -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2726422.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr throwing exception when evicting from filterCache
Here's the full stack trace:

[Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
	at org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
	at org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
	at org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
	at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131)
	at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:613)
	at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:652)
	at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1233)
	at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1086)
	at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
	at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:431)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.h

On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell goodie...@gmail.com wrote: I have a recent build of Solr (4.0.0.2011.02.25.13.06.24). I am seeing this error when making a request (with fq's), right at the point where the eviction count goes from 0 up: SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry If you then make another request, Solr responds with the expected result. Is this a bug? Has anyone seen this before? Any tips/help/feedback/questions would be much appreciated! Thanks, Matt
Re: Solr throwing exception when evicting from filterCache
On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell goodie...@gmail.com wrote: I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this error when making a request (with fq's), right at the point where the eviction count goes from 0 up: Yep, this was a bug that has since been fixed. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
Re: how to run boost query for non-dismax query parser
Thanks for your reply. Yeah, an additional query with the boost value will work. However, I just wonder where you got the information that BoostQParserPlugin only handles function queries? I looked up the javadoc and still can't see that. This is the javadoc: Create a boosted query from the input value. The main value is the query to be boosted. Other parameters: b, the function query to use as the boost. This just says that if a b value is specified, it is a function query. As you and the wiki said, b is the function query to use as the boost. I just don't understand why dismaxParser has both bf and bq, but for BoostQParserPlugin there is only a bf equivalent. I don't know that :) However, optional clauses with LuceneQParserPlugin will have the same effect as dismax's bq. Another question: by specifying localParams like that in the query, does it mean Solr uses the default LuceneQParserPlugin primarily and only uses BoostQParserPlugin for the content within the {}? Not only for BoostQParserPlugin. http://wiki.apache.org/solr/LocalParams http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams
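As a sketch of the two approaches discussed above (field names and boost values here are made up for illustration), the boost parser wraps the whole main query via local params, while an optional boosting clause in the standard lucene parser approximates dismax's bq:

```text
q={!boost b=log(popularity)}title:solr       function-query boost, like dismax's bf
q=title:solr category:featured^2             optional boosting clause, like dismax's bq
```

In the second form, documents matching category:featured are not required, but score higher when they do match, which is exactly the bq effect.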
Fuzzy query using dismax query parser
Hi, I wonder how to conduct a fuzzy query using the dismax query parser? I am able to do a prefix query with local params and the prefix query parser. But how to handle a fuzzy query? I like the behavior of dismax except that it does not support prefix and fuzzy queries. Thanks. cy -- View this message in context: http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727075.html
Re: invert terms in search with exact match
Then you need to write some custom code for that. The Lucene in Action book (second edition, section 6.3.4) has an example of translating a PhraseQuery to a SpanNearQuery. Just use false for the third parameter (inOrder) in SpanNearQuery's ctor. You can plug in https://issues.apache.org/jira/browse/SOLR-1604 too. Yes, sorry, I made a mistake: title:(my AND love AND darling) -- all three words have to match. The problem is that I don't want results containing any other words. 2011/3/24 Dario Rigolin dario.rigo...@comperio.it On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote: title1: my love darling title2: my darling love title3: darling my love title4: love my darling Sorry, but simply search for: title:(my OR love OR darling) If your default operator is OR you don't need to put OR in the query. Best regards. Dario Rigolin Comperio srl (Italy) -- Gastone Penzo *www.solr-italia.it* *The first italian blog about Apache Solr*
Newbie wants to index XML content.
Hello, I've been reading up on how to index XML content but have a few questions. How is data in element attributes handled or defined? How are nested elements handled? In the following XML structure, I want to index the content between the entry tags. In one XML document, there can be up to 100 entry tags, so the entry tag would be equivalent to the doc tag... Can I somehow index this XML as-is, or will I have to parse it, creating the doc tag and placing all the elements on the same level? Thanks for your help.

<?xml version="1.0" encoding="utf-8"?>
<root>
  <source>manual</source>
  <author>
    <name>MC Anon User</name>
    <email>mca...@mcdomain.com</email>
  </author>
  <entry>
    <name>
      <fullname>John Smith</fullname>
    </name>
    <email>jsmit...@gmail.com</email>
  </entry>
  <entry>
    <name>
      <fullname>First Last</fullname>
      <firstname>First</firstname>
      <lastname>Last</lastname>
    </name>
    <organization>
      <name>MC S.A.</name>
      <tittle>CIO</tittle>
    </organization>
    <email type="work" primary="true">fi...@mcdomain.com</email>
    <email>flas...@yahoo.com</email>
    <phoneNumber type="work" primary="true">+5629460600</phoneNumber>
    <im carrier="gtalk" primary="true">fi...@mcdomain.com</im>
    <im carrier="skype">First.Last</im>
    <postalAddress>111 Bude St, Toronto</postalAddress>
    <custom name="blog">http://blog.mcdomain.com/</custom>
  </entry>
</root>

regards Marcelo
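For reference, Solr's update handler expects its own add/doc format rather than arbitrary XML, so each entry would indeed have to be flattened into something like the sketch below. Field names here are purely illustrative and assume a matching schema; mapping attributes (such as type="work") to fields requires a naming convention of your own:

```xml
<add>
  <doc>
    <field name="id">1</field>
    <field name="fullname">John Smith</field>
    <field name="email">jsmit...@gmail.com</field>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="fullname">First Last</field>
    <field name="organization">MC S.A.</field>
    <!-- attribute data becomes its own field, e.g. email with type="work" -->
    <field name="email_work">fi...@mcdomain.com</field>
  </doc>
</add>
```

The flattening itself is usually done with XSLT or a small script before posting to /update; DataImportHandler's XPathEntityProcessor is another common route.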
RE: Multiple Cores with Solr Cell for indexing documents
Well, there lies the problem--it's not JUST the Tika jar. If it's not one thing, it's another, and I'm not even sure which directory Solr actually looks in. In my solr.xml file I have it use a shared library folder for every core. Since each core will be holding very homogeneous data, there's no need for different library modules per core. The relevant line in my solr.xml file is <solr persistent="true" sharedLib="lib">. That file is housed in .../example/solr/. So, does it look in .../example/lib or .../example/solr/lib? ~Brandon Waterloo From: Markus Jelsma [markus.jel...@openindex.io] Sent: Thursday, March 24, 2011 11:29 AM To: solr-user@lucene.apache.org Cc: Brandon Waterloo Subject: Re: Multiple Cores with Solr Cell for indexing documents Sounds like the Tika jar is not on the class path. Add it to a directory where Solr's looking for libs. On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote: Hello everyone, I've been trying for several hours now to set up Solr with multiple cores with Solr Cell working on each core. The only items being indexed are PDF, DOC, and TXT files (with the possibility of expanding this list, but for now, just assume the only things in the index should be documents). I never had any problems with Solr Cell when I was using a single core. In fact, I just ran the default installation in example/ and worked from that. However, trying to migrate to multi-core has been a never ending list of problems. Any time I try to add a document to the index (using the same curl command as I did to add to the single core, of course adding the core name to the request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to classes not being found and/or lazy loading errors. I've copied the exact example/lib directory into the cores, and that doesn't work either. Frankly the only libraries I want are those relevant to indexing files. The less bloat, the better, after all.
However, I cannot figure out where to put what files, and why the example installation works perfectly for single-core but not with multi-core. Here is an example of the errors I'm receiving:

command prompt> curl host/solr/core0/update/extract?literal.id=2-3-1&commit=true -F myfile=@test2.txt

HTTP ERROR: 500
org/apache/tika/exception/TikaException
java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
	at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
	at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at
Re: Fuzzy query using dismax query parser
I wonder how to conduct a fuzzy query using the dismax query parser? I am able to do a prefix query with local params and the prefix query parser. But how to handle a fuzzy query? I like the behavior of dismax except that it does not support prefix and fuzzy queries. You may be interested in https://issues.apache.org/jira/browse/SOLR-1553
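For reference, SOLR-1553 introduces the extended dismax (edismax) parser, which keeps dismax's scoring behavior but also accepts full Lucene query syntax, so fuzzy and prefix terms can be issued directly. A sketch of such a request (field names are illustrative):

```text
q=roam~0.8&defType=edismax&qf=title description
```

Here roam~0.8 is a standard Lucene fuzzy term that plain dismax would have escaped away, but edismax passes it through to the configured qf fields.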
Re: how to run boost query for non-dismax query parser
iorixxx, thanks for your reply. Another slightly off-topic question: I looked over all the subclasses of QParserPlugin. It seems like most of them provide parsing complementary to the default lucene/solr parser, except the prefix parser. What is the intended usage of that one? The default lucene/solr parser is already able to parse prefix queries. Is the intended usage with the dismax parser? -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2727566.html
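For illustration, one common use of the prefix parser is building a prefix query from a raw value via local params, without worrying about query-syntax escaping or analysis of the input; this also makes it usable from parsers like dismax that have no prefix syntax of their own (field and parameter names here are made up):

```text
fq={!prefix f=category}elec
q={!prefix f=title v=$userInput}&userInput=sol
```

In the second line the value comes from a separate request parameter via dereferencing, so user input never has to be escaped into Lucene syntax.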
Re: Fuzzy query using dismax query parser
OK, I will have to wait till the Solr 3 release then. -- View this message in context: http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727572.html
Re: Multiple Cores with Solr Cell for indexing documents
I believe it's example/solr/lib where it looks for shared libs in multicore. But each core can have its own lib dir, usually in core/lib. This is referenced in solrconfig.xml; see the example config for the lib directive. Well, there lies the problem--it's not JUST the Tika jar. If it's not one thing, it's another, and I'm not even sure which directory Solr actually looks in. In my solr.xml file I have it use a shared library folder for every core. Since each core will be holding very homogeneous data, there's no need for different library modules per core. The relevant line in my solr.xml file is <solr persistent="true" sharedLib="lib">. That file is housed in .../example/solr/. So, does it look in .../example/lib or .../example/solr/lib? ~Brandon Waterloo From: Markus Jelsma [markus.jel...@openindex.io] Sent: Thursday, March 24, 2011 11:29 AM To: solr-user@lucene.apache.org Cc: Brandon Waterloo Subject: Re: Multiple Cores with Solr Cell for indexing documents Sounds like the Tika jar is not on the class path. Add it to a directory where Solr's looking for libs. On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote: Hello everyone, I've been trying for several hours now to set up Solr with multiple cores with Solr Cell working on each core. The only items being indexed are PDF, DOC, and TXT files (with the possibility of expanding this list, but for now, just assume the only things in the index should be documents). I never had any problems with Solr Cell when I was using a single core. In fact, I just ran the default installation in example/ and worked from that. However, trying to migrate to multi-core has been a never ending list of problems. Any time I try to add a document to the index (using the same curl command as I did to add to the single core, of course adding the core name to the request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to classes not being found and/or lazy loading errors.
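For reference, besides the sharedLib attribute in solr.xml, each core's solrconfig.xml can pull in jars explicitly with lib directives, with relative paths resolved against that core's instanceDir. A sketch along the lines of the stock example config (the exact paths depend on your layout):

```xml
<!-- solrconfig.xml: load the extraction (Solr Cell) jars for this core.
     Paths are examples; adjust them to where dist/ and contrib/ live. -->
<lib dir="../../contrib/extraction/lib"/>
<lib dir="../../dist/" regex="apache-solr-cell-.*\.jar"/>
<lib dir="./lib"/>
```

Declaring the jars per core this way sidesteps the ambiguity about where sharedLib is resolved from.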
I've copied the exact example/lib directory into the cores, and that doesn't work either. Frankly the only libraries I want are those relevant to indexing files. The less bloat, the better, after all. However, I cannot figure out where to put what files, and why the example installation works perfectly for single-core but not with multi-core. Here is an example of the errors I'm receiving:

command prompt> curl host/solr/core0/update/extract?literal.id=2-3-1&commit=true -F myfile=@test2.txt

HTTP ERROR: 500
org/apache/tika/exception/TikaException
java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
	at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
	at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at
Re: solr on the cloud
Hi, I have tried running the sharded Solr with ZooKeeper on a single machine. The Solr code is from the current trunk. It runs nicely. Can you please point me to a page where I can check the status of the SolrCloud development and available features, apart from http://wiki.apache.org/solr/SolrCloud ? I'm afraid that's the most comprehensive documentation so far. Basically, of high interest is checking out the Map-Reduce for distributed faceting; is it even possible with the trunk? Hm, MR for distributed faceting? Maybe I missed this... can you point to a place that mentions this? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/
Re: [ANNOUNCEMENT] solr-packager 1.0.2 released!
Hi Simone, This is handy! Any chance you'll be adding a version with Jetty 7.*? Thanks, Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Simone Tripodi simonetrip...@apache.org To: solr-user@lucene.apache.org Sent: Sat, March 19, 2011 8:13:36 PM Subject: [ANNOUNCEMENT] solr-packager 1.0.2 released! Hi all, The Sourcesense Solr Packager team is pleased to announce the solr-packager-site-1.0.2 release! Solr Packager is a Maven archetype to package standalone Apache Solr embedded in Tomcat, brought to you by Sourcesense. Changes in this version include: Fixed Bugs: o Custom context root. Issue: 4. o Slave classifier doesn't get installed in M2 local repo. Issue: 5. More information at http://sourcesense.github.com/solr-packager/ Have fun! - Simone Tripodi, on behalf of Sourcesense http://people.apache.org/~simonetripodi/ http://www.99soft.org/
stopwords not working in multicore setup
Hello, I'm running a Solr server with 5 cores. Three are for English content and two are for German content. The default stopwords setup works fine for the English cores, but the German stopwords aren't working. The German stopwords file is stopwords-de.txt and resides in the same directory as stopwords.txt. The German cores use a different schema (named schema.page.de.xml) which has the following text field definition: http://pastie.org/1711866 The stopwords-de.txt file looks like this: http://pastie.org/1711869 The query I'm doing is this: q = title:für And it's returning documents with für in the title. Title is a text field which should use the stopwords-de.txt, as seen in the aforementioned pastie. Any ideas? Thanks for the help.
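For comparison, a working German text field needs StopFilterFactory pointing at the German list in both the index-time and query-time analyzer chains. A rough sketch (the type name and tokenizer choice are illustrative and the actual pastie config may differ; note also that stopwords-de.txt must be saved as UTF-8 for entries like für to match):

```xml
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
  </analyzer>
</fieldType>
```

If the stop filter appears only on one side (index but not query, or vice versa), queries like title:für will still match documents indexed before the change; reindexing after any analyzer change is also required.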