Re: SolrCloud - Example C not working
Hmm, nobody has an idea? Is Example C working fine for everybody else? salu2

On Mon, 2011-02-14 at 14:08 +0100, Thorsten Scherler wrote:

Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine until I tried "Example C". I start all 4 servers, but all of them keep looping through:

java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info
INFO: Opening socket connection to server localhost/127.0.0.1:9983
Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn
WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)

(The same INFO/WARNING pair and ConnectException stack trace then repeat for localhost/0:0:0:0:0:0:0:1:9900, localhost/0:0:0:0:0:0:0:1:9983, localhost/0:0:0:0:0:0:0:1:8574 and localhost/127.0.0.1:8574.)

The problem seems to be that the ZooKeeper clients cannot connect to the different nodes, and so nothing comes up at all. I am using revision 1070473 for the tests. Anybody have an idea? salu2

--
Thorsten Scherler thorsten.at.apache.org
codeBusters S.L. - web based systems consulting, training and solutions
http://www.codebusters.es/
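"Connection refused" from the ZooKeeper client usually just means that nothing is listening on those ports yet. Before digging into Solr itself, a plain TCP probe of the ports from the log can confirm whether the embedded ZooKeeper instances ever came up. The sketch below is a generic check, not Solr or ZooKeeper API; the port list is taken from the log output above and should be adjusted to your setup:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    /** Returns true if something accepts a TCP connection on host:port within the timeout. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // nothing listening, connection refused, or timed out
        }
    }

    public static void main(String[] args) {
        // Ports seen in the Example C log output; adjust to your own setup.
        int[] zkPorts = {9983, 9900, 8574};
        for (int port : zkPorts) {
            System.out.println("localhost:" + port + " reachable: "
                    + canConnect("localhost", port, 500));
        }
    }
}
```

If every port reports unreachable, the ZooKeeper servers themselves never started, and the Solr-side reconnect loop is just a symptom.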
Re: Guidance for event-driven indexing
Solr is multi-threaded, so you are free to send as many parallel update requests as needed to utilize your HW. Each request will get its own thread. Simply configure StreamingUpdateSolrServer from your client. If there is some lengthy work to be done, it needs to be done in SOME thread, and I guess you just have to choose where :) A JMSUpdateHandler sounds heavyweight, but does not need to be, and might be the logically best place for such a feature imo.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 17.42, Rich Cariens wrote:

Thanks Jan, I don't think I want to tie up a thread on two boxes waiting for an UpdateRequestProcessor to finish. I'd prefer to offload it all to the target shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm really just looking for a simple API that allows me to add a SolrInputDocument to the index in-process. Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages? I'm worried that this will break all the nice stuff one gets with the standard Solr webapp (stats, admin, etc). Best, Rich

On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl jan@cominvent.com wrote:

Hi, One option would be to keep the JMS listener as today but move the downloading and transforming part to a SolrUpdateRequestProcessor on each shard. The benefit is that you ship only a tiny little SolrInputDocument over the wire, with a reference to the doc to download, and do the heavy lifting on the Solr side. If each JMS topic/channel corresponds to a particular shard, you could move the whole thing to Solr. If so, a new JMSUpdateHandler could perhaps be a way to go?

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 16.53, Rich Cariens wrote:

Hello, I've built a system that receives JMS events containing links to docs that I must download and index. Right now the JMS receiving, downloading, and transformation into SolrInputDocs happens in a separate JVM that then uses Solrj javabin HTTP POSTs to distribute these docs across many index shards. For various reasons I won't go into here, I'd like to relocate/deploy most of my processing (JMS receiving, downloading, and transformation into SolrInputDocs) into the Solr webapps running on each distributed shard host. I might be wrong, but I don't think the request-driven idiom of the DataImportHandler is a good fit for me, as I'm not kicking off full or delta imports. If that's true, what's the correct way to hook my components into Solr's update facilities? Should I try to get a reference to a configured DirectUpdateHandler? I don't know if this information helps, but I'll put it out there anyway: I'm using Spring 3 components to receive JMS events, wired up via webapp context hooks. My plan would be to add all that to my Solr shard webapp. Best, Rich
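For reference, the client-side buffering that StreamingUpdateSolrServer provides can be pictured as a bounded queue drained by a fixed pool of sender threads. The sketch below is a generic stand-in for that pattern only — it does not use the SolrJ API, and the class name, constructor parameters, and the send step are all illustrative:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Illustrative stand-in for a streaming updater: a bounded queue plus N sender threads. */
public class StreamingUpdater<T> implements AutoCloseable {
    private final BlockingQueue<T> queue;
    private final ExecutorService workers;
    private final Consumer<T> sender;

    public StreamingUpdater(int queueSize, int threadCount, Consumer<T> sender) {
        this.queue = new LinkedBlockingQueue<>(queueSize);
        this.workers = Executors.newFixedThreadPool(threadCount);
        this.sender = sender;
        for (int i = 0; i < threadCount; i++) {
            workers.submit(this::drain); // each worker loops, sending queued docs
        }
    }

    /** Blocks when the queue is full, giving natural back-pressure to the producer. */
    public void add(T doc) throws InterruptedException {
        queue.put(doc);
    }

    private void drain() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                sender.accept(queue.take()); // stands in for an HTTP POST to a shard
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() {
        workers.shutdownNow();
    }
}
```

The real StreamingUpdateSolrServer exposes the same two knobs — queue size and thread count — in its constructor, which is why tuning those two numbers is usually enough to saturate the indexing hardware.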
Re: rollback to other versions of index
Yes and no. The index grows like an onion, adding new segments for each commit. There is no API to remove the newly added segments, but I guess you could hack something. The other problem is that as soon as you trigger an optimize() all history is gone, as the segments are merged into one. Optimize normally happens automatically behind the scenes. You could turn off merging, but that will badly hurt your performance after some time and ultimately exhaust OS resources such as file descriptors. Since you only need a few versions back, you COULD write your own custom MergePolicy, always preserving at least N versions. But beware that a "version" may be ONE document or many documents, depending on how you commit or whether autoCommit is active. So if you go this route you also need strict control over your commits. Perhaps the best option is to handle this on the feeding client side, where you keep a buffer of the last N docs. Then you can freely roll back or re-index as you choose, based on time, number of docs, etc.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 01.21, Tri Nguyen wrote:

Hi, Does Solr version each index build? We'd like to be able to roll back to not just the previous version but maybe a few versions before the current one. Thanks, Tri
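The "buffer of the last N docs on the feeding client" idea can be as simple as a bounded deque kept alongside whatever is sent to Solr; on rollback you restore an older index and re-feed from the buffer. A minimal sketch — the class and method names are illustrative, not any Solr API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Keeps the last maxSize submitted documents so they can be re-fed after a rollback. */
public class ReindexBuffer<D> {
    private final int maxSize;
    private final Deque<D> recent = new ArrayDeque<>();

    public ReindexBuffer(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Record a document as it is sent to Solr; the oldest entry falls off when full. */
    public synchronized void record(D doc) {
        if (recent.size() == maxSize) {
            recent.removeFirst();
        }
        recent.addLast(doc);
    }

    /** Documents to replay, oldest first, e.g. after restoring an older index snapshot. */
    public synchronized List<D> replayOldestFirst() {
        return new ArrayList<>(recent);
    }
}
```

Sizing the buffer by document count is shown here for brevity; as the reply notes, you could equally bound it by time or by commit boundaries.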
Re: carrot2 clustering component error
I've seen that before on a 3.1 checkout, after I compiled the clustering component, copied the jars and started Solr. For some reason, recompiling didn't work, and doing an ant clean first didn't fix it either. Updating to a revision I knew did work also failed. I just removed the entire checkout, checked it out again, repeated my steps, and it works fine now.

Help me out of this error: java.lang.NoClassDefFoundError: org/apache/solr/util/plugin/SolrCoreAware
Solr not Available with Ping when DocBuilder is running
Hello. Every 2 minutes I run a delta import, and if one core (of 7) is running a delta, Solr isn't available. When I look in the log file, the ping comes in while the DocBuilder is running ...

Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.15
Feb 15, 2011 11:50:28 AM org.apache.solr.core.SolrCore execute
PHP Error at 11:50:12 Error: ...

So I get errors, but nothing looks wrong to me ... !?!? thx

---
System: one server, 12 GB RAM, 2 Solr instances, 7 cores; 1 core with 31 million documents, the other cores ~100,000.
- Solr1 for search requests - commit every minute - 4 GB Xmx
- Solr2 for update requests - delta every 2 minutes - 4 GB Xmx
--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-not-Available-with-Ping-when-DocBuilder-is-running-tp2500214p2500214.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr not Available with Ping when DocBuilder is running
And what exactly is your error? And what is the response to your ping request?

On Tue, Feb 15, 2011 at 12:02 PM, stockii stock.jo...@googlemail.com wrote:

Hello. Every 2 minutes I run a delta import, and if one core (of 7) is running a delta, Solr isn't available. When I look in the log file, the ping comes in while the DocBuilder is running ...

Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.15
Feb 15, 2011 11:50:28 AM org.apache.solr.core.SolrCore execute
PHP Error at 11:50:12 Error: ...

So I get errors, but nothing looks wrong to me ... !?!? thx
Re: Deploying Solr CORES on OVH Cloud
Thanks for your response, but it doesn't help me a whole lot! Jetty vs. Tomcat? Ubuntu or Debian? What are the pros of each for running Solr?

On 14/02/2011 23:12, William Bell wrote:

The first two questions are almost like religion; I am not sure we want to start a debate. Core setup is fairly easy: add a solr.xml file and subdirectories, one per core (see the example/ directory). Make sure you use the right URL for the admin console.

On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote:

Hi, I'm a bit new to Solr. I'm trying to set up a bunch of servers (just for Solr) on the OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as needed on each server. First question: what do you recommend, Ubuntu or Debian? I mean in terms of performance. Second question: Jetty or Tomcat? Again in terms of performance and security. Third question: I've followed the wiki but I can't get the cores working... I cannot create a core or access my cores. Does anyone have a working config to share? Thanks a lot for your help. Regards,
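For the multi-core question, the solr.xml referred to above looks roughly like this in the 1.4-era example layout. The core names and instanceDir values here are placeholders for your own; check the example/ directory of your release for the exact attributes your version supports:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```

Each instanceDir needs its own conf/ directory with solrconfig.xml and schema.xml, and with multiple cores the admin console moves to a per-core URL such as /solr/core0/admin/ rather than /solr/admin/.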
Re: Which version of Solr?
I guess hijacking my own thread is still hijacking. :) I'll avoid that in the future. It is great for SolrJ and Solr to be working as expected and to be making forward progress! Jeff On Feb 14, 2011, at 11:01 PM, David Smiley (@MITRE.org) wrote: Wow; I'm glad you figured it out -- sort of. FYI, in the future, don't hijack email threads to talk about a new subject. Start a new thread. ~ David p.s. yes, I'm working on the 2nd edition. - Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Which-version-of-Solr-tp2482468p2498641.html Sent from the Solr - User mailing list archive at Nabble.com. -- Jeff Schmidt 535 Consulting j...@535consulting.com (650) 423-1068 http://www.535consulting.com
Re: Which version of Solr?
Hi Otis: I guess I got too obsessed trying to resolve my SolrJ/Solr interaction problem, I missed your reply... I've heard using 3.1 is the best approach, and now 4.0/trunk. Will trunk be undergoing a release in the next few months then? It seems so soon after 3.x. Fortunately, I have both branch_3x and trunk checked out and I can generate Maven artifacts for each one. That makes it easy for me to use one or the other, at least until I get set on some feature only available in one of them. Is trunk currently a superset of branch_3x, or are there some 3.x features that won't be merged into trunk for quite some time? Cheers, Jeff On Feb 13, 2011, at 6:49 PM, Otis Gospodnetic wrote: Hi Jeff, For projects that are going live in 6 months I would use trunk. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Jeff Schmidt j...@535consulting.com To: solr-user@lucene.apache.org Sent: Sat, February 12, 2011 4:37:37 PM Subject: Which version of Solr? Hello: I'm working on incorporating Solr into a SaaS based life sciences semantic search project. This will be released in about six months. I'm trying to determine which version of Solr makes the most sense. When going to the Solr download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using 1.4.1 while going through some examples in my Packt book (Solr 1.4 Enterprise Search Server). But, I also see that Solr 3.1 and 4.0 are in the works. According to: https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel there is a high degree of progress on both of those releases; including a slew of bug fixes, new features, performance enhancements etc. Should I be making use of one of the newer versions? The hierarchical faceting seems like it could be quite useful. Are there any guesses on when either 3.1 or 4.0 will be officially released? So far, 1.4.1 has been good. 
But I'm unable to get SolrJ to work due to the 'javabin' version mismatch. I'm using the 1.4.1 version of SolrJ; I always get an HTTP response code of 200, but the returned entity is simply a null byte, which does not match the javabin version number of 1 defined in Solr common. Anyway, I can follow up on that issue if 1.4.1 is still the most appropriate version to use these days. Otherwise, I'll try again with whatever version you suggest. Thanks a lot! Jeff -- Jeff Schmidt 535 Consulting j...@535consulting.com (650) 423-1068 http://www.535consulting.com
Re: Guidance for event-driven indexing
Thanks Jan. For the JMSUpdateHandler option, how does one plug in a custom UpdateHandler? I want to make sure I'm not missing any important pieces of the Solr processing pipeline. Best, Rich

On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl jan@cominvent.com wrote:

Solr is multi-threaded, so you are free to send as many parallel update requests as needed to utilize your HW. Each request will get its own thread. Simply configure StreamingUpdateSolrServer from your client. If there is some lengthy work to be done, it needs to be done in SOME thread, and I guess you just have to choose where :) A JMSUpdateHandler sounds heavyweight, but does not need to be, and might be the logically best place for such a feature imo.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 17.42, Rich Cariens wrote:

Thanks Jan, I don't think I want to tie up a thread on two boxes waiting for an UpdateRequestProcessor to finish. I'd prefer to offload it all to the target shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm really just looking for a simple API that allows me to add a SolrInputDocument to the index in-process. Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages? I'm worried that this will break all the nice stuff one gets with the standard Solr webapp (stats, admin, etc). Best, Rich

On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl jan@cominvent.com wrote:

Hi, One option would be to keep the JMS listener as today but move the downloading and transforming part to a SolrUpdateRequestProcessor on each shard. The benefit is that you ship only a tiny little SolrInputDocument over the wire, with a reference to the doc to download, and do the heavy lifting on the Solr side. If each JMS topic/channel corresponds to a particular shard, you could move the whole thing to Solr. If so, a new JMSUpdateHandler could perhaps be a way to go?

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb.
2011, at 16.53, Rich Cariens wrote:

Hello, I've built a system that receives JMS events containing links to docs that I must download and index. Right now the JMS receiving, downloading, and transformation into SolrInputDocs happens in a separate JVM that then uses Solrj javabin HTTP POSTs to distribute these docs across many index shards. For various reasons I won't go into here, I'd like to relocate/deploy most of my processing (JMS receiving, downloading, and transformation into SolrInputDocs) into the Solr webapps running on each distributed shard host. I might be wrong, but I don't think the request-driven idiom of the DataImportHandler is a good fit for me, as I'm not kicking off full or delta imports. If that's true, what's the correct way to hook my components into Solr's update facilities? Should I try to get a reference to a configured DirectUpdateHandler? I don't know if this information helps, but I'll put it out there anyway: I'm using Spring 3 components to receive JMS events, wired up via webapp context hooks. My plan would be to add all that to my Solr shard webapp. Best, Rich
Dismax problem
Hi, I'm having a problem while trying to do a dismax search. For example, I have the standard query URL like this: It returns 1 result. But when I try to use the dismax query type I get the following error:

15/02/2011 10:27:07 org.apache.solr.common.SolrException log
GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
    at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
    at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
    at org.apache.solr.search.function.StringIndexDocValues.<init>(StringIndexDocValues.java:35)
    at org.apache.solr.search.function.OrdFieldSource$1.<init>(OrdFieldSource.java:84)
    at org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
    at org.apache.solr.search.function.FunctionQuery$AllScorer.<init>(FunctionQuery.java:123)
    at org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
    at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
    at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

The Solr instance is running as a replication slave. This is the solrconfig.xml: http://pastebin.com/GSv2wBB4 This is the schema.xml: http://pastebin.com/5VpRT5Jj

Any help? How can I find what is causing this exception? I thought that dismax didn't throw exceptions...
-- __ Ezequiel. Http://www.ironicnet.com
Re: Solr 1.4 requestHandler update Runtime disable/enable
Ok. I can set it like this:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" enable="${solr.enable.master:true}"/>

But how can I change solr.enable.master from true to false without restarting Tomcat?

But I don't see why you need to disable it. You will anyway need to stop sending updates to the old master yourself. Disabling the handler like this will cause an exception if you try to call it, because it will not be registered.

I have a buffer in the application, so I do not have it off. I have to switch automatically, without administrator intervention. The application is switching from one location to the other, and Solr must automatically follow the application.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-1-4-requestHandler-update-Runtime-disable-enable-tp2493745p2500603.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Guidance for event-driven indexing
Hi, You would wire your JmsUpdateRequestHandler into solrconfig.xml as normal, and if you want to apply an update chain, that would look like this:

<requestHandler name="/update/jms" class="solr.JmsUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">myPipeline</str>
  </lst>
</requestHandler>

See http://wiki.apache.org/solr/SolrRequestHandler for details.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.30, Rich Cariens wrote:

Thanks Jan. For the JMSUpdateHandler option, how does one plug in a custom UpdateHandler? I want to make sure I'm not missing any important pieces of the Solr processing pipeline. Best, Rich

On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl jan@cominvent.com wrote:

Solr is multi-threaded, so you are free to send as many parallel update requests as needed to utilize your HW. Each request will get its own thread. Simply configure StreamingUpdateSolrServer from your client. If there is some lengthy work to be done, it needs to be done in SOME thread, and I guess you just have to choose where :) A JMSUpdateHandler sounds heavyweight, but does not need to be, and might be the logically best place for such a feature imo.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 17.42, Rich Cariens wrote:

Thanks Jan, I don't think I want to tie up a thread on two boxes waiting for an UpdateRequestProcessor to finish. I'd prefer to offload it all to the target shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm really just looking for a simple API that allows me to add a SolrInputDocument to the index in-process. Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages? I'm worried that this will break all the nice stuff one gets with the standard Solr webapp (stats, admin, etc).
Best, Rich

On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl jan@cominvent.com wrote:

Hi, One option would be to keep the JMS listener as today but move the downloading and transforming part to a SolrUpdateRequestProcessor on each shard. The benefit is that you ship only a tiny little SolrInputDocument over the wire, with a reference to the doc to download, and do the heavy lifting on the Solr side. If each JMS topic/channel corresponds to a particular shard, you could move the whole thing to Solr. If so, a new JMSUpdateHandler could perhaps be a way to go?

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 16.53, Rich Cariens wrote:

Hello, I've built a system that receives JMS events containing links to docs that I must download and index. Right now the JMS receiving, downloading, and transformation into SolrInputDocs happens in a separate JVM that then uses Solrj javabin HTTP POSTs to distribute these docs across many index shards. For various reasons I won't go into here, I'd like to relocate/deploy most of my processing (JMS receiving, downloading, and transformation into SolrInputDocs) into the Solr webapps running on each distributed shard host. I might be wrong, but I don't think the request-driven idiom of the DataImportHandler is a good fit for me, as I'm not kicking off full or delta imports. If that's true, what's the correct way to hook my components into Solr's update facilities? Should I try to get a reference to a configured DirectUpdateHandler? I don't know if this information helps, but I'll put it out there anyway: I'm using Spring 3 components to receive JMS events, wired up via webapp context hooks. My plan would be to add all that to my Solr shard webapp. Best, Rich
very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
Hi Guys, I've been trying various combinations but I am unable to perform an OR query for a specific field in my Solr schema. I have a string field called myfield, and I want to return all documents where this field matches either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (a union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which "works", but only returns the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well, but with the same result! It is such a simple thing, but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
Re: Solr 1.4 requestHandler update Runtime disable/enable
Ok. I can set it like this:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" enable="${solr.enable.master:true}"/>

But how can I change solr.enable.master from true to false without restarting Tomcat?

That's an exercise left to the reader :) Honestly, I don't think you need to. Why would you? The handler does not do anything if it is never called.

But I don't see why you need to disable it. You will anyway need to stop sending updates to the old master yourself. Disabling the handler like this will cause an exception if you try to call it, because it will not be registered.

I have a buffer in the application, so I do not have it off. I have to switch automatically, without administrator intervention. The application is switching from one location to the other, and Solr must automatically follow the application.

One Solr server does not know about the other, so you do not need to switch anything on the Solr side. You simply need to design your client in such a way that it handles the operations in the correct order and timing, i.e. so that it pauses all feeding until replication is done, then feeds the new master instead of the old.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com
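The "pause feeding, wait for replication, then re-point at the new master" dance described above can live entirely in the feeding client. A sketch of that control flow, with the actual Solr calls stubbed out behind a Consumer — nothing here is SolrJ API, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Feeding client that can be re-pointed at a new master without losing buffered docs. */
public class FeedingClient<D> {
    private final List<D> buffer = new ArrayList<>();
    private Consumer<List<D>> master; // stands in for "POST this batch to the master URL"
    private boolean paused = false;

    public FeedingClient(Consumer<List<D>> master) {
        this.master = master;
    }

    /** Accept a document; send immediately unless we are mid-switch. */
    public synchronized void feed(D doc) {
        buffer.add(doc);
        if (!paused) {
            flush();
        }
    }

    /** Step 1: stop sending; incoming docs pile up in the buffer. */
    public synchronized void pause() {
        paused = true;
    }

    /** Step 3 (after the caller has confirmed replication): re-point and drain the buffer. */
    public synchronized void resumeWith(Consumer<List<D>> newMaster) {
        master = newMaster;
        paused = false;
        flush();
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            master.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

Step 2 — confirming the new master has caught up via replication — happens between pause() and resumeWith(), and is deliberately left to the caller, since how you check replication status depends on your setup.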
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
http://wiki.apache.org/solr/SolrQuerySyntax

Examples:
q=myfield:(xyz OR abc)
q={!lucene q.op=OR df=myfield}xyz abc
q=xyz OR abc&defType=edismax&qf=myfield

PS: If using type="string", you will not match individual words inside the field, only an exact, case-sensitive match of the whole field. Use some variant of "text" if this is not what you want.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:

Hi Guys, I've been trying various combinations but I am unable to perform an OR query for a specific field in my Solr schema. I have a string field called myfield, and I want to return all documents where this field matches either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (a union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which "works", but only returns the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well, but with the same result! It is such a simple thing, but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
Hi Jan, Thanks for the reply. I have tried the first variation in your example (and again after reading your reply). It returns no results! Note: it is not a multivalued field. I think when you use example 1 below, it looks for both xyz and abc in the same field of the same document; what I'm trying to get is all records that match either of the two. I hope I am making sense. Thanks, Ravish

On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl jan@cominvent.com wrote:

http://wiki.apache.org/solr/SolrQuerySyntax

Examples:
q=myfield:(xyz OR abc)
q={!lucene q.op=OR df=myfield}xyz abc
q=xyz OR abc&defType=edismax&qf=myfield

PS: If using type="string", you will not match individual words inside the field, only an exact, case-sensitive match of the whole field. Use some variant of "text" if this is not what you want.

-- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:

Hi Guys, I've been trying various combinations but I am unable to perform an OR query for a specific field in my Solr schema. I have a string field called myfield, and I want to return all documents where this field matches either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (a union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which "works", but only returns the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well, but with the same result! It is such a simple thing, but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
The OR implies that all documents matching either one of the two terms shold be returned. Are you sure you are searching with correct uppercase/lowercase, as string fields are case sensitive? To further help you, we need copies of relevant sections of your schema and an exact copy of the query string you attempt to run, as well as proof that the documents exist. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.54, Ravish Bhagdev wrote: Hi Jan, Thanks for reply. I have tried the first variation in your example (and again after reading your reply). It returns no results! Note: it is not a multivalued field, I think when you use example 1 below, it looks for both xyz and abc in same field for same document, what i'm trying to get are all records that match either of the two. I hope I am making sense. Thanks, Ravish On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl jan@cominvent.com wrote: http://wiki.apache.org/solr/SolrQuerySyntax Examples: q=myfield:(xyz OR abc) q={!lucene q.op=OR df=myfield}xyz abc q=xyz OR abcdefType=edismaxqf=myfield PS: If using type=string, you will not match individual words inside the field, only an exact case sensitive match of whole field. Use some variant of text if this is not what you want. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote: Hi Guys, I've been trying various combinations but unable to perform a OR query for a specific field in my solr schema. I have a string field called myfield and I want to return all documents that have this field which either matches abc or xyz So all records that have myfield=abc and all records that have myfield=xyz should be returned (union) What should my query be? I have tried (myfield=abc OR myfield=xyz) which works, but only returns all the documents that contain xyz in that field, which I find quite weird. I have tried running this as fq query as well but same result! 
It is such a simple thing but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
Arghhh.. I think it's the regexp parser messing things up (just looked at the debugQuery output and it's incorrectly parsing some / kind of characters I had). I think I can clean the data of these characters, or maybe there is a way to escape them... Ravish On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Hi Jan, Thanks for the reply. I have tried the first variation in your example (and again after reading your reply). It returns no results! Note: it is not a multivalued field. I think when you use example 1 below, it looks for both xyz and abc in the same field for the same document; what I'm trying to get is all records that match either of the two. I hope I am making sense. Thanks, Ravish On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl jan@cominvent.com wrote: http://wiki.apache.org/solr/SolrQuerySyntax Examples: q=myfield:(xyz OR abc) q={!lucene q.op=OR df=myfield}xyz abc q=xyz OR abc&defType=edismax&qf=myfield PS: If using type=string, you will not match individual words inside the field, only an exact case-sensitive match of the whole field. Use some variant of text if this is not what you want. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote: Hi Guys, I've been trying various combinations but am unable to perform an OR query for a specific field in my solr schema. I have a string field called myfield and I want to return all documents that have this field matching either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which works, but only returns all the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well but same result! It is such a simple thing but I can't find the right syntax after going through a lot of documentation and searching.
Will appreciate any quick reply or examples, thanks very much. Ravish
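For completeness: Jan's third example is really three separate URL parameters (q, defType, qf) joined with ampersands, and a common way to mangle them is hand-building the query string. A minimal Python sketch of building such a request, letting the standard library do the joining and encoding (the host, port, and /solr/select path are assumptions, not from the thread):

```python
from urllib.parse import urlencode

# Three distinct parameters -- q, defType, qf -- joined with '&',
# with spaces and special characters encoded for us.
params = {
    "q": "xyz OR abc",
    "defType": "edismax",
    "qf": "myfield",
}
query_string = urlencode(params)

# Hypothetical endpoint; adjust host/port/path to your install.
url = "http://localhost:8983/solr/select?" + query_string
print(url)
```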
Re: schema.xml configuration for file names?
Can we see a small sample of an XML file you're posting? Because it should look something like: <add> <doc> <field name="stbmodel">R16-500</field> ...more fields here... </doc> </add> Take a look at the Solr admin page after you've indexed data to see what's actually in your index; I suspect what's in there isn't what you expect. Try querying q=*:* just for yucks to see what the documents returned look like. I suspect your index doesn't contain anything like what you think, but that's only a guess... Best Erick On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison kg6...@gmail.com wrote: Hello! We receive from our suppliers hardware manufacturing data in XML files. On a typical day, we get 25,000 files. That is why I chose to implement Solr. The file names are made of eleven fields separated by tildes, like so: CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML Our R&D guys want to be able to search each field of the XML file names (OR operation) but they don't care to search the file contents. Ideally, they would like to do a query: all files where stbmodel is equal to R16-500 or result is P or filedate is 20110125... you get the idea.
I defined in schema.xml each data field like so (from left to right -- sorry for the long list):
<field name="location" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="scriptid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="slotid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="workcenter" type="textgen" indexed="false" stored="false" multiValued="false"/>
<field name="workcenterid" type="textgen" indexed="false" stored="false" multiValued="false"/>
<field name="result" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="computerid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="stbmodel" type="textgen" indexed="true" stored="true" multiValued="false"/>
<field name="receiver" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="filedate" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="filetime" type="textgen" indexed="false" stored="true" multiValued="false"/>
Also, I defined the field receiver as the unique key. But no results are returned by my queries. I made sure to update my index like so: java -jar apache-solr-1.4.1/example/exampledocs/post.jar *XML. I am obviously missing something. Is there a way to configure schema.xml to search for file names? I welcome your input. Al.
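Since the eleven values live in the file name itself, one pre-processing option is to split each name on the tildes and build a document from the pieces before posting. A rough Python sketch; the field order is taken from the schema listing above, and the exact position-to-field mapping is an assumption (the upload step itself is omitted):

```python
import os

# Field names in the order they were listed for schema.xml; the exact
# mapping of filename position to field is an assumption.
FIELDS = ["location", "scriptid", "slotid", "workcenter", "workcenterid",
          "result", "computerid", "stbmodel", "receiver", "filedate", "filetime"]

def parse_filename(path):
    """Split a tilde-delimited file name into a dict of field values."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    values = stem.split("~")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))

doc = parse_filename(
    "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML")
print(doc["stbmodel"], doc["filedate"])
```

One thing to check regardless of how the documents are built: in Solr only fields declared with indexed="true" can be searched, and most of the fields in the listing above are indexed="false", which by itself would explain empty query results.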
Re: Deploying Solr CORES on OVH Cloud
The usual answer is whatever you're most comfortable/experienced with. From my perspective, there's enough to learn getting Solr running and understanding how search works without throwing new environments into the mix... So, I'd pick the one you're most familiar with and use that. If you're not familiar with either, flip a coin G... This isn't all that helpful either, but all that means is that this is a question that doesn't have a one-recommendation-fits-all answer. Best Erick On Tue, Feb 15, 2011 at 6:08 AM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Thanks for your response, but it doesn't help me a whole lot! Jetty vs Tomcat? Ubuntu or Debian? What are the pros of using Solr? On 14/02/2011 23:12, William Bell wrote: The first two questions are almost like religion. I am not sure we want to start a debate. Core setup is fairly easy. Add a solr.xml file and subdirs, one per core (see the example/ directory). Make sure you use the right URL for the admin console. On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Hi, I'm a bit new to Solr. I'm trying to set up a bunch of servers (just for Solr) on OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as needed on each server. First question: What do you recommend: Ubuntu or Debian? I mean in terms of performance? Second question: Jetty or Tomcat? Again in terms of performance and security? Third question: I've followed the wiki but I can't get the CORES working... Impossible to create a CORE or access my cores? Does anyone have a working config to share? Thanks a lot for your help Regards,
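For the third question, a minimal multicore layout follows the example/ directory William mentions: a solr.xml in the Solr home listing one <core> per subdirectory. A sketch, with core names and paths as placeholders:

```xml
<!-- solr.xml in the Solr home directory -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- each instanceDir is a subdir holding its own conf/
         (schema.xml, solrconfig.xml) -->
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```

With this in place the per-core admin console lives under the core name, e.g. /solr/core0/admin/ rather than /solr/admin/, which is the "right URL" point above.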
Re: Which version of Solr?
Let's see if I have this right. 3x is 1.4.1 with selected features from trunk backported. Which translates as: lots of cool new stuff is in 3x (geospatial comes to mind, eDismax, etc) but the more fluid changes are not being backported. I guess it depends on how risk-averse you are. There are people using both trunk and 3x in production. Personally, though, I'd go with 3x for a project 6 months out unless there's a feature of trunk that would make your life a whole lot easier. Trunk is well-tested, but why take any risk unless there are measurable benefits? You might read through the CHANGES.txt file to see if there's anything in trunk you can't live without. Best Erick On Tue, Feb 15, 2011 at 7:30 AM, Jeff Schmidt j...@535consulting.com wrote: Hi Otis: I guess I got too obsessed trying to resolve my SolrJ/Solr interaction problem, I missed your reply... I've heard using 3.1 is the best approach, and now 4.0/trunk. Will trunk be undergoing a release in the next few months then? It seems so soon after 3.x. Fortunately, I have both branch_3x and trunk checked out and I can generate Maven artifacts for each one. That makes it easy for me to use one or the other, at least until I get set on some feature only available in one of them. Is trunk currently a superset of branch_3x, or are there some 3.x features that won't be merged into trunk for quite some time? Cheers, Jeff On Feb 13, 2011, at 6:49 PM, Otis Gospodnetic wrote: Hi Jeff, For projects that are going live in 6 months I would use trunk. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Jeff Schmidt j...@535consulting.com To: solr-user@lucene.apache.org Sent: Sat, February 12, 2011 4:37:37 PM Subject: Which version of Solr? Hello: I'm working on incorporating Solr into a SaaS-based life sciences semantic search project. This will be released in about six months.
I'm trying to determine which version of Solr makes the most sense. When going to the Solr download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using 1.4.1 while going through some examples in my Packt book (Solr 1.4 Enterprise Search Server). But I also see that Solr 3.1 and 4.0 are in the works. According to: https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel there is a high degree of progress on both of those releases, including a slew of bug fixes, new features, performance enhancements etc. Should I be making use of one of the newer versions? The hierarchical faceting seems like it could be quite useful. Are there any guesses on when either 3.1 or 4.0 will be officially released? So far, 1.4.1 has been good. But I'm unable to get SolrJ to work due to the 'javabin' version mismatch. I'm using the 1.4.1 version of SolrJ; I always get an HTTP response code of 200, but the return entity is simply a null byte, which does not match the version number of 1 defined in Solr common. Anyway, I can follow up on that issue if 1.4.1 is still the most appropriate version to use these days. Otherwise, I'll try again with whatever version you suggest. Thanks a lot! Jeff -- Jeff Schmidt 535 Consulting j...@535consulting.com (650) 423-1068 http://www.535consulting.com
Re: Which version of Solr?
On Tue, Feb 15, 2011 at 9:18 AM, Erick Erickson erickerick...@gmail.com wrote: I guess it depends on how risk-averse you are. There are people using both trunk and 3x in production. Right. It also depends on how easy it is to re-index your data. If it's hard/impossible, IMO that's the single biggest argument for going with 3x (soon 3.1) instead of trunk. All of the new coolness in trunk has come with index format changes along the way. -Yonik http://lucidimagination.com
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
You might look at the analysis page from the admin console for the field in question; it'll show you what the various parts of the analysis chain do. But I agree with Jan, having your field as a string type is a red flag. This field is NOT analyzed, parsed, or filtered. For instance, if a doc has a value for the field of [My life], only [My life] will match. Not [my], not [life], not even [my life] (ignore all brackets, but quotes are often confused with phrases). It may well be that this is the exact behavior you want, but this is often a point of confusion. Best Erick On Tue, Feb 15, 2011 at 9:00 AM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Arghhh.. I think it's the regexp parser messing things up (just looked at the debugQuery output and it's incorrectly parsing some / kind of characters I had). I think I can clean the data of these characters, or maybe there is a way to escape them... Ravish On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Hi Jan, Thanks for the reply. I have tried the first variation in your example (and again after reading your reply). It returns no results! Note: it is not a multivalued field. I think when you use example 1 below, it looks for both xyz and abc in the same field for the same document; what I'm trying to get is all records that match either of the two. I hope I am making sense. Thanks, Ravish On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl jan@cominvent.com wrote: http://wiki.apache.org/solr/SolrQuerySyntax Examples: q=myfield:(xyz OR abc) q={!lucene q.op=OR df=myfield}xyz abc q=xyz OR abc&defType=edismax&qf=myfield PS: If using type=string, you will not match individual words inside the field, only an exact case-sensitive match of the whole field. Use some variant of text if this is not what you want. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote: Hi Guys, I've been trying various combinations but am unable to perform an OR query for a specific field in my solr schema. I have a string field called myfield and I want to return all documents that have this field matching either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which works, but only returns all the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well but same result! It is such a simple thing but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
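The string-vs-text distinction Erick draws shows up directly in schema.xml; a sketch of the two kinds of declaration (the field names here are placeholders, and "text" refers to the analyzed type shipped in the example schema):

```xml
<!-- string: indexed verbatim, so only an exact, case-sensitive,
     whole-value query will match -->
<field name="myfield_exact" type="string" indexed="true" stored="true" />

<!-- text: tokenized and lowercased by its analyzer, so individual
     words match regardless of case -->
<field name="myfield_words" type="text" indexed="true" stored="true" />
```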
Re: Dismax problem
Hi Ezequiel, Does the standard query parser work with all the fields you are using with dismax? Did you change the schema in some way? What version of Solr are you on? Tomás On Tue, Feb 15, 2011 at 10:34 AM, Ezequiel Calderara ezech...@gmail.com wrote: Hi, I'm having a problem while trying to do a dismax search. For example I have the standard query url like this: It returns 1 result. But when I try to use the dismax query type I have the following error: 15/02/2011 10:27:07 org.apache.solr.common.SolrException log GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28 at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721) at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224) at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692) at org.apache.solr.search.function.StringIndexDocValues.init(StringIndexDocValues.java:35) at org.apache.solr.search.function.OrdFieldSource$1.init(OrdFieldSource.java:84) at org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58) at org.apache.solr.search.function.FunctionQuery$AllScorer.init(FunctionQuery.java:123) at org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93) at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297) at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258) at org.apache.lucene.search.Searcher.search(Searcher.java:171) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203) at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401) at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281) at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579) at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) The Solr instance is running as a replication slave. This is the solrconfig.xml: http://pastebin.com/GSv2wBB4 This is the schema.xml: http://pastebin.com/5VpRT5Jj Any help? How can I find what is causing this exception? I thought that the dismax didn't throw exceptions... -- __ Ezequiel. http://www.ironicnet.com
Re: SolrCloud - Example C not working
On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler thors...@apache.org wrote: Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine till I tried Example C:. Verified. I just tried and it failed for me too. -Yonik http://lucidimagination.com I start all 4 server but all of them keep looping through: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info INFO: Opening socket connection to server localhost/127.0.0.1:9983 Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9900 Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) Feb 14, 2011 1:31:17 PM org.apache.log4j.Category info INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9983 Feb 14, 2011 1:31:17 PM org.apache.log4j.Category warn WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused 
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) Feb 14, 2011 1:31:19 PM org.apache.log4j.Category info INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:8574 Feb 14, 2011 1:31:19 PM org.apache.log4j.Category warn WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) Feb 14, 2011 1:31:20 PM org.apache.log4j.Category info INFO: Opening socket connection to server localhost/127.0.0.1:8574 Feb 14, 2011 1:31:20 PM org.apache.log4j.Category warn WARNING: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn $SendThread.run(ClientCnxn.java:1078) The problem seems that the zk instances can not connects to the different nodes and so do not get up at all. I am using revision 1070473 for the tests. Anybody has an idea? salu2 -- Thorsten Scherler thorsten.at.apache.org codeBusters S.L. - web based systems consulting, training and solutions http://www.codebusters.es/
Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....
Hi Erick, I've managed to fix the problem; it was to do with not encoding certain characters. Escaped with \ and it all works fine now :) . Sorry, I was just being insane; looking at the debugQuery output helped. I know about the string field, this is kind of a uuid field that I am storing, so it is desired that it always be an exact match, and I am clear on why I chose that type. I am going to start looking at all that is available as Analyzers soon, something that does string distance matching would be cool. Ravish On Tue, Feb 15, 2011 at 2:30 PM, Erick Erickson erickerick...@gmail.com wrote: You might look at the analysis page from the admin console for the field in question; it'll show you what the various parts of the analysis chain do. But I agree with Jan, having your field as a string type is a red flag. This field is NOT analyzed, parsed, or filtered. For instance, if a doc has a value for the field of [My life], only [My life] will match. Not [my], not [life], not even [my life] (ignore all brackets, but quotes are often confused with phrases). It may well be that this is the exact behavior you want, but this is often a point of confusion. Best Erick On Tue, Feb 15, 2011 at 9:00 AM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Arghhh.. I think it's the regexp parser messing things up (just looked at the debugQuery output and it's incorrectly parsing some / kind of characters I had). I think I can clean the data of these characters, or maybe there is a way to escape them... Ravish On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Hi Jan, Thanks for the reply. I have tried the first variation in your example (and again after reading your reply). It returns no results! Note: it is not a multivalued field. I think when you use example 1 below, it looks for both xyz and abc in the same field for the same document; what I'm trying to get is all records that match either of the two. I hope I am making sense.
Thanks, Ravish On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl jan@cominvent.com wrote: http://wiki.apache.org/solr/SolrQuerySyntax Examples: q=myfield:(xyz OR abc) q={!lucene q.op=OR df=myfield}xyz abc q=xyz OR abc&defType=edismax&qf=myfield PS: If using type=string, you will not match individual words inside the field, only an exact case-sensitive match of the whole field. Use some variant of text if this is not what you want. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote: Hi Guys, I've been trying various combinations but am unable to perform an OR query for a specific field in my solr schema. I have a string field called myfield and I want to return all documents that have this field matching either abc or xyz. So all records that have myfield=abc and all records that have myfield=xyz should be returned (union). What should my query be? I have tried (myfield=abc OR myfield=xyz), which works, but only returns all the documents that contain xyz in that field, which I find quite weird. I have tried running this as an fq query as well but same result! It is such a simple thing but I can't find the right syntax after going through a lot of documentation and searching. Will appreciate any quick reply or examples, thanks very much. Ravish
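The escaping Ravish settled on can be automated: SolrJ ships a ClientUtils.escapeQueryChars helper for exactly this, and a hand-rolled equivalent looks roughly like the sketch below. The exact set of characters the query parser treats as special varies by version, so treat the list as an assumption:

```python
# Characters the Lucene/Solr query parser may treat as special
# (assumed list; check your version's query syntax docs).
SPECIAL_CHARS = '+-&|!(){}[]^"~*?:\\/'

def escape_query_chars(text):
    """Backslash-escape query-parser metacharacters in a single term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)

print(escape_query_chars("some/value (raw)"))
```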
Re: Which version of Solr?
I guess I'll work with 3.x for now until some 4.0 feature makes me move to trunk. For the next few months, re-indexing is not a problem, but once in production one index directly under my control will be updated quarterly (maybe monthly) with new content, while other indexes will be updated by 3rd parties at arbitrary times. Those indexes will maintain cumulative results of those updates and it will be more of an issue to require a 3rd party to provide the totality of documents to re-index from scratch. Not impossible, just not desirable. Once I get more comfortable with Solr as a solution, I need to look more into index replication, backup etc. :) Thanks for your suggestions on the versions. Cheers, Jeff On Feb 15, 2011, at 7:23 AM, Yonik Seeley wrote: On Tue, Feb 15, 2011 at 9:18 AM, Erick Erickson erickerick...@gmail.com wrote: I guess it depends on how risk-averse you are. There are people using both trunk and 3x in production. Right. It also depends on how easy it is to re-index your data. If it's hard/impossible, IMO that's the single biggest argument for going with 3x (soon 3.1) instead of trunk. All of the new coolness in trunk has come with index format changes along the way. -Yonik http://lucidimagination.com -- Jeff Schmidt 535 Consulting j...@535consulting.com (650) 423-1068 http://www.535consulting.com
Re: Guidance for event-driven indexing
Thanks Jan, I was referring to a custom UpdateHandler, not a RequestHandler. You know, the one that the Solr wiki discourages :). Best, Rich On Tue, Feb 15, 2011 at 8:37 AM, Jan Høydahl jan@cominvent.com wrote: Hi, You would wire your JMSUpdateRequestHandler into solrconfig.xml as normal, and if you want to apply an UpdateChain, that would look like this: <requestHandler name="/update/jms" class="solr.JmsUpdateRequestHandler"> <lst name="defaults"> <str name="update.processor">myPipeline</str> </lst> </requestHandler> See http://wiki.apache.org/solr/SolrRequestHandler for details -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 14.30, Rich Cariens wrote: Thanks Jan. For the JMSUpdateHandler option, how does one plug in a custom UpdateHandler? I want to make sure I'm not missing any important pieces of the Solr processing pipeline. Best, Rich On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl jan@cominvent.com wrote: Solr is multi-threaded, so you are free to send as many parallel update requests as needed to utilize your HW. Each request will get its own thread. Simply configure StreamingUpdateSolrServer from your client. If there is some lengthy work to be done, it needs to be done in SOME thread, and I guess you just have to choose where :) A JMSUpdateHandler sounds heavyweight, but does not need to be, and might be the logically best place for such a feature imo. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 14. feb. 2011, at 17.42, Rich Cariens wrote: Thanks Jan, I don't think I want to tie up a thread on two boxes waiting for an UpdateRequestProcessor to finish. I'd prefer to offload it all to the target shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm really just looking for a simple API that allows me to add a SolrInputDocument to the index in-process. Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages?
I'm worried that this will break all the nice stuff one gets with the standard SOLR webapp (stats, admin, etc). Best, Rich On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl jan@cominvent.com wrote: Hi, One option would be to keep the JMS listener as it is today but move the downloading and transforming part to a SolrUpdateRequestProcessor on each shard. The benefit is that you ship only a tiny little SolrInputDocument over the wire with a reference to the doc to download, and do the heavy lifting on the Solr side. If each JMS topic/channel corresponds to a particular shard, you could move the whole thing to Solr. If so, a new JMSUpdateHandler could perhaps be a way to go? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 14. feb. 2011, at 16.53, Rich Cariens wrote: Hello, I've built a system that receives JMS events containing links to docs that I must download and index. Right now the JMS receiving, downloading, and transformation into SolrInputDoc's happens in a separate JVM that then uses Solrj javabin HTTP POSTs to distribute these docs across many index shards. For various reasons I won't go into here, I'd like to relocate/deploy most of my processing (JMS receiving, downloading, and transformation into SolrInputDoc's) into the SOLR webapps running on each distributed shard host. I might be wrong, but I don't think the request-driven idiom of the DataImportHandler is a good fit for me, as I'm not kicking off full or delta imports. If that's true, what's the correct way to hook my components into SOLR's update facilities? Should I try to get a reference to a configured DirectUpdateHandler? I don't know if this information helps, but I'll put it out there anyways: I'm using Spring 3 components to receive JMS events, wired up via webapp context hooks. My plan would be to add all that to my SOLR shard webapp. Best, Rich
Re: Any contribs available for Range field type?
I've tried several times to get an active account on solr-...@lucene.apache.org and the mailing list won't send me a confirmation email, and therefore won't let me post because I'm not confirmed. Could I get someone that is a member of Solr-Dev to post either my original request in this thread, or a link to this thread on the Dev mailing list? I really was hoping for more response than this to this question. This would be a terrifically useful field type to just about any solr index. Thanks, Ken -- View this message in context: http://lucene.472066.n3.nabble.com/Any-contribs-available-for-Range-field-type-tp2473601p2502203.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Are there any restrictions on what kind of or how many fields you can use in a Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other string fields
Looks like it's a bug? Is it not? Ravish On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: When I include some of the fields in my search query: SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry; at org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377) at org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329) at org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144) at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131) at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:904) at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:121) at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:126) at org.apache.solr.handler.component.PivotFacetHelper.process(PivotFacetHelper.java:85) at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:84) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:662) Works with some fields not with others... What could be the problem? It is hard to know with just that exception as it refers to solr's internal files...any indicators will help me debug. Thanks, Ravish
Re: Any contribs available for Range field type?
d...@lucene.apache.org Solr-dev is deprecated since Lucene and Solr converged. Can you subscribe to the above list instead? Best Erick On Tue, Feb 15, 2011 at 10:49 AM, kenf_nc ken.fos...@realestate.com wrote: I've tried several times to get an active account on solr-...@lucene.apache.org and the mailing list won't send me a confirmation email, and therefore won't let me post because I'm not confirmed. Could I get someone that is a member of Solr-Dev to post either my original request in this thread, or a link to this thread on the Dev mailing list? I really was hoping for more response than this to this question. This would be a terrifically useful field type to just about any solr index. Thanks, Ken -- View this message in context: http://lucene.472066.n3.nabble.com/Any-contribs-available-for-Range-field-type-tp2473601p2502203.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Errors when implementing VelocityResponseWriter
Hi Erik, thank you for the reply. I have placed all Velocity jar files in my /lib directory. As explained below, I have added the relevant configuration to solrconfig.xml; I am just wondering if the config instructions in the wiki are missing something? Can anyone advise on this?

As you mentioned, my terminal output suggests that the VelocityResponseWriter class is not present and therefore the Velocity jar is not present... however this is not the case. I have specified <lib dir="./lib" /> in solrconfig.xml; is this enough, or do I need to use an exact path? I have already tried specifying an exact path and it does not seem to work either.

Thank you
Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 15 February 2011 06:48
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

Looks like you're missing the Velocity JAR. It needs to be in some Solr-visible lib directory. With 1.4.1 you'll need to put it in solr-home/lib. In later versions, you can use the <lib> elements in solrconfig.xml to point to other directories.

Erik

On Feb 14, 2011, at 10:41, McGibbney, Lewis John wrote:

Hello List, I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, then adding <queryResponseWriter name="blah" class="blah"> followed by the responseHandler specifics, I am shown the following terminal output. I also added <lib dir="./lib" /> in solrconfig. Can anyone suggest what I have not included in the config that is still required?
Thanks
Lewis

SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
    at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:547)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
    ... 21 more

Glasgow Caledonian University is a registered Scottish charity, number SC021474. Winner: Times Higher Education's Widening Participation Initiative of the Year 2009 and Herald Society's Education Initiative of the Year 2009. http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html Winner: Times Higher Education's Outstanding Support for Early Career Researchers of the Year 2010, GCU as a lead with Universities Scotland partners. http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html Email has been scanned for viruses by Altman
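[For reference: on Solr 1.4.1 the jars must physically sit in solr-home/lib, as Erik says; in later Solr versions a <lib> directive in solrconfig.xml can point elsewhere. A sketch of that later-version config follows — the relative path and the regex attribute are assumptions for illustration. Note that <lib> paths are resolved against the core's instanceDir, not the JVM's working directory, which is a common reason a bare "./lib" silently matches nothing.]

```xml
<config>
  <!-- dir is resolved relative to the core's instanceDir, not the
       directory the servlet container was started from -->
  <lib dir="../../contrib/velocity/lib" regex=".*\.jar" />

  <queryResponseWriter name="velocity"
                       class="org.apache.solr.response.VelocityResponseWriter"/>
</config>
```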
Re: Are there any restrictions on what kind of / how many fields you can use in a Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other string fields
To get meaningful help, you have to post a minimum of:

1. the relevant schema definitions for the field that makes it blow up; include the <fieldType> and <field> tags.
2. the query you used, with some indication of the field that makes it blow up.
3. what version you're using.
4. any changes you've made to the standard configurations.
5. whether you've recently installed a new version.

It might help if you reviewed: http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Tue, Feb 15, 2011 at 11:27 AM, Ravish Bhagdev ravish.bhag...@gmail.com wrote:

Looks like it's a bug? Is it not? Ravish

On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote:

When I include some of the fields in my search query:

SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
[...]

Works with some fields, not with others... What could be the problem? It is hard to know with just that exception, as it refers to Solr's internal files... any indicators will help me debug. Thanks, Ravish
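[For anyone following along, item 2 on Erick's list means the complete request Solr received. A pivot-facet request on trunk/4.x-era Solr generally has this shape — the host, port, and field names below are placeholders, not taken from Ravish's setup:]

```text
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.pivot=field_a,field_b
```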
RE: Errors when implementing VelocityResponseWriter
To add to this (which, stupidly, I have not mentioned previously): I am using Tomcat 7.0.8 as my servlet container. I have a sneaking suspicion that this is what is causing the problem, but as per below, I am unsure as to a solution.

From: McGibbney, Lewis John [lewis.mcgibb...@gcu.ac.uk]
Sent: 15 February 2011 17:04
To: solr-user@lucene.apache.org
Subject: RE: Errors when implementing VelocityResponseWriter

[...]
Re: schema.xml configuration for file names?
Erick, I think you put the finger on the problem. Our XML files (we get from our suppliers) do *not* look like that. This is what a typical file looks like:

<insert_list>...<result><result outcome="PASS"/></result><parameter_list><string_parameter name="SN" value="NOVAL" /><string_parameter name="RECEIVER" value="000907010391" /><string_parameter name="Model" value="R16-500" />...<string_parameter name="WorkCenterID" value="PREP" /><string_parameter name="SiteID" value="CTCA" /><string_parameter name="RouteID" value="ADV" /><string_parameter name="LineID" value="Line5" /></parameter_list><config enable_sfcs_comm="true" enable_param_db_comm="false" force_param_db_update="false" driver_platform="LABVIEW" mode="PROD" driver_revision="2.0"></config></insert_list>

Obviously, nothing like <add><doc>...</doc></add>.

By the way, querying q=*:* returned HTTP error 500 Null pointer exception, which leads me to believe that my index is 100% empty. What I am trying to do cannot be done, correct? I just don't want to waste anyone's time. Thanks, Alan.

On Tue, Feb 15, 2011 at 6:01 AM, Erick Erickson erickerick...@gmail.com wrote:

Can we see a small sample of an xml file you're posting? Because it should look something like

<add>
  <doc>
    <field name="stbmodel">R16-500</field>
    more fields here.
  </doc>
</add>

Take a look at the Solr admin page after you've indexed data to see what's actually in your index; I suspect what's in there isn't what you expect. Try querying q=*:* just for yucks to see what the documents returned look like. I suspect your index doesn't contain anything like what you think, but that's only a guess...

Best
Erick

On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison kg6...@gmail.com wrote:

Hello! We receive hardware manufacturing data from our suppliers in XML files. On a typical day, we get 25,000 files. That is why I chose to implement Solr.

The file names are made of eleven fields separated by tildes, like so:

CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML

Our R&D guys want to be able to search each field of the XML file names (OR operation), but they don't care to search the file contents. Ideally, they would like to do a query "all files where stbmodel equal to R16-500 or result is P or filedate is 20110125"... you get the idea.

I defined in schema.xml each data field like so (from left to right -- sorry for the long list):

<field name="location" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="scriptid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="slotid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="workcenter" type="textgen" indexed="false" stored="false" multiValued="false"/>
<field name="workcenterid" type="textgen" indexed="false" stored="false" multiValued="false"/>
<field name="result" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="computerid" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="stbmodel" type="textgen" indexed="true" stored="true" multiValued="false"/>
<field name="receiver" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="filedate" type="textgen" indexed="false" stored="true" multiValued="false"/>
<field name="filetime" type="textgen" indexed="false" stored="true" multiValued="false"/>

Also, I defined the field receiver as unique key. But no results are returned by my queries. I made sure to update my index like so: java -jar apache-solr-1.4.1/example/exampledocs/post.jar *.XML. I am obviously missing something. Is there a way to configure schema.xml to search for file names? I welcome your input. Al. -- AB. Sent from my Gmail account.
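[Since the searchable data lives entirely in the file name, the indexing side can be a small script that splits on tildes before posting to Solr. A minimal sketch — the left-to-right field order is assumed from Alan's schema listing and may not match his actual layout:]

```python
import os

# Field names in the order Alan listed them (assumed left-to-right mapping).
FIELDS = ["location", "scriptid", "slotid", "workcenter", "workcenterid",
          "result", "computerid", "stbmodel", "receiver", "filedate", "filetime"]

def parse_filename(path):
    """Split a tilde-separated file name into a dict of Solr field values."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    parts = stem.split("~")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))

doc = parse_filename(
    "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML")
print(doc["stbmodel"], doc["filedate"])  # → R16-500 20110125
```

Each resulting dict would then become one Solr document, with receiver as the unique key.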
Re: Multicore boosting to only 1 core
No. In fact, there's no way to search over multiple cores at once in Solr at all, even before you get to your boosting question. Your different cores are entirely different Solr indexes; Solr has no built-in way to combine searches across multiple Solr instances. [Well, sort of it can, with sharding. But sharding is unlikely to be a solution to your problem either, UNLESS your problem is that your Solr index is so big you want to split it across multiple machines for performance. That is the problem sharding is meant to solve. People trying to use it to solve other problems run into trouble.]

On 2/14/2011 1:59 PM, Tanner Postert wrote:

I have a multicore system and I am looking to boost results by date, but only for one core. Is this at all possible? Basically, one of the cores' content is very new and changes all the time, and if I boost everything by date, that core's content will almost always be at the top of the results. So I only want to apply the date boosting to the cores that have older content, so that their more recent results get boosted over the older content.
Re: schema.xml configuration for file names?
You can't just send arbitrary XML to Solr for update, no. You need to send a Solr Update Request in XML. You can write software that transforms that arbitrary XML into a Solr update request; for simple cases it could even just be XSLT. There are also a variety of other mediator pieces that come with Solr for doing updates: you can send updates in comma-separated-value format, or you can use the DataImportHandler to, in some not-too-complicated cases, embed the translation from your arbitrary XML to Solr documents in your Solr instance itself. But you can't just send arbitrary XML to the Solr update handler, no.

No matter what method you use to send documents to Solr, you're going to have to think about what you want your Solr schema to look like -- what fields of what types -- and then map your data to it. In Solr, unlike in an RDBMS, what you want your schema to look like has a lot to do with what kinds of queries you will want it to support; it can't just be done based on the nature of the data alone.

Jonathan

On 2/15/2011 12:45 PM, alan bonnemaison wrote:

[...]
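[The transformation Jonathan describes can be quite small. A sketch using Python's standard ElementTree: read the supplier's <string_parameter> elements and emit a Solr <add><doc> update request. The element names come from Alan's sample file; the NAME_MAP from supplier parameter names to Solr schema fields is an assumption for illustration.]

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the supplier file Alan posted.
SUPPLIER_XML = """<insert_list>
  <parameter_list>
    <string_parameter name="RECEIVER" value="000907010391" />
    <string_parameter name="Model" value="R16-500" />
    <string_parameter name="SiteID" value="CTCA" />
  </parameter_list>
</insert_list>"""

# Assumed mapping from supplier parameter names to Solr schema field names.
NAME_MAP = {"RECEIVER": "receiver", "Model": "stbmodel", "SiteID": "location"}

def to_solr_update(supplier_xml):
    """Build a Solr XML update request (<add><doc>...</doc></add>) from supplier data."""
    root = ET.fromstring(supplier_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for p in root.iter("string_parameter"):
        name = NAME_MAP.get(p.get("name"))
        if name:  # skip parameters we don't want to index
            f = ET.SubElement(doc, "field", name=name)
            f.text = p.get("value")
    return ET.tostring(add, encoding="unicode")

print(to_solr_update(SUPPLIER_XML))
```

The resulting string is what gets POSTed to Solr's update handler (e.g. via post.jar or curl).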
Re: schema.xml configuration for file names?
Thank you for your thorough response. Things make more sense now. Back to the drawing board. Alan.

On Tue, Feb 15, 2011 at 10:23 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

[...]
Question regarding inner entity in dataimporthandler
Hello all, I have searched the forums for the question I am about to ask but never found any concrete results. This is my case: I am defining the data config file with the document and entity tags. I define with success a basic entity mapped to my MySQL database, and I then add some inner entities. The problem I have is with the one-to-one relationship between my document entity and its documentcategory entity. In my document table, the documentcategory foreign key is optional. Here is my mapping:

<document>
  <entity name="document" transformer="TemplateTransformer"
          query="select DocumentID, DocumentID as documentId, CreationDate as creationDate, DocumentName as documentName, Description as description, DescriptionAbstract as descriptionAbstract, Downloads as downloads, Downloads30days as downloads30days, Downloads90days as downloads90days, PageViews as pageViews, PageViews30days as PageViews30days, PageViews90days as pageViews90days, Bookmarks as bookmarks, Bookmarks30days as bookmarks30days, Bookmarks90days as bookmarks90days, DocumentRating as documentRating, DocumentRating30days as documentRating30days, DocumentRating90days as documentRating90days, LicenseType as licenseType, BizTreeLibraryDoc as bizTreeLibraryDoc, DocFormat as docFormat, Price as price, CreatedByMemberID as memberId, DocumentCategoryID as categoryId, IsFreeDoc as isFreeDoc from document">
    <field column="id" name="id" template="doc${document.documentId}" />
    <field column="documentId" name="docId"/>
    <field column="creationDate" name="creationDate" />
    <field column="documentName" name="documentName" />
    <field column="description" name="description" />
    <field column="descriptionAbstract" name="descriptionAbstract" />
    <field column="downloads" name="downloads" />
    <field column="downloads30days" name="downloads30days" />
    <field column="downloads90days" name="downloads90days" />
    <field column="pageViews" name="pageViews" />
    <field column="pageViews30days" name="pageViews30days" />
    <field column="pageViews90days" name="pageViews90days" />
    <field column="bookmarks" name="bookmarks" />
    <field column="bookmarks30days" name="bookmarks30days" />
    <field column="bookmarks90days" name="bookmarks90days" />
    <field column="documentRating" name="documentRating" />
    <field column="documentRating30days" name="documentRating30days" />
    <field column="documentRating90days" name="documentRating90days" />
    <field column="licenseType" name="licenseType" />
    <field column="bizTreeLibraryDoc" name="bizTreeLibraryDoc" />
    <field column="docFormat" name="docFormat" />
    <field column="price" name="price" />
    <field column="isFreeDoc" name="isFreeDoc" />
    <entity name="category" onError="skip"
            query="select CategoryID as id, CategoryName as categoryName, MetaTitle as categoryMetaTitle, MetaDescription as categoryMetaDescription, MetaKeywords as categoryMetakeywords from documentcategory where CategoryID = ${document.categoryId}">
      <field column="categoryName" name="categoryName"/>
      <field column="categoryMetaTitle" name="categoryMetaTitle"/>
      <field column="categoryMetaDescription" name="categoryMetaDescription"/>
RE: Question regarding inner entity in dataimporthandler
OK, I think I found some information: supposedly TemplateTransformer will return an empty string if the value of a variable is null. Some people say to use the RegexTransformer instead; can anyone clarify this? Thanks

-----Original Message-----
From: Greg Georges [mailto:greg.geor...@biztree.com]
Sent: 15 February 2011 13:38
To: solr-user@lucene.apache.org
Subject: Question regarding inner entity in dataimporthandler

[...]
column=description name=description / field column=descriptionAbstract name=descriptionAbstract / field column=downloads name=downloads / field column=downloads30days name=downloads30days / field column=downloads90days name=downloads90days / field column=pageViews name=pageViews / field column=pageViews30days name=pageViews30days / field column=pageViews90days name=pageViews90days / field column=bookmarks name=bookmarks / field column=bookmarks30days name=bookmarks30days / field column=bookmarks90days name=bookmarks90days / field column=documentRating name=documentRating / field column=documentRating30days name=documentRating30days / field column=documentRating90days name=documentRating90days / field column=licenseType name=licenseType / field column=bizTreeLibraryDoc name=bizTreeLibraryDoc / field column=docFormat name=docFormat / field column=price name=price / field column=isFreeDoc name=isFreeDoc / entity name=category query=select CategoryID as id, CategoryName as categoryName, MetaTitle as categoryMetaTitle, MetaDescription as categoryMetaDescription, MetaKeywords as categoryMetakeywords from
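One workaround, sketched here as an assumption rather than a verified fix, is to supply a default in the SQL itself so the template variable is never null. IFNULL is MySQL syntax, and the sentinel value 0 is a made-up placeholder; the column names follow the mapping above:

```xml
<entity name="document" transformer="TemplateTransformer"
        query="select DocumentID as documentId,
               IFNULL(DocumentCategoryID, 0) as categoryId
               from document">
  <!-- documentId can never be null here, so the template always resolves -->
  <field column="id" name="id" template="doc${document.documentId}"/>
</entity>
```

With the optional foreign key coalesced to a sentinel value, TemplateTransformer never sees a null variable at all, which sidesteps the question of whether it returns an empty string or drops the field.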
Re: Multicore boosting to only 1 core
Could you make an additional date field, call it date_boost, that gets populated in all of the cores EXCEPT the one with the newest articles, and then boost on this field? Then when you move articles from the 'newest' core to the rest of the cores you copy over the date to the date_boost field. (I haven't used boosting before, so I don't know what happens if you try to boost on a field that's empty.) This would boost documents in each index (locally, as desired). Keep in mind, when you get your results back from a distributed shard query, that the IDF is not distributed, so your scores aren't reliable for sorting. -mike On Tue, Feb 15, 2011 at 1:19 PM, Jonathan Rochkind rochk...@jhu.edu wrote: No. In fact, there's no way to search over multiple cores at once in Solr at all, even before you get to your boosting question. Your different cores are entirely different Solr indexes; Solr has no built-in way to combine searches across multiple Solr instances. [Well, sort of it can, with sharding. But sharding is unlikely to be a solution to your problem either, UNLESS your problem is that your Solr index is so big you want to split it across multiple machines for performance. That is the problem sharding is meant to solve. People trying to use it to solve other problems run into trouble.] On 2/14/2011 1:59 PM, Tanner Postert wrote: I have a multicore system and I am looking to boost results by date, but only for one core. Is this at all possible? Basically, one core's content is very new and changes all the time, and if I boost everything by date, that core's content will almost always be at the top of the results. So I only want to apply the date boosting to the cores that have older content, so that their more recent results get boosted over the older content.
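The per-core boost suggested above can be wired up with a dismax boost function. Here is a sketch of the request, assuming the hypothetical date_boost field and the reciprocal-decay constants commonly shown on the Solr wiki (3.16e-11 is roughly one over the milliseconds in a year; ms() needs a trie-based date field):

```text
http://localhost:8983/solr/core-older/select
    ?defType=dismax
    &q=some+query
    &qf=title+description
    &bf=recip(ms(NOW,date_boost),3.16e-11,1,1)
```

Since each core is queried separately, the bf parameter can simply be left off requests to the 'newest' core, which keeps that core unboosted without any schema tricks.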
Re: Any way to get back search query with parsed out stop words
OK, I will look at using that filter factory on my content. But I was also looking for the stopword count so I could adjust my mm parameter based on the number of non-stopwords in the search string, so I don't run into the dismax stopword issue. Is there any way around that other than using a very low mm? On Tue, Feb 15, 2011 at 1:45 PM, Ahmet Arslan iori...@yahoo.com wrote: I am trying to only get back the natural searched terms so I can highlight them in the returned search results. I know that Solr has built-in highlighting capability, but I can't use it because some of the fields contain HTML themselves and I need to strip it all out when I display the search results. I would stick with Solr's highlighting. If you strip the HTML with a solr.HTMLStripCharFilterFactory, you can highlight HTML fields without problems.
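Computing mm client-side from the non-stopword count, as proposed above, can be sketched like this. The stopword set and the mm thresholds here are illustrative assumptions; a real client would load the same stopwords.txt the index uses and tune the percentages:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MmCalc {
    // Stand-in for stopwords.txt; a real client would load Solr's actual list.
    static final Set<String> STOPWORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "the", "of", "to", "in", "is", "it"));

    // Derive a dismax mm value from the number of non-stopword terms:
    // require everything for short queries, relax to 75% for longer ones.
    static String mmFor(String query) {
        long nonStop = Arrays.stream(query.toLowerCase().split("\\s+"))
                .filter(t -> !t.isEmpty() && !STOPWORDS.contains(t))
                .count();
        return nonStop <= 2 ? "100%" : nonStop - 1 + "<75%";
    }

    public static void main(String[] args) {
        System.out.println(mmFor("the rise of the machines"));        // two real terms
        System.out.println(mmFor("quick brown fox jumps over lazy dog")); // seven real terms
    }
}
```

The resulting string goes straight into the mm request parameter; a value like 6<75% means queries with more than six optional clauses only need 75% of them to match.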
Re: solr.HTMLStripCharFilterFactory not working
nevermind, I think I found my answer here: http://www.mail-archive.com/solr-user@lucene.apache.org/msg34622.html I will add the HTML stripper to the data importer and see how that goes. On Tue, Feb 15, 2011 at 3:43 PM, Tanner Postert tanner.post...@gmail.com wrote: I have several fields defined, and one of the field types includes a solr.HTMLStripCharFilterFactory in the analyzer, but it doesn't appear to be affecting the field as I would expect. I have tried a simple charFilter followed by the tokenizer:

<charFilter class="solr.HTMLStripCharFilterFactory"/>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>

or the combined factory:

<tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>

but neither seems to work. Returned search results from webtitle and webdescription, as well as text, include the original HTML characters that the title and description fields have. The relevant schema:

<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    </analyzer>
  </fieldType>
  <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
    <analyzer type="index">
      <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StandardFilterFactory"/>
    </analyzer>
  </fieldType>
</types>
<fields>
  <field name="title" type="string" indexed="true" stored="true" multiValued="false"/>
  <field name="webtitle" type="text" indexed="true" stored="true" multiValued="false"/>
  <copyField source="title" dest="webtitle"/>
  <field name="description" type="string" indexed="true" stored="true" multiValued="false" compressed="true"/>
  <field name="webdescription" type="text" indexed="true" stored="true" multiValued="false" compressed="true"/>
  <copyField source="description" dest="webdescription"/>
  <field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>
  <copyField source="title" dest="spell"/>
  <copyField source="description" dest="spell"/>
  <field name="text" type="text" indexed="true" stored="true" multiValued="true"/>
  <copyField source="title" dest="text"/>
  <copyField source="description" dest="text"/>
</fields>
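For what it's worth, analyzer chains (char filters included) only change the indexed terms; the stored value that Solr returns in search results is always the raw input, so no analyzer configuration will strip HTML from what is displayed. Stripping has to happen before the document reaches Solr, for example in the data import handler. On the analysis side, the char filter is declared as its own element ahead of the tokenizer; a minimal sketch (field type name and filter chain are illustrative):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- charFilter runs over the raw input before tokenization;
         it affects matching only, never the stored value -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

This is why both variants tried here "didn't work": they were most likely working for matching, while the stored (displayed) values kept their HTML.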
Re: solr.HTMLStripCharFilterFactory not working
I am using the data import handler, and using the HTMLStripTransformer doesn't seem to be working either. I've changed webtitle and webdescription to not be copied from title and description in the schema.xml file, and instead set them both to be duplicates of title and description in the data importer query:

<document name="items">
  <entity dataSource="db" name="item" transformer="HTMLStripTransformer"
          query="select title as title, title as webtitle, description as description, description as webdescription FROM ...">
    <field column="webtitle" stripHTML="true"/>
    <field column="webdescription" stripHTML="true"/>
  </entity>
</document>

On Tue, Feb 15, 2011 at 3:49 PM, Tanner Postert tanner.post...@gmail.com wrote: nevermind, I think I found my answer here: http://www.mail-archive.com/solr-user@lucene.apache.org/msg34622.html I will add the HTML stripper to the data importer and see how that goes.
Passing parameters to DataImportHandler
It'd be nice to be able to pass HTTP parameters into DataImportHandler that would then be passed into the SQL as parameters. Is this possible?
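DataImportHandler does expose request parameters to the configuration through ${dataimporter.request.*} variables, so something along these lines should work (the entity, column, and parameter names here are made up for illustration):

```xml
<entity name="item"
        query="select id, title from item
               where last_modified &gt; '${dataimporter.request.since}'">
</entity>
```

The parameter is then supplied on the import URL, e.g. /dataimport?command=full-import&since=2011-02-01. Note the value is substituted into the SQL as plain text, so it should only come from trusted callers.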
Reminder: Lucene Revolution 2011 Call For Papers Closing March 2
Please submit your Call For Participation (CFP) proposals for Lucene Revolution 2011 by March 2. If you have a great Solr or Lucene talk, this is a fantastic opportunity to share it with the community at the largest worldwide conference dedicated to Lucene and Solr, which will take place at the San Francisco Airport Hyatt Regency May 25-26. To submit a proposal for a 45-minute presentation, complete the form at: http://www.lucidimagination.com/revolution/2011/cfp Topics of interest include: - Lucene and Solr in the Enterprise (case studies, implementation, return on investment, etc.) - Use of LucidWorks Enterprise - “How We Did It” development case studies - Lucene/Solr technology deep dives: features, how to use, etc. - Spatial/Geo/local search - Lucene and Solr in the Cloud - Scalability and performance tuning - Large Scale Search - Real Time Search (or NRT search) - Data Integration/Data Management - Lucene and Solr for Mobile Applications - Associated technologies: Mahout, Nutch, NLP, etc. All accepted speakers will get complimentary conference passes. Financial assistance is available for speakers that qualify. Submissions must be received by Wednesday, March 2, 2011, 12 Midnight PST. Registration is now open for Lucene Revolution 2011 at: http://lucenerevolution.com/register Interested in conference news? Want to be added to the conference mailing list? Is your organization interested in sponsorship opportunities? Please send an email to: i...@lucenerevolution.org Regards, Mike Michael Bohlig | Lucid Imagination Enterprise Marketing p +1 650 353 4057 x132 m +1 650 703 8383 www.lucidimagination.com
Cloning SolrInputDocument
In the process of handling a type of web service request, I need to create a series of documents for indexing. They differ by only a couple of field values, but share quite a few others. I would like to clone SolrInputDocument, adjust a couple of fields, index that, lather, rinse, repeat. However, org.apache.solr.common.SolrInputDocument (branch_3x) does not implement Cloneable, override clone() to make a deep copy, etc. Also observed by looking at the source code: SolrInputDocument keeps all fields in a LinkedHashMap, and also exposes a Map interface. So, does this sound like a workable idea? I define all my fields in a Map<String, SolrInputField> for the first document, and then tweak and re-use it. E.g.:

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
Map<String, SolrInputField> fields = ...; // set up fields for 1st document
SolrInputDocument doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);
// update values for fields (keys) a and b in the fields map
doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);
// update values for fields (keys) a and b in the fields map
doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);

and so forth. Then:

SolrServer solrServer = getSolrServer();
solrServer.add(docs);
solrServer.commit();

Map.putAll copies all of the mappings from the specified map to this map, so each document will have its own copy of the fields. I will, of course, have to have map values of SolrInputField and instantiate those, etc. Perhaps this is not worth the effort and I should be satisfied repeating the same doc.addField() method calls. Thanks, Jeff -- Jeff Schmidt 535 Consulting j...@535consulting.com (650) 423-1068 http://www.535consulting.com
Re: how to control which hosts can access Solr?
You cannot do this kind of configuration in solrconfig.xml; you have to configure it with the help of your network administrator. - Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-control-which-hosts-can-access-Solr-tp2506270p2507048.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spellchecking with some misspelled words in index source
You have to correct the misspelled terms in your content for spellchecking to work properly, because the spell checker will find the misspelled term and assume it is the right one. The spell checker only returns suggestions when a word is not found in its dictionary. - Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/Spellchecking-with-some-misspelled-words-in-index-source-tp2505722p2507110.html
Re: rollback to other versions of index
Hi, I wanted to explain my situation in more detail. I have a master which never adds or deletes documents incrementally; I just run the dataimport with autocommit. Seems like I'll need to make a custom DeletionPolicy to keep more than one index around. I'm accessing indices from Solr. How do I tell Solr to use a particular index? Thanks, Tri From: Michael McCandless luc...@mikemccandless.com To: solr-user@lucene.apache.org Sent: Tue, February 15, 2011 5:36:49 AM Subject: Re: rollback to other versions of index Lucene is able to do this, if you make a custom DeletionPolicy (which controls when commit points are deleted). By default Lucene only saves the most recent commit (KeepOnlyLastCommitDeletionPolicy), but if your policy keeps more around, then you can open an IndexReader or IndexWriter on any IndexCommit. Any changes (including optimize, and even opening a new IW with create=true) are safe within a commit; Lucene is fully transactional. For example, I use this for benchmarking: I save 4 commit points in a single index. First is a multi-segment index, second is the same index with 5% deletions, third is an optimized index, and fourth is the optimized index with 5% deletions. This gives me a single index with 4 different commit points, so I can then benchmark searching against any of those 4. Mike On Tue, Feb 15, 2011 at 4:43 AM, Jan Høydahl jan@cominvent.com wrote: Yes and no. The index grows like an onion, adding new segments for each commit. There is no API to remove the newly added segments, but I guess you could hack something. The other problem is that as soon as you trigger an optimize() all history is gone, as the segments are merged into one. Optimize normally happens automatically behind the scenes. You could turn off merging, but that will badly hurt your performance after some time and ultimately crash your OS. Since you only need a few versions back, you COULD write your own custom mergePolicy, always preserving at least N versions.
But beware that a version may be ONE document or many documents, depending on how you commit or whether autoCommit is active, so if you go this route you also need strict control over your commits. Perhaps the best option is to handle this on the feeding client side, where you keep a buffer of the N last docs. Then you can freely roll back or re-index as you choose, based on time, number of docs, etc. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 15. feb. 2011, at 01.21, Tri Nguyen wrote: Hi, Does Solr version each index build? We'd like to be able to roll back not just to the previous version but maybe a few versions before the current one. Thanks, Tri
Solr the right thing for me?
Hello all, I'm looking for a way to: - Receive an email when a site was changed or added to a web. - Only index sites that contain a reg exp in the content. - Receive the search results in a machine-readable way (RSS/SOAP/..) It should be possible to organize this in sets (set A with 40 websites, set B with 7 websites). Does that sound possible with Solr? Do I have to expect custom development? If so, how much? Thank you in advance. Bye, Chris
Re: Solr the right thing for me?
On Wed, Feb 16, 2011 at 12:18 PM, Chris spamo...@freenet.de wrote: [...] - Receive an email when a site changed/was added to a web. - Only index sites, that contain a reg exp in the content. I think that you might be confused about what Solr does. It is a search engine, and does not crawl websites. A good possibility for you might be Nutch, which has built-in search capabilities but also interfaces with Solr. - Receive the search results in machine readable way (RSS/SOAP/..) Solr gives you XML/JSON. This should be possible to organize in sets. (set A with 40 Websites, set B with 7 websites) Yes, this can be done with separate indexes. Does it sound possible with SOLR? Do I have to expect custom development? If so, how much? [...] Nutch and Solr should meet your needs. There will be a fair amount of learning to do at the beginning, but there should not be a need for too much customisation. Regards, Gora
clustering with tomcat
Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me with how to add -Dsolr.clustering.enabled=true to $CATALINA_OPTS, and with which steps will be required after that.
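One common way to do this, sketched here under the assumption of a stock Tomcat layout, is to append the flag to CATALINA_OPTS in $CATALINA_HOME/bin/setenv.sh (create the file if it does not exist; catalina.sh sources it on startup):

```shell
# Contents of $CATALINA_HOME/bin/setenv.sh
# Append the clustering flag without clobbering any options already set.
CATALINA_OPTS="$CATALINA_OPTS -Dsolr.clustering.enabled=true"
export CATALINA_OPTS
```

After restarting Tomcat, the remaining steps from the wiki page still apply: the clustering contrib libraries need to be on Solr's classpath and the clustering search component enabled in solrconfig.xml.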
Re: how to control which hosts can access Solr?
On Wed, Feb 16, 2011 at 10:52 AM, Grijesh pintu.grij...@gmail.com wrote: You can not do this kind of configuration by solrsonfig ,you have to configure it with the help of your network administrator. [...] The normal way to do this on Linux is with rules for iptables that only allow access to the Solr port for certain hosts. Regards, Gora
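To make the iptables suggestion concrete, here is an illustrative firewall configuration sketch; the port and address are placeholders to adapt, the rules need root, and on most distributions they must also be saved with the system's iptables persistence mechanism to survive a reboot:

```shell
# Accept Solr traffic (default port 8983) only from a trusted host,
# then drop the port for everyone else. 192.0.2.10 is a placeholder address.
iptables -A INPUT -p tcp --dport 8983 -s 192.0.2.10 -j ACCEPT
iptables -A INPUT -p tcp --dport 8983 -j DROP
```

Rule order matters here: iptables evaluates rules top to bottom, so the ACCEPT for the trusted host has to come before the catch-all DROP.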
Re: how to control which hosts can access Solr?
Thank you. So a firewall, then. I also saw Solr authentication; maybe I should add that as well. Thanks, canal From: Gora Mohanty g...@mimirtech.com To: solr-user@lucene.apache.org Sent: Wed, February 16, 2011 3:10:25 PM Subject: Re: how to control which hosts can access Solr? On Wed, Feb 16, 2011 at 10:52 AM, Grijesh pintu.grij...@gmail.com wrote: You cannot do this kind of configuration in solrconfig.xml; you have to configure it with the help of your network administrator. [...] The normal way to do this on Linux is with rules for iptables that only allow access to the Solr port for certain hosts. Regards, Gora