Re: How to display Highlight with VelocityResponseWriter?
One trick I use a lot with VelocityResponseWriter is simply to put $object.class in a template, hit refresh, and see what type of object it is. Then consult the javadocs/code to see how to navigate that object.

Highlighting support is something that I meant to put into the shipped templates, as it is something I've used in several custom prototypes. Here are some tricks from one of my VM_global_library.vm files lying around:

  #macro(field $f)$!{esc.html($doc.getFirstValue($f))}#end

  #macro(field_with_highlighting $f)
    #if($response.highlighting.get($doc.getFieldValue('id')).get($f))
      #foreach($fragment in $response.highlighting.get($doc.getFieldValue('id')).get($f))
        ...$fragment...
      #end
    #else
      #field($f)
    #end
  #end

Then in your templates you can simply say #field_with_highlighting('title') (within a context where $doc is defined), and if there is highlighting it will show ...fragments..., or otherwise the entire field value. Or use #field('title') for simple unhighlighted single-valued fields.

Note that the above also assumes that 'id' is the uniqueKey, though when I eventually get this incorporated into the main templates in contrib/velocity (patches welcome!) we can make it use the schema metadata to get the designated field dynamically.

Again, don't forget about the $object.class trick: try $response.class, see what that returns, then $response.highlighting.class, and so on until the pieces of the object graph you want are presented. Also, simply $object will dump the toString() of the object, which gives a lot of insight for many objects.

Erik

On Jan 13, 2010, at 7:28 PM, Sascha Szott wrote:

Hi Qiuyan,

Thanks a lot. It works now. When I added the line #set($hl = $response.highlighting) I got the highlighting. But I wonder if there's any document that describes the usage of that. I mean, I didn't know the names of those methods. Actually I just managed to guess them.

Solritas (aka VelocityResponseWriter) binds a number of objects into a so-called VelocityContext (consult [1] for a complete list). You can think of it as a map that allows you to access objects by symbolic names, e.g., an instance of QueryResponse is stored under response (that's why you write $response in your template). Since $response is an instance of QueryResponse, you can call all methods on it that the API [2] provides. Furthermore, Velocity incorporates a JavaBean-like introspection mechanism that lets you write $response.highlighting instead of $response.getHighlighting() (only a bit of syntactic sugar).

-Sascha

[1] http://wiki.apache.org/solr/VelocityResponseWriter#line-93
[2] http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html

Quoting Sascha Szott sz...@zib.de:

Qiuyan,

with highlight can also be displayed in the web gui. I've added <bool name="hl">true</bool> into the standard responseHandler and it already works, i.e. without velocity. But the same line doesn't take effect in itas. Should I configure anything else? Thanks in advance.

First of all, just a few notes on the /itas request handler in your solrconfig.xml:

1. The entry <arr name="components"><str>highlight</str></arr> is obsolete, since the highlighting component is a default search component [1].
2. Note that since you didn't specify a value for hl.fl, highlighting will only affect the fields listed inside of qf.
3. Why did you override the default value of hl.fragmenter? In most cases the default fragmenting algorithm (gap) works fine - and maybe in yours as well?

To make sure all your hl related settings are correct, can you post an xml output (change the wt parameter to xml) for a search with highlighted results? And finally, can you post the vtl code snippet that should produce the highlighted output?

-Sascha

[1] http://wiki.apache.org/solr/SearchComponent
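The lookup-with-fallback logic of the field_with_highlighting macro can also be tried outside Velocity. The following is only a minimal Python sketch, assuming the highlighting section is a nested map of document id to field name to list of fragments (the shape that QueryResponse.getHighlighting() returns in SolrJ) and that 'id' is the uniqueKey. The function name and the sample data are made up for illustration; plain dicts stand in for the Solr objects.

```python
from html import escape


def field_with_highlighting(doc, highlighting, field, unique_key="id"):
    """Return highlighted fragments for `field`, or the escaped first stored value.

    `doc` is a plain dict standing in for a Solr document, and `highlighting`
    is a dict of {doc id: {field name: [fragments]}}. Both are illustrative
    stand-ins, not Solr API objects.
    """
    fragments = highlighting.get(doc.get(unique_key), {}).get(field)
    if fragments:
        # Mirror the macro: wrap each fragment in ellipses, unescaped,
        # because the fragments already carry the highlight markup.
        return " ".join("..." + fragment + "..." for fragment in fragments)
    value = doc.get(field, "")
    if isinstance(value, list):
        # Like getFirstValue(): take the first value of a multi-valued field.
        value = value[0] if value else ""
    return escape(str(value))


# Hypothetical sample data.
doc = {"id": "42", "title": "Solr & Velocity"}
highlighting = {"42": {"title": ["<em>Solr</em> & Velocity"]}}

print(field_with_highlighting(doc, highlighting, "title"))
print(field_with_highlighting(doc, {}, "title"))
```

The first call finds fragments and returns them wrapped in ellipses; the second has no highlighting entry for the document and falls back to the HTML-escaped field value, just as the #else branch of the macro does.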
Re: Need help Migrating to Solr
I've done a fair number of migrations, but it's kind of hard to give generic advice on it. Specific questions as you dig in would be best.

I'd probably, at least, just start with a simple schema that models most of your data and get Solr up and ingesting it. Then run some queries against it in your browser (no need for writing client side code yet), then go from there.

-Grant

On Jan 12, 2010, at 11:42 PM, Abin Mathew wrote:

Hi

I am new to the solr technology. We have been using lucene for handling searching in our web application www.toostep.com, which is a knowledge sharing platform developed in java using Spring MVC architecture and iBatis as the persistence framework. Now that the application is getting very complex we have decided to implement Solr technology over lucene. Anyone having expertise in this area please give me some guidelines on where to start off and how to form the schema for Solr.

Thanks and Regards
Abin Mathew

--
Grant Ingersoll http://www.lucidimagination.com/
Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
Re: Need help Migrating to Solr
Hi,

since we did some kind of migration in a similar situation in the recent past, I might add some (hopefully helpful) remarks:

If you use a Lucene-based application right now, you might already have an idea of which fields you want to store in Solr. Since you already analyze fields, it should be easy to identify the necessary analyzers and filter chains to be configured in the field-type part of the schema.

Once you have the basic definition of the schema, you can start loading content into Solr. You can inspect the results using the admin web GUI. I've found the ad hoc query interface and the analysis facility very helpful for getting an idea of the inner workings.

Of course that is only the very beginning. You should realize that Solr offers a very powerful mechanism to configure how queries are handled (using query handlers ...).

The book Solr 1.4 Enterprise Search Server is a very good first step to understanding what you can do with Solr (refer to Solr's home page for the complete citation).

Sven

--On Thursday, January 14, 2010 08:38:12 AM -0500 Grant Ingersoll gsing...@apache.org wrote:

I've done a fair number of migrations, but it's kind of hard to give generic advice on it. Specific questions as you dig in would be best.

I'd probably, at least, just start with a simple schema that models most of your data and get Solr up and ingesting it. Then run some queries against it in your browser (no need for writing client side code yet), then go from there.

-Grant

On Jan 12, 2010, at 11:42 PM, Abin Mathew wrote:

Hi

I am new to the solr technology. We have been using lucene for handling searching in our web application www.toostep.com, which is a knowledge sharing platform developed in java using Spring MVC architecture and iBatis as the persistence framework. Now that the application is getting very complex we have decided to implement Solr technology over lucene. Anyone having expertise in this area please give me some guidelines on where to start off and how to form the schema for Solr.

Thanks and Regards
Abin Mathew

--
Grant Ingersoll http://www.lucidimagination.com/
Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search

--
kippdata informationstechnologie GmbH
Sven Maurmann          Tel: 0228 98549 -12
Bornheimer Str. 33a    Fax: 0228 98549 -50
D-53111 Bonn           sven.maurm...@kippdata.de
HRB 8018 Amtsgericht Bonn / USt.-IdNr. DE 196 457 417
Geschäftsführer: Dr. Thomas Höfer, Rainer Jung, Sven Maurmann
Re: Need help with highlighting (detailed problem with code samples)
That's the expected result and I'm pretty happy with it. But with this field:

<field name="variableEltDDIXML" required="true" type="string" multiValued="false" indexed="true" stored="true" compressed="false" omitNorms="false" termVectors="false" termPositions="false" termOffsets="false" />

when I query the solr server this way:

/select/?q=election&version=2.2&start=0&rows=10&indent=on&hl=on&hl.fl=variableEltDDIXML&hl.fragsize=0

I got results like this:

<lst name="highlighting">
  <lst name="0ad4d4fe-cff8-43c8-b5d2-cf86c71b044c"/>
  ...

It is okay to query one field and request highlighting from another field. But to get highlighting, you first need a match in that field. I see that the type of variableEltDDIXML is string, which is not tokenized at all.

/select/?q=variableEltDDIXML:election&version=2.2&start=0&rows=10&indent=on&hl=on&hl.fl=variableEltDDIXML&hl.fragsize=0

Does the query above return documents?
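The request parameters in the queries above are separate key/value pairs joined with '&', and characters such as ':' in a fielded query should be URL-encoded when the request is built programmatically. A small Python sketch of building that test query follows. The base URL (a local Solr on port 8983) and the function name are assumptions for illustration; the parameter names and values are the ones from the message.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url, query, highlight_field):
    """Build a /select URL that asks Solr to highlight `highlight_field`."""
    params = [
        ("q", query),
        ("version", "2.2"),
        ("start", 0),
        ("rows", 10),
        ("indent", "on"),
        ("hl", "on"),
        ("hl.fl", highlight_field),
        ("hl.fragsize", 0),  # 0 asks for the whole field value as one fragment
    ]
    # urlencode joins the pairs with '&' and percent-encodes the values.
    return base_url + "/select/?" + urlencode(params)


# base_url is an assumed local Solr instance, not taken from the thread.
url = highlight_query_url(
    "http://localhost:8983/solr",
    "variableEltDDIXML:election",
    "variableEltDDIXML",
)
print(url)
```

If this fielded query returns no documents, there is nothing for the highlighter to mark up in variableEltDDIXML, which would be consistent with the empty highlighting entry shown above.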
Contributors - Solr in Action Case Studies
Hello, We are working on Solr in Action [1]. One of the well received chapters from LIA #1[2] was the Case Studies chapter, where external contributors described how they used Lucene. We are getting good feedback about this chapter from LIA #2 reviewers, too. Solr in Action also has a Case Studies chapter, and we are starting to look for contributors. If you are using Solr in some clever, interesting, or unusual way and are willing to share this information, please get in touch. 5 to max 10 pages (soft limits) per study is what we are hoping for. Feel free to respond on the list or reply to me directly. [1] http://www.manning.com/catalog/undercontract.html [2] http://www.manning.com/hatcher2/ and http://www.manning.com/hatcher3/ Thanks, Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
Re: Queries of type field:value not functioning
Siddhant, Check the enhanced dismax patch in JIRA if you need fielded queries to work with dismax. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: Siddhant Goel siddhantg...@gmail.com To: solr-user@lucene.apache.org Sent: Wed, January 13, 2010 11:09:12 PM Subject: Re: Queries of type field:value not functioning Hi, Thanks for the responses. q.alt did the job. Turns out that the dismax query parser was at fault, and wasn't able to handle queries of the type *:*. Putting the query in q.alt, or adding a defType=lucene (as pointed out to me on the irc channel) worked. Thanks, -- - Siddhant
Re: Solr 1.4 Field collapsing - What are the steps for applying the SOLR-236 patch?
Do you also suggest applying that latest patch to the Solr 1.4 GA version of the source code (branch-1.4)? I'm a little worried about it breaking Solrj due to the merge issues encountered (see below). Would I need to stop using Solrj as my client, and start using straight HTTP requests to take advantage of the field-collapsing functionality?

I did successfully build and test branch-1.4 with the field-collapse-5.patch (patch from 12-08-2009 9:43 PM) but also see dependency issues in my IDE.

-Kelly

MERGE ISSUES (branch-1.4 and latest SOLR-236.patch)...

src/java/org/apache/solr/handler/component/QueryComponent.java
The file line // we already have the sort values and the patchline ArrayList<String> ids = new ArrayList<String>(shardDocs.size()); do not match!

src/solrj/org/apache/solr/client/solrj/SolrQuery.java
The file line import org.apache.solr.common.params.TermsParams; and the patchline do not match!

src/solrj/org/apache/solr/client/solrj/response/QueryResponse.java
The file line private NamedList _termsInfo = null; and the patchline do not match!

Martijn v Groningen wrote:

I wouldn't use the patches of the sub issues right now as they are under development (they are currently a POC). I also think that the latest patch in SOLR-236 is currently the best option. There are some memory related problems with the patch that have to do with caching. The fieldCollapse cache requires a lot of memory (best is not to use it right now). The filterCache also becomes quite large. Depending on the size of your corpus you would need to increase your heap size and play around with that.

Martijn

2010/1/12 Joe Calderon calderon@gmail.com:

it seems to be in flux right now as the solr developers slowly make improvements and ingest the various pieces into the solr trunk. i think your best bet might be to use the 12/24 patch and fix any errors where it doesn't apply cleanly. im using solr trunk r892336 with the 12/24 patch

--joe

On 01/11/2010 08:48 PM, Kelly Taylor wrote:

Hi,

Is there a step-by-step for applying the patch for SOLR-236 to enable field collapsing in Solr 1.4?

Thanks,
Kelly

--
Met vriendelijke groet,
Martijn van Groningen

--
View this message in context: http://old.nabble.com/Re%3A-Solr-1.4-Field-collapsing---What-are-the-steps-for-applying-the-SOLR-236-patch--tp27122700p27166881.html
Sent from the Solr - User mailing list archive at Nabble.com.
Unexpected boolean query behavior
Here is my query: (virt* AND machine fingerprinting) OR (virt* AND encryption) OR (virt* AND anonymous) OR (virt* AND analytic*) AND owned:true It can be broken down to: (A) OR (B) OR (C) OR (D) AND E A, B, C and D are themselves AND boolean clauses. The E clause at the end is not behaving the way I would expect. No matter how I order the A,B,C and D clauses, it always returns the equivalent of ((D) AND E). When I add additional parentheses it behaves the way I expect. Like: ((A) OR (B) OR (C) OR (D)) AND E or (A) OR (B) OR (C) OR ((D) AND E) Can anyone explain why it behaves the way it does without the parentheses? Is there something I am missing in the way it processes boolean clauses? Thanks, Mark -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Unexpected boolean query behavior
Mark, Does it help if you rewrite your query using +/- syntax (required, prohibited), or nothing for should? Because that's what happens under the hood (terms are required, prohibited, or should occur). Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: markwaddle m...@markwaddle.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 2:39:21 PM Subject: Unexpected boolean query behavior Here is my query: (virt* AND machine fingerprinting) OR (virt* AND encryption) OR (virt* AND anonymous) OR (virt* AND analytic*) AND owned:true It can be broken down to: (A) OR (B) OR (C) OR (D) AND E A, B, C and D are themselves AND boolean clauses. The E clause at the end is not behaving the way I would expect. No matter how I order the A,B,C and D clauses, it always returns the equivalent of ((D) AND E). When I add additional parentheses it behaves the way I expect. Like: ((A) OR (B) OR (C) OR (D)) AND E or (A) OR (B) OR (C) OR ((D) AND E) Can anyone explain why it behaves the way it does without the parentheses? Is there something I am missing in the way it processes boolean clauses? Thanks, Mark -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html Sent from the Solr - User mailing list archive at Nabble.com.
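To make the "required / should occur" point concrete: as far as I understand the classic Lucene query parser with OR as the default operator, AND and OR are not combined by precedence. Instead, AND marks the clauses on both sides of it as required, and OR leaves them optional. The following Python sketch is a simplified model of that bookkeeping, not the parser's actual code; the function name is made up, and NOT, explicit +/- modifiers and nesting are ignored.

```python
def flatten_boolean(tokens):
    """Model how AND/OR between top-level clauses become required/optional flags.

    `tokens` alternates clause names and the operators "AND"/"OR".
    Returns the clauses in +/- style: "+X" is required, a bare "X" is optional.
    Simplified: assumes default operator OR and ignores NOT and nesting.
    """
    clauses = []  # each entry is [name, required]
    conjunction = None
    for token in tokens:
        if token in ("AND", "OR"):
            conjunction = token
            continue
        required = False
        if conjunction == "AND":
            # AND marks both of its neighbours as required.
            if clauses:
                clauses[-1][1] = True
            required = True
        clauses.append([token, required])
        conjunction = None
    return " ".join(("+" if req else "") + name for name, req in clauses)


# The query from this thread: (A) OR (B) OR (C) OR (D) AND E
print(flatten_boolean(["A", "OR", "B", "OR", "C", "OR", "D", "AND", "E"]))
```

Under this model the query becomes A B C +D +E. Once a boolean query has required clauses, the optional ones only influence ranking, so the set of matching documents is the same as for (D) AND E, which is the behavior described in the original question. Explicit parentheses, or writing the query directly in +/- form, avoid the surprise.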
Re: Unexpected boolean query behavior
That is a reasonable question. The problem here is that my users have already created numerous queries just like this one, using ANDs and ORs. My users are very technical and they have been using the results of these queries for months now to perform analysis that drives business decisions. I need an explanation for why this is happening so I can not only train them on how to use it more effectively, but also to restore their trust in the search application. Does anyone understand this behavior? Or can you recommend a place for me to look? Otis Gospodnetic wrote: Mark, Does it help if you rewrite your query using +/- syntax (required, prohibited), or nothing for should? Because that's what happens under the hood (terms are required, prohibited, or should occur). Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: markwaddle m...@markwaddle.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 2:39:21 PM Subject: Unexpected boolean query behavior Here is my query: (virt* AND machine fingerprinting) OR (virt* AND encryption) OR (virt* AND anonymous) OR (virt* AND analytic*) AND owned:true It can be broken down to: (A) OR (B) OR (C) OR (D) AND E A, B, C and D are themselves AND boolean clauses. The E clause at the end is not behaving the way I would expect. No matter how I order the A,B,C and D clauses, it always returns the equivalent of ((D) AND E). When I add additional parentheses it behaves the way I expect. Like: ((A) OR (B) OR (C) OR (D)) AND E or (A) OR (B) OR (C) OR ((D) AND E) Can anyone explain why it behaves the way it does without the parentheses? Is there something I am missing in the way it processes boolean clauses? Thanks, Mark -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html Sent from the Solr - User mailing list archive at Nabble.com. 
Re: Unexpected boolean query behavior
HI Mark, Does this help? http://wiki.apache.org/lucene-java/BooleanQuerySyntax Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: markwaddle m...@markwaddle.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 3:38:34 PM Subject: Re: Unexpected boolean query behavior That is a reasonable question. The problem here is that my users have already created numerous queries just like this one, using ANDs and ORs. My users are very technical and they have been using the results of these queries for months now to perform analysis that drives business decisions. I need an explanation for why this is happening so I can not only train them on how to use it more effectively, but also to restore their trust in the search application. Does anyone understand this behavior? Or can you recommend a place for me to look? Otis Gospodnetic wrote: Mark, Does it help if you rewrite your query using +/- syntax (required, prohibited), or nothing for should? Because that's what happens under the hood (terms are required, prohibited, or should occur). Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: markwaddle To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 2:39:21 PM Subject: Unexpected boolean query behavior Here is my query: (virt* AND machine fingerprinting) OR (virt* AND encryption) OR (virt* AND anonymous) OR (virt* AND analytic*) AND owned:true It can be broken down to: (A) OR (B) OR (C) OR (D) AND E A, B, C and D are themselves AND boolean clauses. The E clause at the end is not behaving the way I would expect. No matter how I order the A,B,C and D clauses, it always returns the equivalent of ((D) AND E). When I add additional parentheses it behaves the way I expect. Like: ((A) OR (B) OR (C) OR (D)) AND E or (A) OR (B) OR (C) OR ((D) AND E) Can anyone explain why it behaves the way it does without the parentheses? 
Is there something I am missing in the way it processes boolean clauses? Thanks, Mark -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27166967.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27167750.html Sent from the Solr - User mailing list archive at Nabble.com.
Dynamically change config file name in DataImportHandler
I have 2 data import files, and I'd like to be able to switch between without renaming either file, and without changing solrconfig.xml. Does the DataImportHandler support that? I tried passing a 'config' parameter with the 'reload-config' command, but that didn't work. Thanks, Wojtek -- View this message in context: http://old.nabble.com/Dynamically-change-config-file-name-in-DataImportHandler-tp27168748p27168748.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Dynamically change config file name in DataImportHandler
I thought of another way: have two data import request handlers configured in solrconfig.xml, one for each file. wojtekpia wrote: I have 2 data import files, and I'd like to be able to switch between without renaming either file, and without changing solrconfig.xml. Does the DataImportHandler support that? I tried passing a 'config' parameter with the 'reload-config' command, but that didn't work. Thanks, Wojtek -- View this message in context: http://old.nabble.com/Dynamically-change-config-file-name-in-DataImportHandler-tp27168748p27168868.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Drupal module problem
Hello, System : Debian 5.0 Java , tomcat solr installed from the repositories. Java version 1.6_12 , tomcat 5.5 and solr 1.2.0 . I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Drupal module problem
Hi, Solr 1.2.0 didn't have TrieIntField. Use the latest Solr - Solr 1.4.0 Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove thereall...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:43:23 PM Subject: Solr Drupal module problem Hello, System : Debian 5.0 Java , tomcat solr installed from the repositories. Java version 1.6_12 , tomcat 5.5 and solr 1.2.0 . I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Need deployment strategy
On Wed, Jan 13, 2010 at 05:38:33PM -0500, Paul Rosen wrote:

Hi all,

The way the indexing works on our system is as follows: We have a separate staging server with a copy of our web app. The clients will index a number of documents in a batch on the staging server (this happens about once a week), then they play with the results on the staging server for a day until satisfied. Only then do they give the ok to deploy.

What I've been doing is, when they want to deploy, I do the following:

1) merge and optimize the index on the staging server,
2) copy it to the production server,
3) stop solr on production,
4) copy the new index on top of the old one,
5) start solr on production.

This works, but has the following disadvantages:

1) The index is getting bigger, so it takes longer to zip it and transfer it.

If you are doing the optimize every time before submitting to production, you will need to transfer the entire index each time anyway. To transfer only part of it you would need to NOT optimize and then use one of the replication strategies (rsync or Java) to replicate only the deltas.

2) The user has only added a few records, yet we copy over all of them. If a bug happens that causes an unrelated document to get deleted or replaced on staging, we wouldn't notice, and we'd propagate the problem to the server. I'd sleep better if I were only moving the records that were new or changed and leaving the records that already work in place.

3) solr is down on production for about 5 minutes, so users during that time are getting errors.

I was looking for some kind of replication strategy where I can run a task on the production server to tell it to merge a core from the staging server. Is that possible? I can open up port 8983 on the staging server only to the production server, but then what do I do on production to get the core?

Have you considered using the MultiCore approach and some of the commands from CoreAdmin [1] and SolrReplication [2]?

Start out with multicore enabled on the production server, and have the production core running with the name 'prod' or something like that. On the staging server, maybe have it in multicore or not. Then your deployment procedure would be:

1) On the production server use the CREATE admin command to create a new core 'deploy_MMDD' with configuration from the 'prod' core. The configuration of this core should have replication enabled but with no poll interval, so replication only happens on demand.
2) Trigger a replication from the 'staging' server to the 'deploy_MMDD' core using the replication handler.
3) Use the ALIAS core command to add the name 'staging' to the 'deploy_MMDD' core.
4) Use the SWAP core command to swap the 'staging' and 'prod' cores and make sure it all works. If it doesn't work, use SWAP to swap them back.

In the end, you have physical cores with the names 'deploy_MMDD', or something else appropriate for your environment, and those would be the instanceDirs and such on disk. Then you have logical core aliases of 'staging' and 'production' etc. Sort of like symlinks on the file system.

I have not done a deployment like this yet, just thought about it a few times. And I have not tested this out to see what, if any, complications there are.

enjoy,

-jeremy

[1] - http://wiki.apache.org/solr/CoreAdmin
[2] - http://wiki.apache.org/solr/SolrReplication

--
Jeremy Hinegardner jer...@hinegardner.org
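The four deployment steps above can be sketched as a list of HTTP requests. This Python sketch only builds the URLs and sends nothing. It assumes the CoreAdmin handler is mounted at /admin/cores and the replication handler at /replication on each core (both depend on your solr.xml and solrconfig.xml), that CREATE is given an instanceDir named after the new core, and that the host names are placeholders. Like the procedure itself, it is untested against a real installation, so check the action and parameter names against the CoreAdmin and SolrReplication wiki pages for your Solr version.

```python
from urllib.parse import urlencode


def core_admin_url(solr_url, action, **params):
    """Build a CoreAdmin request URL (assumes adminPath is /admin/cores)."""
    query = [("action", action)] + sorted(params.items())
    return solr_url + "/admin/cores?" + urlencode(query)


def deployment_urls(prod_url, staging_core_url, deploy_core):
    """Return the requests for the four deployment steps, in order."""
    fetch = urlencode([
        ("command", "fetchindex"),
        # Where to pull the index from: the staging replication handler.
        ("masterUrl", staging_core_url + "/replication"),
    ])
    return [
        # 1) create the new physical core on production
        core_admin_url(prod_url, "CREATE", name=deploy_core, instanceDir=deploy_core),
        # 2) pull the index from staging into the new core, on demand
        prod_url + "/" + deploy_core + "/replication?" + fetch,
        # 3) give the new core the logical name 'staging'
        core_admin_url(prod_url, "ALIAS", core=deploy_core, other="staging"),
        # 4) swap 'staging' and 'prod'; issuing the same request again swaps back
        core_admin_url(prod_url, "SWAP", core="staging", other="prod"),
    ]


# Placeholder hosts and core name, for illustration only.
urls = deployment_urls(
    "http://prod.example.com:8983/solr",
    "http://staging.example.com:8983/solr",
    "deploy_0114",
)
for url in urls:
    print(url)
```

Keeping the steps as plain URLs makes it easy to run them by hand in a browser first and only script them once the procedure has proven itself.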
Re: Solr Drupal module problem
Hello, Thanks for the answer. Unfortunately, in the Debian repositories, even in testing, latest Solr version is 1.3.0 . Can I use that for the Drupal module to work ? Thank you. Otis Gospodnetic wrote: Hi, Solr 1.2.0 didn't have TrieIntField. Use the latest Solr - Solr 1.4.0 Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove thereall...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:43:23 PM Subject: Solr Drupal module problem Hello, System : Debian 5.0 Java , tomcat solr installed from the repositories. Java version 1.6_12 , tomcat 5.5 and solr 1.2.0 . I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169511.html Sent from the Solr - User mailing list archive at Nabble.com.
EmbeddedSolrServer and BinaryRequestWriter
I'm trying to reduce memory usage when indexing, and I see that using the binary format may be a good way to do this. Unfortunately I can't see a way to do this using the EmbeddedSolrServer since only the CommonsHttpSolrServer has a setRequestWriter method. If I'm running out of memory constructing XML request documents, does that mean I just have to switch away from the EmbeddedSolrServer? I understand I can stream requests if I'm just indexing files already on disk, but I'm constructing them on the fly, and I run out of memory constructing the XML document to submit to solr, not in actual indexing, so it seems writing the document to disk would run into the same problems. thanks, Phil
Re: Solr Drupal module problem
You may want to ask on Drupal's mailing lists. I hear about Drupal and Solr constantly, I can't imagine them not having Solr 1.4 support, esp. if you say their configs contain references to things that are in Solr 1.4.0. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove thereall...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:54:55 PM Subject: Re: Solr Drupal module problem Hello, Thanks for the answer. Unfortunately, in the Debian repositories, even in testing, latest Solr version is 1.3.0 . Can I use that for the Drupal module to work ? I highly prefer to use the Debian repositories instead of the source code. Thank you. Otis Gospodnetic wrote: Hi, Solr 1.2.0 didn't have TrieIntField. Use the latest Solr - Solr 1.4.0 Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:43:23 PM Subject: Solr Drupal module problem Hello, System : Debian 5.0 Java , tomcat solr installed from the repositories. Java version 1.6_12 , tomcat 5.5 and solr 1.2.0 . I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169511.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: EmbeddedSolrServer and BinaryRequestWriter
Hi, Running out of memory because of XML document when indexing documents sounds very weird/suspicious. Are you running out of memory on the server side? Are you indexing super large batches? How big is your JVM heap? How big is your ramBufferSizeMB? Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: Phil Hagelberg p...@hagelb.org To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:57:31 PM Subject: EmbeddedSolrServer and BinaryRequestWriter I'm trying to reduce memory usage when indexing, and I see that using the binary format may be a good way to do this. Unfortunately I can't see a way to do this using the EmbeddedSolrServer since only the CommonsHttpSolrServer has a setRequestWriter method. If I'm running out of memory constructing XML request documents, does that mean I just have to switch away from the EmbeddedSolrServer? I understand I can stream requests if I'm just indexing files already on disk, but I'm constructing them on the fly, and I run out of memory constructing the XML document to submit to solr, not in actual indexing, so it seems writing the document to disk would run into the same problems. thanks, Phil
Re: Solr Drupal module problem
Hi, The Drupal Solr Module will work with both Solr 1.3 and 1.4. I currently have client installations using both these versions with Drupal (version 5 and 6). Regards, Dave On 14 Jan 2010, at 23:08, Otis Gospodnetic wrote: You may want to ask on Drupal's mailing lists. I hear about Drupal and Solr constantly, I can't imagine them not having Solr 1.4 support, esp. if you say their configs contain references to things that are in Solr 1.4.0. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove thereall...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:54:55 PM Subject: Re: Solr Drupal module problem Hello, Thanks for the answer. Unfortunately, in the Debian repositories, even in testing, latest Solr version is 1.3.0 . Can I use that for the Drupal module to work ? I highly prefer to use the Debian repositories instead of the source code. Thank you. Otis Gospodnetic wrote: Hi, Solr 1.2.0 didn't have TrieIntField. Use the latest Solr - Solr 1.4.0 Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:43:23 PM Subject: Solr Drupal module problem Hello, System : Debian 5.0 Java , tomcat solr installed from the repositories. Java version 1.6_12 , tomcat 5.5 and solr 1.2.0 . I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169511.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Unexpected boolean query behavior
That explains my exact problem, thank you! May I ask how you found that wiki posting? Otis Gospodnetic wrote: Hi Mark, Does this help? http://wiki.apache.org/lucene-java/BooleanQuerySyntax Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27170172.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Drupal module problem
The situation is apparently that the latest Drupal requires Solr 1.4 but Debian does not have a Solr 1.4 package. Can you use an earlier Drupal Solr package? If you want to backport the current Drupal Solr package to Solr 1.3, changing the Trie fields to 'sint' etc. might work. You don't have to use the source code itself to use Solr 1.4. There is an official release with binaries. On Thu, Jan 14, 2010 at 3:21 PM, David Stuart david.stu...@progressivealliance.co.uk wrote: Hi, The Drupal Solr Module will work with both Solr 1.3 and 1.4. I currently have client installations using both these versions with Drupal (versions 5 and 6). Regards, Dave On 14 Jan 2010, at 23:08, Otis Gospodnetic wrote: You may want to ask on Drupal's mailing lists. I hear about Drupal and Solr constantly, I can't imagine them not having Solr 1.4 support, esp. if you say their configs contain references to things that are in Solr 1.4.0. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove thereall...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:54:55 PM Subject: Re: Solr Drupal module problem Hello, Thanks for the answer. Unfortunately, in the Debian repositories, even in testing, the latest Solr version is 1.3.0. Can I use that for the Drupal module to work? I highly prefer to use the Debian repositories instead of the source code. Thank you. Otis Gospodnetic wrote: Hi, Solr 1.2.0 didn't have TrieIntField. Use the latest Solr - Solr 1.4.0 Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: reallove To: solr-user@lucene.apache.org Sent: Thu, January 14, 2010 5:43:23 PM Subject: Solr Drupal module problem Hello, System: Debian 5.0; Java, Tomcat and Solr installed from the repositories. Java version 1.6_12, Tomcat 5.5 and Solr 1.2.0. I am trying to use the schema.xml and the solrconfig.xml from the Drupal module, but they fail to work. 
The error I am getting is : Error loading class 'solr.TrieIntField' . How can I fix this ? Thank you ! -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169365.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/Solr-Drupal-module-problem-tp27169365p27169511.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
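For the backport idea mentioned above (changing the Trie fields to 'sint' etc.), a rough sketch of rewriting a schema.xml might look like the following. The class mapping is an assumption: it pairs each Trie class with the Sortable/Date class that the Solr 1.3 example schema uses for types such as 'sint', and it has not been tested against the Drupal module's schema. It also strips the `precisionStep` attribute, on the assumption that the older field types would reject it. The function name `backport_schema` is made up for illustration.

```python
import re

# ASSUMED mapping from Solr 1.4 Trie classes to Solr 1.3 equivalents;
# verify against the field types your Solr 1.3 install actually ships.
TRIE_TO_13 = {
    "solr.TrieIntField": "solr.SortableIntField",
    "solr.TrieLongField": "solr.SortableLongField",
    "solr.TrieFloatField": "solr.SortableFloatField",
    "solr.TrieDoubleField": "solr.SortableDoubleField",
    "solr.TrieDateField": "solr.DateField",
}


def backport_schema(schema_xml):
    """Return schema text with Trie field classes swapped for 1.3 classes."""
    for trie, old in TRIE_TO_13.items():
        schema_xml = schema_xml.replace('class="%s"' % trie,
                                        'class="%s"' % old)
    # precisionStep only makes sense for Trie fields, so drop it.
    return re.sub(r'\s+precisionStep="[^"]*"', "", schema_xml)
```

Any existing index would need to be rebuilt after such a change, and the module's solrconfig.xml may contain other 1.4-only settings that this does not address.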
Re: Unexpected boolean query behavior
Try this: http://www.lucidimagination.com/search/?q=boolean+query On Thu, Jan 14, 2010 at 3:45 PM, markwaddle m...@markwaddle.com wrote: That explains my exact problem, thank you! May I ask how you found that wiki posting? Otis Gospodnetic wrote: Hi Mark, Does this help? http://wiki.apache.org/lucene-java/BooleanQuerySyntax Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch -- View this message in context: http://old.nabble.com/Unexpected-boolean-query-behavior-tp27166967p27170172.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
[1.3] help with update timeout issue?
Hi, folks, I am using Solr 1.3 pretty successfully, but am running into an issue that hits once in a long while. I'm still using 1.3 since I have some custom code I will have to port forward to 1.4. My basic setup is that I have data sources continually pushing data into Solr, around 20K adds per day. The index is currently around 100G, stored on local disk on a fast linux server. I'm trying to make new docs searchable as quickly as possible, so I currently have autocommit set to 15s. I originally had 3s but that seems to be a little too unstable. I never optimize the index since optimize will lock things up solid for 2 hours, dropping docs until the optimize completes. I'm using the default segment merging settings. Every once in a while I'm getting a socket timeout when trying to add a document. I traced it to a 20s timeout and then found the corresponding point in the Solr log.

Jan 13, 2010 2:59:15 PM org.apache.solr.core.SolrCore execute
INFO: [tales] webapp=/solr path=/update params={} status=0 QTime=2
Jan 13, 2010 2:59:15 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
Jan 13, 2010 2:59:56 PM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening searc...@26e926e9 main
Jan 13, 2010 2:59:56 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush

Solr locked up for 41 seconds here while doing some of the commit work. So, I have a few questions. Is this related to GC? Does Solr always lock up when merging segments and I just have to live with losing the doc I want to add? Is there a timeout that would guarantee me a write success? Should I just retry in this situation? If so, how do I distinguish between this and Solr just being down? I already have had issues in the past with too many files open, so increasing the merge factor isn't an option. 
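On the retry question above, one possible client-side approach is to treat a read timeout (the connection was accepted but no reply arrived in time, as during the 41-second commit) differently from a refused connection (nothing is listening, so Solr is probably down). This is a sketch only, in Python rather than the poster's own client code; `send` is a hypothetical callable that performs a single add request, and the three outcome labels are made up for illustration.

```python
import socket
import time


def add_with_retry(send, retries=3, delay=1.0, sleep=time.sleep):
    """Call `send()` until it succeeds or the attempts run out.

    Returns ("ok", attempts), ("busy", attempts) or ("down", attempts).
    `send` is a hypothetical callable doing one add request.
    """
    for attempt in range(1, retries + 1):
        try:
            send()
            return ("ok", attempt)
        except socket.timeout:
            # Connected but no reply in time: the server is likely busy,
            # for example in the middle of a commit. Back off and retry.
            if attempt == retries:
                return ("busy", attempt)
            sleep(delay * attempt)
        except ConnectionRefusedError:
            # Nothing accepted the connection: treat the server as down.
            return ("down", attempt)
```

A hung server can also show up as a timeout, so "busy" is a guess rather than a guarantee. Retrying an add should be harmless as long as documents carry a uniqueKey, since a repeated add then replaces the earlier copy instead of duplicating it.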
On a related note, I had previously asked about optimizing and was told that segment merging would take care of cleaning up deleted docs. However, I have the following stats for my index: numDocs : 2791091 maxDoc : 4811416 My understanding is that numDocs is the docs being searched and maxDoc is the number of docs including ones that will disappear after optimization. How do I get this cleanup without using optimize, since it locks up Solr for multiple hours? I'm deleting old docs daily as well. Thanks for all the help, Jerry
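To put the numDocs/maxDoc figures above in perspective, the difference between the two is the number of deleted documents still occupying space in the segments. A small worked calculation (the function name `deleted_stats` is made up for illustration):

```python
def deleted_stats(num_docs, max_doc):
    """Return (deleted_count, deleted_fraction) for an index."""
    deleted = max_doc - num_docs
    return deleted, deleted / max_doc


# Figures quoted in the message above.
deleted, fraction = deleted_stats(2791091, 4811416)
print(deleted, round(fraction * 100, 1))  # 2020325 42.0
```

So roughly 42% of the document slots in the index belong to deleted documents that merging has not yet reclaimed.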