Re: includes in solrconfig.xml
Thanks Erik, I didn't know about that. I'll give it a shot!

-Jacob

Erik Hatcher wrote:

Well, let's not forget about XML's entity reference includes. It's not the prettiest thing, but you can do the sort of thing mentioned here: http://www.xml.com/pub/a/2001/03/14/trxml10.html

Erik

On Aug 9, 2008, at 11:16 PM, Otis Gospodnetic wrote:

No, not possible.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Jacob Singh [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, August 8, 2008 11:04:02 PM
Subject: includes in solrconfig.xml

Hello,

Is it possible to include an external XML file from within solrconfig.xml? Or even better, to scan a directory a la conf.d in Apache?

Thanks,
Jacob
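For reference, the entity-reference trick Erik points at looks roughly like this (the included file name `requestHandlers.xml` is just an illustration; the path is resolved relative to the file declaring the entity):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE config [
  <!-- declare an external entity pointing at the file to include -->
  <!ENTITY handlers SYSTEM "requestHandlers.xml">
]>
<config>
  <!-- the entity reference is replaced inline by the file's contents -->
  &handlers;
</config>
```

The included file must itself be a well-formed XML fragment for the combined document to parse.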
Re: Still no results after removing from stopwords
On Sun, 10 Aug 2008 19:58:24 -0700 (PDT) SoupErman [EMAIL PROTECTED] wrote:

I needed to run a search with a query containing the word "not", so I removed "not" from the stopwords.txt file. That seemed to work, at least as far as parsing the query: it was now successfully searching for that keyword, as noted in the query debugger. However, it isn't returning any results where "not" is in the query, which suggests "not" hasn't been indexed. Yet looking at the listing for a particular item, "not" is listed as one of the keywords, so shouldn't it be finding it?

Hi Michael,
did you reindex your documents after 1) changing your settings and 2) restarting SOLR (to allow your settings to come into effect)?

B
_
{Beto|Norberto|Numard} Meijome

"Real Programmers don't comment their code. If it was hard to write, it should be hard to understand and even harder to modify."

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Can't Delete Record
Hi:

I am trying to delete from the index both by id and by query. But when I search for the record again it still shows me the XML of the deleted record. i also sends coomit tag as well. Why is the record not deleting??? Please help me, urgent.

Regards,
Ali Vajahat
Lahore, Pakistan

--
View this message in context: http://www.nabble.com/Can%27t-Delete-Record-tp18926195p18926195.html
Sent from the Solr - User mailing list archive at Nabble.com.
External Application (JIVE) : integration
Hi,

I am a beginner with Solr. Question: we have two types of search on the site.

a) The first is product search. Since all the data is within the database (our controlled environment), this type of search can be implemented using Solr (based on a Lucene index).

b) The second is external application / third-party search: Jive search (community search). For this type of search, web services are exposed; we can pass the search query to the external application and retrieve results from the third party.

Now the results from both searches need to be combined and shown to the user. For example, if the search query contains "Canon", then site search will give site results (say 90 results) and Jive search gives 30 items. On screen we have to show the combined results, sorted by relevance: the first result record may come from site search, the second from Jive, the third from Jive, and the next from site search. It depends entirely on relevance to the search query string.

Is it possible to dynamically add third-party results to the search results and call a function to rearrange them?

~Vikrant

--
View this message in context: http://www.nabble.com/External-Application-%28JIVE%29-%3A-integration-tp18926597p18926597.html
Sent from the Solr - User mailing list archive at Nabble.com.
Newbie question about memory allocation between solr and OS
Sorry for the newbie question. When running Solr under Tomcat I notice that the amount of memory Tomcat uses increases over time until it reaches the maximum limit set (with the Xms and Xmx switches) for the JVM.

Is it better to give all available physical memory to the JVM, or to allocate enough that Solr doesn't run out of memory and let the OS use the rest for disk buffers? That is, will Lucene take good advantage of extra memory, or does the extra memory end up being used for data structures that are no longer in use but haven't yet been garbage-collected by the JVM?

Thank you,
--dallan
Re: Newbie question about memory allocation between solr and OS
On Mon, Aug 11, 2008 at 10:52 AM, Dallan Quass [EMAIL PROTECTED] wrote:
> Is it better to give all available physical memory to the JVM, or to
> allocate enough that Solr doesn't run out of memory and let the OS use
> the rest for disk buffers?

The latter... let the OS have as much as you can for disk buffers.

-Yonik
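Concretely, "the latter" usually means giving Tomcat a bounded heap rather than all of RAM, e.g. via a hypothetical `bin/setenv.sh` (the sizes below are purely illustrative, not a recommendation; tune them against your index and cache settings):

```shell
# Give the JVM enough heap for Solr's caches and sort fields,
# and leave the remaining physical RAM to the OS page cache,
# which Lucene benefits from when reading index files.
JAVA_OPTS="$JAVA_OPTS -Xms512m -Xmx1024m"
export JAVA_OPTS
```

Setting `-Xms` equal to or near `-Xmx` avoids heap-resize pauses; the observed "memory grows until Xmx" behavior is normal, since the JVM rarely returns heap to the OS.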
Re: Can't Delete Record
Hi Ali,

We can help you more if you can give us the following details:

1. What is the type of the id as defined in schema.xml?
2. What is the query you are using to delete?
3. Does that same query show results if you search through the admin?
4. Are there any exceptions in the logs?

On Mon, Aug 11, 2008 at 7:18 PM, Vj Ali [EMAIL PROTECTED] wrote:
> Hi:
> I am trying to delete from the index both by id and by query. But when I
> search for the record again it still shows me the XML of the deleted
> record. i also sends coomit tag as well. Why is the record not
> deleting??? Please help me, urgent.
>
> Regards,
> Ali Vajahat
> Lahore, Pakistan

--
Regards,
Shalin Shekhar Mangar.
RE: Newbie question about memory allocation between solr and OS
Thanks Yonik!

In case anyone monitoring this list isn't sold already on Solr: my use of Solr is pretty non-standard -- I've written nearly a dozen plugins to customize it for my particular needs. Yet I've been able to do everything I need using plugins and without modifying the core code. It works like a charm.

--dallan

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Monday, August 11, 2008 10:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Newbie question about memory allocation between solr and OS

On Mon, Aug 11, 2008 at 10:52 AM, Dallan Quass [EMAIL PROTECTED] wrote:
> Is it better to give all available physical memory to the JVM, or to
> allocate enough that Solr doesn't run out of memory and let the OS use
> the rest for disk buffers?

The latter... let the OS have as much as you can for disk buffers.

-Yonik
Re: Newbie question about memory allocation between solr and OS
Dallan, perhaps you can share some of your experiences on this thread: http://markmail.org/message/ksdnbkdt72ayomv3

Thanks!

On Mon, Aug 11, 2008 at 9:35 PM, Dallan Quass [EMAIL PROTECTED] wrote:
> Thanks Yonik!
>
> In case anyone monitoring this list isn't sold already on Solr: my use
> of Solr is pretty non-standard -- I've written nearly a dozen plugins to
> customize it for my particular needs. Yet I've been able to do everything
> I need using plugins and without modifying the core code. It works like
> a charm.
>
> --dallan

--
Regards,
Shalin Shekhar Mangar.
Lower Case Filter Factory
Hi,

I am using the basic text field in schema.xml. Here is an excerpt:

<field name="name" type="text" index="true" stored="true" multiValued="false" omitNorms="true"/>

and the fieldType "text" is as follows:

<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>

When I query http://localhost:8983/solr/select?q=p* I get results back, but when I query http://localhost:8983/solr/select?q=P* I get no results. Is there anything wrong I'm doing?

Thanks,
Swarag

--
View this message in context: http://www.nabble.com/Lower-Case-Filter-Factory-tp18930459p18930459.html
Sent from the Solr - User mailing list archive at Nabble.com.
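The usual explanation for this symptom (an assumption about Swarag's setup, since behavior depends on the query parser): wildcard and prefix queries are not run through the analyzer chain, so the raw prefix "P" is compared against index terms that were already lowercased by LowerCaseFilterFactory at index time. A small Ruby sketch of the mismatch, with made-up terms:

```ruby
# Index-time analysis lowercases every token before it is stored...
indexed_terms = %w[Powerful Product pencil].map(&:downcase)
# indexed_terms is now ["powerful", "product", "pencil"]

# ...but a wildcard/prefix query is NOT analyzed, so the prefix
# is matched against the index exactly as the user typed it.
prefix_match = lambda { |prefix| indexed_terms.select { |t| t.start_with?(prefix) } }

prefix_match.call("p")  # matches all three terms, like q=p*
prefix_match.call("P")  # matches nothing, like q=P*
```

The common workaround is to lowercase wildcard query strings in the client before sending them to Solr.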
RE: Newbie question about memory allocation between solr and OS
: solr is pretty non-standard -- I've written nearly a dozen plugins to
: customize it for my particular needs. Yet I've been able to do everything I
: need using plugins and without modifying the core code. It works like a
: charm.

I would *love* to hear more about your use cases for writing plugins...

http://www.nabble.com/Seeking-Anecdotes%3A-Solr-Plugins-to18601039.html#a18601039

-Hoss
Re: Snappuller ssh opening and closing multiple times
: After checking message logs (/var/log/messages) we see that the
: snappuller script is opening and closing ssh sessions multiple times
: within a very short timeframe; in one case an ssh session is being
: opened and closed 21 times within .3 seconds. This means that an ssh
: login is being executed more than 3000 times daily. Without having
: looked at the scripts in depth, can somebody confirm whether this
: behavior is normal or not. Is this caused by the rsync process?

You don't need much depth to see that snappuller only explicitly execs ssh 3 times per invocation. The rest (by process of elimination) must be the normal behavior of the (single) call to rsync.

-Hoss
Best strategy for dates in solr-ruby
Hi,

I originally used the Ruby Date class for my dates, but found that when I set the type to solr.DateField in the schema, it returned a parse error. After that, I switched to Time and it worked fine. However, I now have some dates that are outside the Time range (e.g. 1865), so Date would work better here than Time. What is the best strategy:

1. Use Dates and treat the field as a plain string type;
2. Customize the Date class to output a valid solr.DateField string; or
3. Treat it as a string in Ruby and handle conversion to/from Date in my model?

--
Regards,
Ian Connor
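For options 2 and 3, producing the string Solr's DateField expects (ISO 8601 in UTC with a trailing "Z") directly from a Ruby Date sidesteps the Time range problem, since Date has no epoch limits. A sketch; the helper name is made up:

```ruby
require 'date'

# Format a Ruby Date as the canonical Solr DateField representation,
# e.g. 1865-04-09T00:00:00Z. Date#strftime fills the time fields
# with 00:00:00, which suits date-only values.
def to_solr_date(date)
  date.strftime('%Y-%m-%dT%H:%M:%SZ')
end

to_solr_date(Date.new(1865, 4, 9))  # => "1865-04-09T00:00:00Z"
```

This works for any year, unlike Time on platforms where it is backed by a 32-bit time_t (roughly 1901-2038).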
Re: External Application (JIVE) : integration
On Aug 11, 2008, at 10:11 AM, Vicky_Dev wrote:

> Hi,
> I am a beginner with Solr. Question: we have two types of search on the
> site.
> a) The first is site search. Since all the data is within the database
> (our controlled environment), this type of search can be implemented
> using Solr (based on a Lucene index).
> b) The second is external application / third-party search: Jive search
> (community search). For this type of search, web services are exposed;
> we can pass the search query to the external application and retrieve
> results from the third party.
> Now the results from both searches need to be combined and shown to the
> user. For example, if the search query contains "Canon", then site
> search will give site results (say 90 results) and Jive search gives 30
> items. On screen we have to show the combined results, sorted by
> relevance. It depends entirely on relevance to the search query string.
> Is it possible to dynamically add third-party results to the search
> results and call a function to rearrange them?

Is it possible? Yes: you can just add in a SearchComponent that adds/sorts the results from Jive into the Solr results. Is it meaningful? Doubtful. What does a relevance score in Jive mean in relation to a relevance score for site search? I would guess nothing.

-Grant
concurrent optimize and update
Hi all,

What happens internally in Solr when an optimize/commit request is submitted by one process, and some other process starts submitting XML documents to add? Is this generally a safe thing to do?

Basically I'm continually adding documents to Solr, and decided that <autoCommit/> would be a good thing for me to use, so I'm committing every 25000 docs or every 15 minutes. Now I want to do an optimize every 24 hours or so, and I was going to cron that up -- but do I also need to stop the indexing processes from submitting XML docs to the update handler while the optimize is taking place?

enjoy,
-jeremy

--
Jeremy Hinegardner [EMAIL PROTECTED]
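For readers following along, the autocommit policy Jeremy describes (every 25000 docs or every 15 minutes) corresponds to something like this in solrconfig.xml; the values are transcribed from the mail, not a recommendation:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- commit automatically after 25000 pending docs or 15 minutes,
       whichever threshold is reached first -->
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>900000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```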
Re: concurrent optimize and update
On Mon, Aug 11, 2008 at 6:16 PM, Jeremy Hinegardner [EMAIL PROTECTED] wrote:
> What happens internally in Solr when an optimize/commit request is
> submitted by one process, and some other process starts submitting XML
> documents to add? Is this generally a safe thing to do?

It's safe... the adds will block until the commit or optimize has finished.

-Yonik
Best way to index without diacritics
I have UTF-8 content that I want to index; however, I want searches without diacritics to return results. For example, a query for "nino en mexico" should return a document containing the phrase "Niño en México". Ideally, exact diacritic matches should score higher: searching for "niño" exactly should make a document with "niño" score higher than a document with "nino".

Any pointers on how to do this? I found solr.ISOLatin1AccentFilterFactory, but it seems to only strip diacritics from ISO-Latin-1 characters. What about other UTF-8 diacritics?

--
Ing. Alejandro Garza González
Director, Tecnología e Innovación, Biblioteca
Tecnológico de Monterrey, Campus Monterrey
Tel.: 52(81) 8358-1400 ext. 4037  Fax: 52(81) 8328-4067
Enlace Intercampus: 80 689 4037
http://biblioteca.mty.itesm.mx
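One common approach (a sketch, not from the thread; the field and type names are made up): strip accents at both index and query time with ISOLatin1AccentFilterFactory, and keep a second, unstripped copy of the field so exact diacritic matches can be boosted by querying both fields. Note that ñ and é are ISO-Latin-1 characters, so the filter does cover the "Niño en México" example; diacritics outside Latin-1 would need a custom TokenFilter:

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- folds niño -> nino at index AND query time -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldType>

<!-- hypothetical fields: query both, boosting the exact (accented) one -->
<field name="body" type="text" indexed="true" stored="true"/>
<field name="body_folded" type="text_folded" indexed="true" stored="false"/>
<copyField source="body" dest="body_folded"/>
```

With dismax-style queries over `body^2 body_folded`, a document matching the accented form scores higher than one matching only the folded form.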
Highlighting Output
Martin,

I've been over some of the same thoughts you present here in the last few years. The path of least resistance ended up being to deal with the highlighting portion of OCRed images outside of Solr. That's not to say it couldn't or shouldn't be done differently. I briefly even pursued a similar course of action, evident in https://issues.apache.org/jira/browse/SOLR-386. This would make it easier if you wanted to write your own highlighter.

I'm interested to see what others think of your suggestions. I've forwarded this to the solr-user list.

Tricia

-------- Original Message --------
Subject: Highlighting Output
Date: Mon, 11 Aug 2008 17:21:55 -0400
From: Martin Owens [EMAIL PROTECTED]
To: Tricia Williams [EMAIL PROTECTED], [EMAIL PROTECTED]

Hello Solr Users,

I've been thinking about the highlighting functionality in Solr. I recently had the good fortune to be helped by Tricia Williams with payload issues relating to highlighting.

What I see is that the highlighting functionality is heavily tied to the fragment (highlight context) functionality. This makes it tricky to write a plain highlight method that just returns metadata, so some other process can do the actual highlighting in some custom fashion.

So is it worthwhile to make sure that Solr is able to do multiple different kinds of highlighting, even if it means passing metadata back in the request? Should we have standard ways to index and read back payload information if we're dealing with pages, books, coordinates (for highlighting images) and other metadata used for highlights (char offset, term offset, et cetera)? I also noticed that much of the highlighting code to do with fragments gets duplicated in custom code.

Other thoughts? Does this make things more complex for normal highlighting?

Best Regards,
Martin Owens
number of matching documents incorrect during postOptimize
Hi all,

I'm trying to check that an import using the DataImportHandler was clean before I take a snapshot of the index to be pulled via snappuller to the query nodes. One of the checks I do is verify that a certain minimum number of documents are returned for a query. I do this in a script that I'm calling via the postOptimize hook.

However, after a full import, the numFound results from the query are not accurate until after the postOptimize code completes, and so my checks are failing. Glancing at the code, this looks non-trivial to fix, as the hook call is pretty deep in the call stack: org.apache.solr.handler.dataimport.DataImporter.doFullImport eventually calls org.apache.solr.update.UpdateHandler.callPostOptimizeCallbacks.

One option would be to spawn and background a new job to check the status, with an initial sleep to wait for the postOptimize that spawned it to finish. This is pretty ugly and could lead to some race conditions, but will probably work. Any better recommendations on how to achieve this functionality?

Thanks...Tom
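For context, a postOptimize hook of the kind Tom describes is registered in solrconfig.xml roughly like this (the script name is made up; `wait=true` makes the update block until the script exits, which is exactly why slow checks here are awkward):

```xml
<listener event="postOptimize" class="solr.RunExecutableListener">
  <!-- executable run relative to "dir" after each optimize completes -->
  <str name="exe">check-and-snapshoot.sh</str>
  <str name="dir">bin</str>
  <bool name="wait">true</bool>
</listener>
```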
Re: unique key
On Wed, 6 Aug 2008 12:25:34 +1000 Norberto Meijome [EMAIL PROTECTED] wrote:
> On Tue, 5 Aug 2008 14:41:08 -0300 Scott Swan [EMAIL PROTECTED] wrote:
> > I currently have multiple documents that I would like to index, but I
> > would like to combine two fields to produce the unique key. The
> > documents have one field or the other, so by combining the two fields
> > I will get a unique result. Is this possible in the Solr schema?
>
> Hi Scott,
> you can't do that in the schema - you need to do it when you generate
> your document, before posting it to SOLR.

Hi again,
after reading the DataImportHandler documentation, you could also do this with specific configuration in DIH itself. Of course, you have to be using DIH to load data into your SOLR ;)

B
_
{Beto|Norberto|Numard} Meijome

"Intellectual: 'Someone who has been educated beyond his/her intelligence'" -- Arthur C. Clarke, from 3001, The Final Odyssey, Sources.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
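If you are loading via DIH, the kind of configuration Norberto alludes to can be done with a TemplateTransformer, roughly as below (entity, column, and field names are all made up for illustration). Note the template form assumes both columns are present; the "one field or the other" case would need something like a ScriptTransformer to pick whichever is non-empty:

```xml
<entity name="item" transformer="TemplateTransformer"
        query="select field_a, field_b from items">
  <!-- build the unique key by concatenating two source columns -->
  <field column="uid" template="${item.field_a}_${item.field_b}"/>
</entity>
```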
Re: Can't Delete Record
On Mon, 11 Aug 2008 06:48:05 -0700 (PDT) Vj Ali [EMAIL PROTECTED] wrote:
> i also sends coomit tag as well.

Maybe you need <commit/> instead of <coomit/>?

B
_
{Beto|Norberto|Numard} Meijome

"With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead." [RFC1925 - section 2, subsection 3]

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
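For completeness, a well-formed delete-and-commit sequence posted to Solr's update handler looks like this (the id value and query are illustrative; each command goes in its own request):

```xml
<delete><id>123</id></delete>

<!-- or delete by query: -->
<delete><query>name:obsolete</query></delete>

<!-- nothing disappears from search results until a commit: -->
<commit/>
```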
adds / delete within same 'transaction'..
Hello :)

I *think* I know the answer, but I'd like to confirm. Say I have

<doc><id>1</id><name>old</name></doc>

already indexed and committed (i.e., 'live'). What happens if I issue:

<delete><id>1</id></delete>
<add><doc><id>1</id><name>new</name></doc></add>
<commit/>

Will the delete happen first, and then the add? Or could the add happen before the delete, in which case I end up with no more doc id=1?

thanks!!
B
_
{Beto|Norberto|Numard} Meijome

"Anyone who isn't confused here doesn't really understand what's going on."

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.