Date Search with q query parameter
Hi, I am facing an issue with a date field I have in my records. I am using the q query parameter and passing a string such as test as the search criteria. The query is formed as: column1:test | column2:test | column3:test ... I have one date column, whose name is suffixed with _dt, like column4_dt. When the query is created as column1:test | column2:test | column3:test | column4_dt:test it throws an exception saying Invalid date format. Please suggest how I can prevent this. Thanks, Amit Garg -- View this message in context: http://www.nabble.com/Date-Search-with-q-query-parameter-tp22471072p22471072.html Sent from the Solr - User mailing list archive at Nabble.com.
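A client-side sketch of one way to avoid the problem described above: when fanning a free-text term out across columns, skip any *_dt column unless the term actually parses as a Solr date. This is a hypothetical illustration (the query-building helper is not part of Solr; field names are taken from the thread):

```python
from datetime import datetime

def build_q(term, fields):
    """Build a per-field OR query, skipping *_dt fields when the
    term does not parse as a Solr date (hypothetical client-side fix)."""
    clauses = []
    for f in fields:
        if f.endswith("_dt"):
            try:
                datetime.strptime(term, "%Y-%m-%dT%H:%M:%SZ")
            except ValueError:
                continue  # not a date: leave the date field out entirely
        clauses.append(f"{f}:{term}")
    return " | ".join(clauses)

print(build_q("test", ["column1", "column2", "column3", "column4_dt"]))
# column1:test | column2:test | column3:test
```

With a real date value the date field is kept: `build_q("2007-01-01T00:00:00Z", ["column4_dt"])` yields `column4_dt:2007-01-01T00:00:00Z`.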
Highlighting the searched term in resultset
I was wondering if there is any way of highlighting the searched term in the result set directly, instead of having it as a separate lst element. Doing it through an XSL transformation would be one way. Has anybody implemented a better solution? e.g.
<result name="response" numFound="293" start="0">
  <doc>
    <str name="item_type"><HIGHLIGHTED>iPhone</HIGHLIGHTED></str>
    <str name="keywords">iphone sell buy</str>
    <date name="last_modified">2007-11-20T05:36:29Z</date>
    <date name="releasedate">2007-11-17T06:00:00Z</date>
    <str name="type">ARTICLE</str>
  </doc>
</result>
TIA.
Re: Date Search with q query parameter
Is your final query in this format? col1:[2009-01-01T00:00:00Z+TO+2009-01-01T23:59:59Z] From: dabboo ag...@sapient.com To: solr-user@lucene.apache.org Sent: Thursday, March 12, 2009 12:27:48 AM Subject: Date Search with q query parameter [quoted text snipped]
Solr 1.3 and Solr 1.4 difference?
Hi, What is the exact difference between Solr 1.3 and Solr 1.4 (nightly build as of now)? I heard SolrJ is not part of Solr and performance is great in Solr 1.4. Please tell me exactly what is going to differ in Solr 1.4. If possible, please provide a pointer which describes the same. - Regards, Praveen
Re: Problem using DIH templatetransformer to create uniqueKey: solved
Folks, TemplateTransformer will fail to return a row if a variable is undefined; however, the RegexTransformer does still return. So where the following would fail:
<field column="id" template="${jc.fileAbsolutePath}${x.vurl}" />
this can be used instead:
<field column="id" regex="(.*)" replaceWith="$1${x.vurl}" sourceColName="fileAbsolutePath" />
So I guess we have the best of both worlds! Fergus. Hmmm. Just gave that a go! No luck. But how many layers of defaults do we need? Rgds Fergus. What about having the TemplateTransformer support ${field:default} syntax? I'm assuming it doesn't support that currently, right? The replace stuff in the config files does, though. Erik On Feb 13, 2009, at 8:17 AM, Fergus McMenemie wrote: Paul, Following up your usenet suggestion:
<field column="id" template="${jc.fileAbsolutePath}${x.vurl}" ignoreMissingVariables="true"/>
and to add more to what I was thinking... if the field is undefined in the input document, but schema.xml does allow a default value, then TemplateTransformer can use the default value. If there is no default value defined in schema.xml then it can fail as at present. This would allow this, or any other value, to be fed into TemplateTransformer, and still enable avoidance of the partial strings you referred to. Regards Fergus. Hello, TemplateTransformer behaves rather ungracefully if one of the replacement fields is missing. Looking at TemplateString.java I see that, left to itself, fillTokens would replace a missing variable with an empty string. It is an extra check in TemplateTransformer that is throwing the warning and stopping the row being returned. Commenting out the check seems to solve my problem. Having done this, an undefined replacement string in TemplateTransformer is replaced with an empty string. However, a neater fix would probably involve making use of the default value which can be assigned to a row in schema.xml. I am parsing a single XML document into multiple separate Solr documents.
It turns out that none of the source document's fields can be used to create a uniqueKey alone. I need to combine two, using TemplateTransformer as follows:
<entity name="x" dataSource="myfilereader" processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}" rootEntity="true" stream="false"
        forEach="/record | /record/mediaBlock"
        transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer">
  <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
  <field column="fileWebPath" regex="${dataimporter.request.installdir}(.*)" replaceWith="/ford$1" sourceColName="fileAbsolutePath"/>
  <field column="id" template="${jc.fileAbsolutePath}${x.vurl}" />
  <field column="vurl" xpath="/record/mediaBlock/mediaObject/@vurl" />
</entity>
The trouble is that vurl is only defined as a child of /record/mediaBlock, so my attempt to create id, the uniqueKey, fails for the parent document /record. I am hacking around with TemplateTransformer.java to sort this but was wondering if there was a good reason for this behavior. -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
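For reference, the regex/replaceWith behaviour this thread leans on can be sketched in a few lines. This is only a toy model of DIH's RegexTransformer semantics (apply a regex to the source column, substituting $1 with the first capture group); the paths below are invented:

```python
import re

def regex_transform(value, regex, replace_with):
    # Toy model of DIH RegexTransformer: apply the regex once to the
    # source column's value, with $1 standing for capture group 1.
    return re.sub(regex, replace_with.replace("$1", r"\1"), value, count=1)

# Analogous to: regex="${dataimporter.request.installdir}(.*)"
#               replaceWith="/ford$1"  (installdir made up here)
path = "/opt/solr/data/docs/record1.xml"
print(regex_transform(path, r"/opt/solr(.*)", "/ford$1"))
# /ford/data/docs/record1.xml
```

Note `count=1`: without it, Python's `re.sub` with a pattern like `(.*)` would also match the empty string at the end and apply the replacement twice.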
Re: Solr 1.3 and Solr 1.4 difference?
here is the exhaustive list of all changes in 1.4 http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt On Thu, Mar 12, 2009 at 3:29 PM, Praveen Kumar Jayaram praveen198...@gmail.com wrote: [quoted text snipped] -- --Noble Paul
Re: Date Search with q query parameter
On Thu, Mar 12, 2009 at 4:39 PM, dabboo ag...@sapient.com wrote: Hi, I am able to rectify that exception, but now what I am looking for is: how can I pass a value to the date field to search for records with a specific date value? e.g. I want to retrieve all the records of Jan 01, 2007. How will I pass the value with the column name? If I pass the value it throws an exception saying that it is expecting TO. The format for a range search is your_date_field:[minDate TO maxDate] and for a normal term query it is your_date_field:the_date. Each of the dates should be in the format described in the example schema.xml. -- Regards, Shalin Shekhar Mangar.
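The two query shapes above can be generated mechanically. A small sketch (the field name comes from this thread; the helper functions are illustrative, not Solr APIs) that formats a whole day as the kind of range query described:

```python
from datetime import datetime

def solr_date(dt):
    # Solr dates are full ISO-8601 UTC with a trailing 'Z'
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

def day_range_query(field, year, month, day):
    """Range query covering one calendar day, as in
    field:[minDate TO maxDate]."""
    start = datetime(year, month, day, 0, 0, 0)
    end = datetime(year, month, day, 23, 59, 59)
    return f"{field}:[{solr_date(start)} TO {solr_date(end)}]"

print(day_range_query("column4_dt", 2007, 1, 1))
# column4_dt:[2007-01-01T00:00:00Z TO 2007-01-01T23:59:59Z]
```

When this goes into a URL, the colons in the field value also need escaping (or the whole thing URL-encoded), which is a frequent source of the "Invalid Date String" errors seen later in this thread.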
Re: Date Search with q query parameter
Hi, The date range query is working fine for me. This is the query I entered: q=productPublicationDate_product_dt:1993-02-01T12:00:00Z&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest It threw this exception: type Status report message Invalid Date String:'1993-02-01t12' description The request sent by the client was syntactically incorrect (Invalid Date String:'1993-02-01t12'). thanks, Amit Garg Shalin Shekhar Mangar wrote: [quoted text snipped] -- View this message in context: http://www.nabble.com/Date-Search-with-q-query-parameter-tp22471072p22474608.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Date Search with q query parameter
Hi, I am able to rectify that exception, but now what I am looking for is: how can I pass a value to the date field to search for records with a specific date value? e.g. I want to retrieve all the records of Jan 01, 2007. How will I pass the value with the column name? If I pass the value it throws an exception saying that it is expecting TO. Please suggest. thanks, Amit Garg Venu Mittal wrote: [quoted text snipped]
Re: Tomcat holding deleted snapshots until it's restarted
The old IndexSearcher is being closed correctly: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 main hossman wrote: : If the problem is not there, the other thing that comes to my mind is : lucene2.9-dev... maybe there's a problem closing indexWriter?... obviously : it's just a thought. you never answered yonik's question about whether you see any Closing Searcher messages in your log. also it's useful to know what you see in the CORE section when you look at stats.jsp ... typically the main searcher is listed there twice, but during warming you'll see the old searcher as well ... if older searchers aren't getting closed for some reason, they should be listed there. i'd start by confirming/ruling out the old searchers before speculating about the indexwriter or other problems. : On a quiet system, you should see the original searcher closed right : after the new searcher is registered. : : Example: : Mar 11, 2009 2:22:25 PM org.apache.solr.core.SolrCore registerSearcher : INFO: [] Registered new searcher searc...@1f1cbf6 main : Mar 11, 2009 2:22:25 PM org.apache.solr.search.SolrIndexSearcher close : INFO: Closing searc...@acdd02 main -Hoss
Re: Custom path for solr lib and data folder
Hoss, Assume my current working directory is C:/MyApplication/searchApp and in solr.xml I am specifying C:/lib as the shared lib; then the console output contains the following line: INFO: loading shared library: C:\MyApplication\searchApp\C:\lib Thanks con hossman wrote: : But how can i redirect solr to a separate lib directory that is outside of : the solr.home : : Is this possible in solr 1.3 : : I don't believe it is possible (but please correct me if I'm wrong). From : SolrResourceLoader: : : log.info("Solr home set to '" + this.instanceDir + "'"); : this.classLoader = createClassLoader(new File(this.instanceDir + "lib/"), parent); : : So only a lib/ under the Solr home directory is used. It would be nice... that's the lib directory specific to the core (hence it's relative to the instanceDir). In con's original post he was claiming to have problems getting solr.xml's sharedLib option to point to an absolute path ... this should work fine. con: when your solr.xml is loaded, you should see an INFO message starting with loading shared library:... -- what path is listed on that line? your sharedLib=%COMMON_LIB% example won't work (for the reasons Noble mentioned) but your sharedLib=C:\lib should work (assuming that path exists), and then immediately following the log message i mentioned above, you should see INFO messages like... Adding file:///...foo.jar to Solr classloader ...for each jar in that directory. if there are none, or the directory can't be found, you might see Reusing parent classloader or Can't construct solr lib class loader messages instead. what do you see in your logs? -Hoss
How to remove stemming from the analyzer - Finding blah when searching for blah*
Hi, I am trying to disable stemming in the analyzer, but I am not sure how to do it. For instance, I have a field that contains blah, but when I search for blah* it cannot find it, whereas if I search for bla* it does. I was using the text field type from the example schema.xml. How should I modify it so that stemming is not done and I can find blah when I search for blah*?
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal. add enablePositionIncrements=true in both the index and query analyzers to leave a 'gap' for more accurate phrase queries. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
I have tried using the textTight type to no avail.
Most of the fields in my documents have this structure: DOC1 field gene name: brca2 DOC2 field gene name: brca23 If I searched for brca2* I would like to find both documents. My field values normally contain colons ':', which should be treated as stop characters. Thank you in advance, Bruno
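One common reason wildcard searches behave this way (offered as a hedged aside, not a confirmed diagnosis of this particular setup) is that wildcard terms are not analyzed at query time, while indexed tokens are. If index-time analysis (stemming, lowercasing) changes a token, a literal prefix taken from the original word can miss it. A toy simulation with a made-up stemmer:

```python
def toy_stem(token):
    # Made-up stand-in for a Porter-style stemmer (illustration only)
    return token[:-3] if token.endswith("ing") else token

def index_tokens(text):
    # Index-time analysis: lowercase then stem every token
    return [toy_stem(t.lower()) for t in text.split()]

def prefix_match(prefix, tokens):
    # Wildcard/prefix terms are NOT analyzed: matched literally
    # (here we at least lowercase; Solr historically did not even do that)
    return any(t.startswith(prefix.lower()) for t in tokens)

tokens = index_tokens("Searching genes")
print(tokens)                              # ['search', 'genes']
print(prefix_match("searching", tokens))   # False: the suffix was stemmed away
print(prefix_match("search", tokens))      # True
```

The same logic explains why a shorter prefix (bla*) can match while the full word with a wildcard (blah*) does not, whenever the indexed token differs from the surface form.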
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Remove the EnglishPorterFilterFactory from your text analyzer configuration (both index and query sides). And reindex all documents. Erik On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote: [quoted text snipped]
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thanks for your answer, I am trying now with this custom text field:
<fieldType name="textIntact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" expand="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
And still it does not find blah when using the wildcard and searching for blah*. Am I missing something? Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com [quoted text snipped]
How to correctly boost results in Solr Dismax query
Hi, I have managed to build an index in Solr which I can search on keyword, produce facets, query facets etc. This is all working great. I have implemented my search using a dismax query so it searches predetermined fields. However, my results are coming back sorted by score, which appears to be calculated by keyword relevancy only. I would like to adjust the score where fields have pre-determined values. I think I can do this with boost queries and boost functions, but the documentation here: http://wiki.apache.org/solr/DisMaxRequestHandler#head-6862070cf279d9a09bdab971309135c7aea22fb3 is not particularly helpful. I tried adding a bq argument to my search: bq=media:DVD^2 (yes, this is an index of films!) but I find when I start adding more and more: bq=media:DVD^2&bq=media:BLU-RAY^1.5 the negative results - e.g. films that are DVD but are not BLU-RAY - get negatively affected in their score. In the end it all seems to even out and my score is as it was before I started boosting. I must be doing this wrong, and I wonder whether boost functions come in somewhere. Any ideas on how to correctly use boost? Cheers, Pete -- Pete Smith Developer No.9 | 6 Portal Way | London | W3 6RU | T: +44 (0)20 8896 8070 | F: +44 (0)20 8896 8111 LOVEFiLM.com
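One easy mistake with repeated parameters is concatenating them into one string; each bq should be sent as a separate parameter. A small sketch of building such a request query string (the q value and the exact handler setup are invented for illustration; the bq values come from the message above):

```python
from urllib.parse import urlencode

# A list of tuples lets a parameter repeat, which is how multiple
# bq clauses are expressed in a dismax request.
params = [
    ("q", "star wars"),
    ("qt", "dismax"),
    ("bq", "media:DVD^2"),
    ("bq", "media:BLU-RAY^1.5"),
]
query_string = urlencode(params)
print(query_string)
# q=star+wars&qt=dismax&bq=media%3ADVD%5E2&bq=media%3ABLU-RAY%5E1.5
```

Note the ':' and '^' are percent-encoded by urlencode, which also sidesteps shell-quoting issues when testing with curl.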
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
What is the full query you're issuing to Solr and the corresponding request handler configuration? Chances are you're using the dismax query parser, which does not support wildcards. Other things to check: be sure you've tied the field to your new textIntact type, and that you're searching that field (see defaultField in schema.xml). Try something like /solr/select?q=field_name:blah* Erik On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote: [quoted text snipped]
RE: Combination of EmbeddedSolrServer and CommonHttpSolrServer
Hi Shalin Shekhar Mangar, Thanks for your inputs. Please see my comments below. I wish to know if there is any user who has used EmbeddedSolrServer for indexing and CommonsHttpSolrServer for search. I have found that this combination offers better performance for indexing. Searching becomes flexible as you can search from a greater number of HTTP clients simultaneously. Does anyone have any related performance data? Thanks, Ajit -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, March 11, 2009 7:24 PM To: solr-user@lucene.apache.org Subject: Re: Combination of EmbeddedSolrServer and CommonHttpSolrServer On Wed, Mar 11, 2009 at 6:37 PM, Kulkarni, Ajit Kamalakar ajkulka...@ptc.com wrote: If we index the documents using CommonsHttpSolrServer and search using the same, we get the updated results. That means we can search the latest added document even if it is not committed to the file system. That is not possible. Without calling commit, new documents will not be visible to a searcher. Ajit: When I tested using CommonsHttpSolrServer for indexing as well as searching, I could search the latest added document through the Solr admin page. I could also search the document through CommonsHttpSolrServer without explicitly calling commit. I am even more surprised to see the same result by using EmbeddedSolrServer for indexing and CommonsHttpSolrServer for searching. I used embeddedSolrServer = new EmbeddedSolrServer(SolrCore.getSolrCore()); which is a deprecated API.
For this I did not need to call commit on CommonsHttpSolrServer to get the latest document searched, either on the Solr admin page or programmatically through CommonsHttpSolrServer. However, if I use
CoreContainer multicore = new CoreContainer();
File home = new File( getSolrHome() );
File f = new File( home, "solr.xml" );
multicore.load( getSolrHome(), f );
embeddedSolrServer = new EmbeddedSolrServer( multicore, SolrIndexConstants.DEFAULT_CORE );
I had to call commit on CommonsHttpSolrServer to search the latest added documents, and the document was available through the Solr admin page only when I programmatically searched after calling commit on CommonsHttpSolrServer. This is consistent with what you mentioned above. So it looks like there is some kind of cache that is used by both the index and search logic inside Solr for a given SolrServer component (e.g. CommonsHttpSolrServer, EmbeddedSolrServer). Indexing does not create any cache. The caching is done only by the searcher. The old searcher/cache is discarded and a new searcher/cache is created when you call commit. Setting autowarmCount on the caches in solrconfig.xml makes the new searcher run some of the most recently used queries on the old searcher to warm up the new cache. Calling commit on the SolrServer to sync with the index data may not be a good option, as I suppose it to be an expensive operation. It is the only option. But you may be able to make the operation cheaper by tweaking the autowarmCount on the caches (this is specified in solrconfig.xml). However, caches are important for good search performance. Depending on your search traffic, you'll need to find a sweet spot. The cache and hard disk data synchronization should be independent of the SolrServer instances managed by the Solr web application inside Tomcat. SolrServer is not really a server in itself. It is (a pointer to?) a server being used by a solrj client. The CommonsHttpSolrServer refers to a remote server URL and makes calls through HTTP.
SolrCore is the internal class which manages the state of the server. A SolrCore is created by the Solr webapp. When you create another SolrCore for use by EmbeddedSolrServer, they do not know about each other. Therefore you need to notify it if you change the index through another core. Ajit: If the same JVM is managing the responding searchers for EmbeddedSolrServer as well as CommonsHttpSolrServer, then why can't the responding searcher be the same? I understand that the EmbeddedSolrServer and CommonsHttpSolrServer clients are separate, but if the searchers are managed in the same JVM, theoretically we should be able to make a singleton searcher attached to every kind of SolrServer. This searcher should be a listener for the indexer. Since searching is a read operation, there won't be any threading or scalability issue, but there should be a single indexer. Since I don't have enough knowledge about Solr and Lucene, I may be totally wrong! The issue still will be that EmbeddedSolrServer may directly access the index data on disk, as it may bypass the Solr web app totally. I am embedding Tomcat in my RMI server. The RMI server is going to use EmbeddedSolrServer and it also hosts the Solr webapp inside its Tomcat instance. So I guess I should be able to manage a singleton cache that is given to both,
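The commit-visibility rule discussed above (new documents are invisible to searchers until a commit opens a new one) can be modelled with a toy class. This is purely an illustration of the semantics, not of Solr internals:

```python
class ToyIndex:
    """Toy model of Solr commit semantics: a searcher sees a
    point-in-time snapshot; documents become visible only after commit()."""

    def __init__(self):
        self._pending = []   # added but not yet committed
        self._visible = []   # what the current "searcher" can see

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        # Models opening a new searcher over all committed documents
        self._visible.extend(self._pending)
        self._pending = []

    def search(self, term):
        return [d for d in self._visible if term in d]

ix = ToyIndex()
ix.add("solr rocks")
print(ix.search("solr"))  # []  -- not committed yet
ix.commit()
print(ix.search("solr"))  # ['solr rocks']
```

This is why indexing through one SolrCore and searching through another requires a commit (and a notification) before the second core's searcher can see the change.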
Operators and Minimum Match with Dismax handler
Hi All, I have a question regarding the dismax handler and minimum match (mm=). I have an index where we are setting the default operator to AND. Am I right in saying that, using the dismax handler, the default operator in the schema file is effectively ignored? (This is the conclusion I've reached from my own testing.) So I have set the mm value to 100%. The issue I have with this is that if I want to include an OR in my phrase, it is effectively ignored. The parser still tries to match 100% of the search terms, e.g. 'lucene OR query' still only finds matches for 'lucene AND query'; the parsed query is: +(((drug_name:lucen) (drug_name:queri))~2) () I know I could programmatically set mm=0 if my phrase contains certain keywords, however this would get very complicated with more terms in the phrase (I'd have to preserve/inject operators to keep my default), and I assume I would effectively be duplicating what the dismax handler does for the most part already. Does anyone have any advice as to how I could deal with this kind of problem? Thanks Waseem
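The ~2 in the parsed query above is the minimum-should-match clause count. A simplified sketch of how an mm percentage turns into a required clause count and decides whether a document matches (the real dismax mm spec supports richer conditional expressions; this is a deliberately reduced toy):

```python
def clauses_required(mm_percent, num_clauses):
    # Simplified dismax-style minimum-match: the fraction of optional
    # clauses that must match, truncated to a whole number of clauses.
    return int(num_clauses * mm_percent / 100)

def matches(doc_terms, query_terms, mm_percent):
    matched = sum(1 for t in query_terms if t in doc_terms)
    return matched >= clauses_required(mm_percent, len(query_terms))

doc = {"lucene", "index"}
print(matches(doc, ["lucene", "query"], 100))  # False: only 1 of 2 clauses match
print(matches(doc, ["lucene", "query"], 50))   # True: 1 of 2 is enough
```

This also shows why explicit OR operators inside q have no effect here: with mm=100% every optional clause is still required, regardless of the operators typed by the user.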
Re: Solr 1.3; Data Import w/ Dynamic Fields
I was successful at distributing the Solr 1.4-dev data import functionality within the Solr 1.3 war. 1. Copied the data import's src directory from 1.4 to 1.3. 2. Made sure to use the data import's build.xml already existing in Solr 1.3. 3. Commented out all code within the SolrWriter#rollback method. 4. Commented out the following import statement from SolrWriter: import org.apache.solr.update.RollbackUpdateCommand; 5. Copied the required libraries for logging from 1.4/lib to 1.3/lib: slf4j-api-1.5.5.jar slf4j-jdk14-1.5.5.jar I was planning on replacing the Solr 1.4 logging scheme with the style in Solr 1.3, but that was unnecessary work. Continuing my testing with this customized distribution. Thanks again, Wesley. On 3/11/09 6:35 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Mar 11, 2009 at 4:01 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: I guess you can take the trunk and comment out the contents of SolrWriter#rollback() and it should work with Solr 1.3. I agree. Rollback is the only feature which depends on enhancements in the Solr/Lucene libraries. So if you remove this feature, everything else should work fine with 1.3. -- Regards, Shalin Shekhar Mangar.
Is wiki page still accurate
Folks, Is the section titled Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today. It seems like the example-DIH stuff is a simpler/more direct example? Eric - Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com Free/Busy: http://tinyurl.com/eric-cal
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thanks again. This is the default request handler: requestHandler name=standard class=solr.SearchHandler default=true !-- default values for query parameters -- lst name=defaults str name=echoParamsexplicit/str /lst /requestHandler Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Thank you, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com What is the full query you're issuing to Solr and the corresponding request handler configuration? Chances are you're using the dismax query parser, which does not support wildcards. Other things to check: be sure you've tied the field to your new textIntact type, and that you're searching that field (see defaultField in schema.xml). Try something like /solr/select?q=field_name:blah* Erik On Mar 12, 2009, at 9:09 AM, Bruno Aranda wrote: Thanks for your answer, I am trying now with this custom text field: fieldType name=textIntact class=solr.TextField positionIncrementGap=100 analyzer tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=0 catenateWords=0 catenateNumbers=0 catenateAll=0 expand=0 splitOnCaseChange=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType And still it does not find blah when using the wildcard and searching for blah*. Am I missing something? Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com Remove the EnglishPorterFilterFactory from your text analyzer configuration (both index and query sides), and reindex all documents.
Erik On Mar 12, 2009, at 8:28 AM, Bruno Aranda wrote: Hi, I am trying to disable stemming from the analyzer, but I am not sure how to do it. For instance, I have a field that contains blah, but when I search for blah* it cannot find it, whereas if I search for bla* it does. I was using the text type field, from the example schema.xml. How should I modify it so that stemming is not done and I can find blah when I search for blah*? fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ !-- in this example, we will only use synonyms at query time filter class=solr.SynonymFilterFactory synonyms=index_synonyms.txt ignoreCase=true expand=false/ -- !-- Case insensitive stop word removal. add enablePositionIncrements=true in both the index and query analyzers to leave a 'gap' for more accurate phrase queries. -- filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.EnglishPorterFilterFactory protected=protwords.txt/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true / filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.EnglishPorterFilterFactory protected=protwords.txt/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType I have tried using the textTight type to no avail. 
Most of the fields in my documents have this structure: DOC1 field gene name:brca2 DOC2 field gene name:brca23 If I searched for brca2* I would like to find both documents. My field values normally contain colons ':' that should be used as stop words. Thank you in advance, Bruno
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
On Mar 12, 2009, at 10:47 AM, Bruno Aranda wrote: Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Ah... the problem is that wildcarded query terms do not get analyzed, nor do they get lowercased (there is an open issue for Solr to at least make lowercasing configurable; Lucene supports it). Try lowercasing in your query client; that should do the trick. Erik
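Since wildcard terms bypass analysis, the lowercasing has to happen in the client, as Erik suggests. A minimal sketch (a hypothetical helper, assuming simple whitespace-separated field:term tokens):

```python
def lowercase_wildcards(query):
    """Lowercase only the terms that carry a wildcard, because Solr will
    not run them through the analyzer, so the index-side LowerCaseFilter
    never gets a chance to normalize them."""
    out = []
    for token in query.split():
        if "*" in token or "?" in token:
            field, sep, term = token.rpartition(":")
            out.append(field + sep + term.lower())  # keep the field name intact
        else:
            out.append(token)
    return " ".join(out)
```

So mitab:Nefh* would be sent as mitab:nefh*, which matches the lowercased terms in the index.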
Re: How to remove stemming from the analyzer - Finding blah when searching for blah*
Thank you! Next time I will remember not to change the words to make the example simpler... blah is not the same as Nefh :-) Thanks, Bruno 2009/3/12 Erik Hatcher e...@ehatchersolutions.com On Mar 12, 2009, at 10:47 AM, Bruno Aranda wrote: Doing this query: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh Finds 1 result. The term Nefh is found in the field mitab. Doing: http://localhost:18080/solr/core_pub/select/?q=mitab:Nefh* Finds nothing. I have realised that Ne* or Nef* do not return results either, using the textIntact type... Ah... the problem is that wildcarded query terms do not get analyzed, nor do they get lowercased (there is an open issue for Solr to at least make lowercasing configurable; Lucene supports it). Try lowercasing in your query client; that should do the trick. Erik
Programmatic access to other handlers
Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22477731.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Tomcat holding deleted snapshots until it's restarted
I have noticed that the first time I execute a full import (having an old index in the index folder), once it is done, the old IndexSearcher will be closed: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 The problem is that if I do another full-import... the old searcher will not be closed; there will just appear the line: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main If I keep doing full-imports the old searchers will never be closed. It seems that they are only closed after the first full import... Does it mean anything to anyone? Marc Sturlese wrote: The old IndexSearcher is being closed correctly: 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.core.SolrCore - [core_01] Registered new searcher searc...@c6692 main 2009-03-12 13:05:06,200 [pool-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Closing searc...@1c5cd7 main hossman wrote: : If the problem is not there the other thing that comes to my mind is : lucene2.9-dev... maybe there's a problem closing indexWriter?... obviously : it's just a thought. you never answered Yonik's question about whether you see any Closing Searcher messages in your log, also it's useful to know what you see in the CORE section when you look at stats.jsp ... typically the main searcher is listed there twice, but during warming you'll see the old searcher as well ... if older searchers aren't getting closed for some reason, they should be listed there. i'd start by confirming/ruling out the old searchers before speculating about the indexwriter or other problems. : On a quiet system, you should see the original searcher closed right : after the new searcher is registered.
: : Example: : Mar 11, 2009 2:22:25 PM org.apache.solr.core.SolrCore registerSearcher : INFO: [] Registered new searcher searc...@1f1cbf6 main : Mar 11, 2009 2:22:25 PM org.apache.solr.search.SolrIndexSearcher close : INFO: Closing searc...@acdd02 main -Hoss -- View this message in context: http://www.nabble.com/Tomcat-holding-deleted-snapshots-until-it%27s-restarted-tp22451252p22478204.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr 1.4: filter documents using fields
Hi all! I'm using StandardRequestHandler and I want to filter results by two fields in order to avoid duplicate results (in this case the documents are very similar, with differences only in fields that are not returned in the query response). For example, considering the response: doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3142/long str name=topologynameLocais/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc I want to filter by instancekey and topologyid in order to get the following response: doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3142/long str name=topologynameLocais/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3141/long str name=topologynameinventario/str /doc doc long name=instancekey285/long str name=instancename186_Testing/str long name=topologyid3140/long str name=topologynameCPE/str /doc I managed to do the filtering in the client, but then the paging doesn't work as it should (some pages may contain more duplicated results than others). Is there a way (a query or another RequestHandler) to do this? Thanks, Rui Pereira
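For reference, the client-side dedupe itself is simple — the paging problem the poster describes is the real obstacle, since the server doesn't know how many duplicates each page hides. A sketch of the collapse (plain Python over the response docs, not a Solr feature):

```python
def dedupe(docs, keys=("instancekey", "topologyid")):
    """Keep the first doc seen for each (instancekey, topologyid)
    pair, preserving the response order."""
    seen, unique = set(), []
    for doc in docs:
        k = tuple(doc[key] for key in keys)
        if k not in seen:
            seen.add(k)
            unique.append(doc)
    return unique
```

Doing this server-side so paging stays correct needs collapsing support in Solr itself; as far as I know that was only available as the uncommitted field-collapsing patch in this era, not in the stock StandardRequestHandler.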
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 8:05 PM, Eric Pugh ep...@opensourceconnections.comwrote: Folks, Is this section title Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today? Seems like the example-DIH stuff is simpler/more direct example??? Yikes! I'll fix it. -- Regards, Shalin Shekhar Mangar.
RE: Replication in 1.3
Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? If it's possible, I'm just curious if anyone else on the list has experience with it. Thanks, Laurent -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Wednesday, March 11, 2009 5:03 PM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Wed, Mar 11, 2009 at 1:29 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: I'm hoping to use Solr version 1.4 but in the meantime I'm trying to get replication to work in version 1.3. I'm running Tomcat as a Windows service and have Cygwin installed. The rsync method of replication is not supported under Windows (due to differing OS/filesystem semantics). The Java-based synchronization in Solr 1.4 does support Windows though. -Yonik http://www.lucidimagination.com
Re: Is wiki page still accurate
On Thu, Mar 12, 2009 at 10:04 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Thu, Mar 12, 2009 at 8:05 PM, Eric Pugh ep...@opensourceconnections.com wrote: Folks, Is this section title Full Import Example on http://wiki.apache.org/solr/DataImportHandler still accurate? The steps referring to the example-solr-home.jar and the SOLR-469 patch seem out of date with where 1.4 is today? Seems like the example-DIH stuff is simpler/more direct example??? Yikes! I'll fix it. I've updated the instructions. Thanks for reporting this, Eric. -- Regards, Shalin Shekhar Mangar.
Re: Replication in 1.3
On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
Re: Tomcat holding deleted snapshots until it's restarted
: I have noticed that the first time I execute full import (having an old index : in the index folder) once it is done, the old indexsearcher will be closed: ... : The problem is that if I do another full-import... the old searcher will not : be closed, there will just appear the line: ... : If I keep doing full-imports the old searchers will never be closed. seems : that they are just closed in the first full import... : Does it mean something to anyone? Hmmm... sounds like maybe DIH is triggering something weird. Just to clarify: a) what does the stats page show (in terms of the number of Searchers listed in the CORE section) after a couple of full imports? b) can you reproduce this doing full builds even with replication disabled? c) can you reproduce this using the example DIH configs? -Hoss
Re: Tomcat holding deleted snapshots until it's restarted
: Just to clarify: : a) what does the stats page show (in terms of the number of : Searchers listed in the CORE section) after a couple of full imports? After 4 full-imports it will show 3 IndexSearchers. I have also printed the var _searchers from SolrCore.java and it shows me 3 IndexSearchers. : b) can you reproduce this doing full builds even with replication : disabled? I have replication disabled. I use Solr collection distribution, but for all these tests I am not even using that. I just use one machine and index there. : c) can you reproduce this using the example DIH configs? My configs look really similar to the defaults. I get data from a MySQL database in data-config.xml. Solrconfig.xml has the caches and warming settings the same as the defaults. I have disabled the SolrDeletionPolicy stuff (and replication as well). I have checked the official 1.3 release and I have seen that DirectUpdateHandler2.java is quite different from the one in the nightlies. In the commit method... 1.3 calls a closeSearcher function: public void commit(CommitUpdateCommand cmd) throws IOException { if (cmd.optimize) { optimizeCommands.incrementAndGet(); } else { commitCommands.incrementAndGet(); } Future[] waitSearcher = null; if (cmd.waitSearcher) { waitSearcher = new Future[1]; } boolean error=true; iwCommit.lock(); try { log.info(start +cmd); if (cmd.optimize) { closeSearcher(); openWriter(); writer.optimize(cmd.maxOptimizeSegments); } closeSearcher(); closeWriter(); This closeSearcher function doesn't exist in the nightly (I suppose the whole process works in a different way now). It seems that once DataImportHandler does the first import, it touches something that prevents IndexSearchers from ever being freed again. hossman wrote: : I have noticed that the first time I execute full import (having an old index : in the index folder) once it is done, the old indexsearcher will be closed: ... : The problem is that if I do another full-import...
the old searcher will not : be closed, there will just appear the line: ... : If I keep doing full-imports the old searchers will never be closed. seems : that they are just closed in the first full import... : Does it mean something to anyone? Hmmm... sounds like maybe DIH is triggering something weird. Just to clarify: a) what does the stats page show (in terms of the number of Searchers listed in the CORE section) after a couple of full imports? b) can you reproduce this doing full builds even with replication disabled? c) can you reproduce this using the example DIH configs? -Hoss -- View this message in context: http://www.nabble.com/Tomcat-holding-deleted-snapshots-until-it%27s-restarted-tp22451252p22481571.html Sent from the Solr - User mailing list archive at Nabble.com.
stemming (maybe?) question
is it possible to make solr think that omeara and o'meara are the same thing? -jsd-
RE: Replication in 1.3
Thanks for the reply. Hopefully 1.4 will come soon enough so that we can still use Windows. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, March 12, 2009 9:55 AM To: solr-user@lucene.apache.org Subject: Re: Replication in 1.3 On Thu, Mar 12, 2009 at 12:34 PM, Vauthrin, Laurent laurent.vauth...@disney.com wrote: Just so I'm clear on it, do you mean Windows replication via Cygwin is not supported or not possible? Not really possible - the strategy the scripts use won't work on Windows because of the different filesystem semantics. Things like the fact that you can make a hard link, but you can't move or delete any of the links to an open file like you can with UNIX. -Yonik http://www.lucidimagination.com
fl wildcards
If I wanted to hack Solr so that it has the ability to process wildcards for the field list parameter (fl), where would I look? (Perhaps I should look on the solr-dev mailing list, but since I am already on this one I thought I would start here). Thanks! -- -a Ideally, a code library must be immediately usable by naive developers, easily customized by more sophisticated developers, and readily extensible by experts. -- L. Stein
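Short of patching Solr, the same effect can be approximated on the client: fetch the list of stored fields once (e.g. from the Luke request handler) and expand the glob before sending the query. A sketch with hypothetical field names:

```python
import fnmatch

def expand_fl(fl, known_fields):
    """Expand glob patterns in an fl parameter value against a list of
    known stored fields; entries that match nothing pass through unchanged."""
    expanded = []
    for pat in (p.strip() for p in fl.split(",")):
        matches = fnmatch.filter(known_fields, pat)
        expanded.extend(matches if matches else [pat])
    return ",".join(expanded)
```

Server-side support would still be nicer, since the client then doesn't need to know the schema up front.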
Re: Tomcat holding deleted snapshots until it's restarted
On Thu, Mar 12, 2009 at 1:34 PM, Marc Sturlese marc.sturl...@gmail.com wrote: : Just to clarify: : a) what does the stats page show (in terms of the number of : Searchers listed in the CORE section) after a couple of full imports? After 4 full-imports it will show 3 indexsearchers. I have also printed the var _searchers from SolrCore.java and it shows me 3 indexsearchers. Definitely seems like a bug somewhere... Could you try a recent nightly build to see if it's fixed or not? -Yonik http://www.lucidimagination.com
Adding authentication Token to the CommonsHttpSolrServer
Hi, We have installed Solr in a Tomcat server and enabled a security constraint at the Tomcat level. We need to pass the authentication token (cookie) with the search call that is made using CommonsHttpSolrServer. I would like to know how I can add the token to the CommonsHttpSolrServer. Appreciate any ideas on this. Thanks. Karthik
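I haven't tried this against CommonsHttpSolrServer specifically, but the general shape would be to build the Cookie header value yourself and attach it to every request via the underlying HTTP client (if I recall correctly, CommonsHttpSolrServer exposes its HttpClient — treat that as an assumption to verify). A language-neutral sketch of just the header construction:

```python
def cookie_header(cookies):
    """Join name->value pairs into a single Cookie request header,
    e.g. for a Tomcat session token (JSESSIONID here is illustrative)."""
    return {"Cookie": "; ".join(f"{name}={value}" for name, value in cookies.items())}
```

The resulting header then needs to be set on each outgoing search request in whatever HTTP layer the client uses.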
Re: stemming (maybe?) question
On Thu, Mar 12, 2009 at 1:36 PM, Jon Drukman jdruk...@gmail.com wrote: is it possible to make solr think that omeara and o'meara are the same thing? WordDelimiter would handle it if the document had o'meara (but you may or may not want the other stuff that comes with WordDelimiterFilter). You could also use a PatternReplaceFilter to normalize tokens like this. -Yonik http://www.lucidimagination.com
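The normalization such a PatternReplaceFilter would perform (pattern ' replaced by the empty string, applied at both index and query time so both sides agree) looks like this — the sketch below is plain Python illustrating the token transform, not the filter itself:

```python
import re

APOSTROPHES = re.compile(r"['\u2019]")  # ASCII and curly apostrophes

def normalize(token):
    """Strip apostrophes (after lowercasing) so o'meara and omeara
    reduce to the same indexed term."""
    return APOSTROPHES.sub("", token.lower())
```

Applying the same transform on both index and query sides is what makes the two spellings match.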
DIH outer joins
I have queries with outer joins defined in some entities, and for the same root object I can have two or more lines with different objects. For example, taking the following 3 tables, and a query defined in the entity with outer joins between the tables: Table1 - Table2 - Table3 I can have the following lines returned by the query: Table1Instance1 - Table2Instance1 - Table3Instance1 Table1Instance1 - Table2Instance1 - Table3Instance2 Table1Instance1 - Table2Instance2 - Table3Instance3 Table1Instance2 - Table2Instance3 - Table3Instance4 I wanted to have a single document per root object instance (in this case per Table1 instance) but with the values from the different lines returned. Is it possible to have this behavior in DataImportHandler? How? Thanks in advance, Rui Pereira
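DIH's usual answer is to model Table2/Table3 as nested sub-entities of the root entity and mark the target schema fields multiValued, so each root row yields one document; whether that fits your outer-join SQL depends on your schema. The collapse being asked for, sketched in plain Python with hypothetical column names:

```python
def rows_to_docs(rows, root_key="table1_id"):
    """Collapse flattened join rows into one document per root instance,
    collecting the joined columns as multi-valued fields."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(row[root_key], {root_key: row[root_key]})
        for col, val in row.items():
            if col == root_key or val is None:  # outer joins can yield NULLs
                continue
            values = doc.setdefault(col, [])
            if val not in values:
                values.append(val)
    return list(docs.values())
```

Each resulting doc carries the distinct Table2/Table3 values for its root instance, which is what a multiValued field would hold.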
Re: Programmatic access to other handlers
I found this code to access another core from my custom request handler: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); SolrCore otherCore = cores.getCore(otherCore); It seems to work in some small tests, but is it a recommended approach? Pascal Dimassimo wrote: Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22483357.html Sent from the Solr - User mailing list archive at Nabble.com.
DIH use of the ?command=full-import entity= command option
Hello, Can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works. Am I supposed to build a data-conf.xml file which contains many different alternate entities.. or Regards -- === Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer ===
Re: Programmatic access to other handlers
Thanks ryantxu for your answer. I implemented the interface and it returns the current core. But how is it different from doing request.getCore() from handleRequestBody()? And I don't see how this can give me access to other cores. I think that what I need is to get access to an instance of CoreContainer, so I can call getCore(name) and getAdminCore to manage the different cores. So I'm wondering if this is a good way to get that instance: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); ryantxu wrote: If you are doing this in a RequestHandler, implement SolrCoreAware and you will get a callback with the Core http://wiki.apache.org/solr/SolrPlugins#head-8b3ac1fc3584fe1e822924b98af23d72b02ab134 On Mar 12, 2009, at 3:04 PM, Pascal Dimassimo wrote: I found this code to access another core from my custom request handler: CoreContainer.Initializer initializer = new CoreContainer.Initializer(); CoreContainer cores = initializer.initialize(); SolrCore otherCore = cores.getCore(otherCore); It seems to work in some small tests, but is it a recommended approach? Pascal Dimassimo wrote: Hi, I've designed a front handler that will send requests to other handlers and return an aggregated response. Inside this handler, I call other handlers like this (inside the method handleRequestBody): SolrCore core = req.getCore(); SolrRequestHandler mlt = core.getRequestHandler(/mlt); ModifiableSolrParams params = new ModifiableSolrParams(req.getParams()); params.set(mlt.fl, nFullText); req.setParams(params); mlt.handleRequest(req, rsp); First question: is this the recommended way to call another handler? Second question: how could I call a handler of another core? -- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22486235.html Sent from the Solr - User mailing list archive at Nabble.com.
-- View this message in context: http://www.nabble.com/Programmatic-access-to-other-handlers-tp22477731p22486235.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Programmatic access to other handlers
: I implemented the interface and it returns the current core. But how is it : different from doing request.getCore() from handleRequestBody()? And I don't i think ryan misunderstood your goal .. that's just a way for you to get access to your core prior to handling requests. : see how this can give me access to other cores. I think that what I need is : to get access to an instance of CoreContainer, so I can call getCore(name) : and getAdminCore to manage the different cores. So I'm wondering if this is : a good way to get that instance: I'm not positive, but i think the code you listed will actually reconstruct new copies of all of the cores. the simplest way to get access to the CoreContainer is via the CoreDescriptor... yourCore.getCoreDescriptor().getCoreContainer().getCore(otherCoreName); (note i've never actually done this, it's just what i remember off the top of my head from the past multicore design discussions ... the class/method names may be slightly wrong) -Hoss
Issues with stale searchers.
I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue where searchers are opened up right away when tomcat starts, and never goes away. This is causing read locks on the Lucene index holding open deleted files during merges. This causes our server to run out of disk space in our index. Wondering what is causing this issue as I have been searching for two days without any real answers. Thanks, LSOF Output java 7322 tomcat 70r REG 253,0 2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted) java 7322 tomcat 71r REG 253,0 2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted) java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted) java 7322 tomcat 73r REG 253,0 2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted) java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 75r REG 253,0 6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted) java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted) java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted) Stats page on Solr Admin searc...@66952905 main class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 name: core class: version:1.0 description:SolrCore stats: coreName : startTime : Thu Mar 12 17:13:15 CDT 2009 refCount : 2 aliases : [] name: searcher class: org.apache.solr.search.SolrIndexSearcher version:1.0 description:index searcher stats: searcherName : searc...@66952905 main caching : true numDocs : 187169908 maxDoc : 187169908 readerImpl : ReadOnlyMultiSegmentReader readerDir : 
org.apache.lucene.store.FSDirectory@/opt/solr/data/index indexVersion : 1224609883675 openedAt : Thu Mar 12 17:13:15 CDT 2009 registeredAt : Thu Mar 12 17:13:23 CDT 2009 warmupTime : 0 Jeremy Carroll Sr. Network Engineer Networked Insights
Re: DIH use of the ?command=full-import entity= command option
Wouldn't an entity be something such as a stream, or a DB, a manifest-channel? The name source would be better to me but... there are the SQL data-sources. paul Le 12-mars-09 à 22:47, Fergus McMenemie a écrit : Can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works. Am I supposed to build a data-conf.xml file which contains many different alternate entities.. or
Re: OR/NOT query syntax
I might be wrong on this, but since you can't do a query that's just a NOT statement, this wouldn't work either. I believe the NOT must negate results of a query, not the entire dataset. On Wed, Mar 11, 2009 at 6:56 PM, Andrew Wall rew...@gmail.com wrote: I'm attempting to write a solr query that ensures that if one field has a particular value that another field also have a particular value. I've arrived at this syntax, but it doesn't seem to work correctly. ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) either operand functions correctly on its own - but not when joined together with the or not condition. I don't understand why this syntax doesn't work - can someone shed some light on this? Thanks! Andrew Wall -- Jonathan Haddad http://www.rustyrazorblade.com
Re: OR/NOT query syntax
On Wed, Mar 11, 2009 at 9:56 PM, Andrew Wall rew...@gmail.com wrote: I'm attempting to write a solr query that ensures that if one field has a particular value that another field also have a particular value. I've arrived at this syntax, but it doesn't seem to work correctly. ((myField:superneat AND myOtherField:somethingElse) OR NOT myField:superneat) Try (myField:superneat AND myOtherField:somethingElse) OR (*:* -myField:superneat) -Yonik http://www.lucidimagination.com
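The *:* prefix works because Lucene cannot execute a purely negative clause on its own; it needs a positive document set to subtract from. Logically the rewritten query is just the implication "if myField is superneat then myOtherField must be somethingElse", which a quick check over sample docs confirms (plain Python, using the field values from the thread):

```python
def matches(doc):
    """(A AND B) OR (NOT A) is equivalent to 'A implies B'."""
    a = doc.get("myField") == "superneat"
    b = doc.get("myOtherField") == "somethingElse"
    return (a and b) or (not a)
```

Docs without myField=superneat always match; docs with it match only when myOtherField also has the required value.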
RE: Issues with stale searchers.
If that's the case, it is causing out-of-disk issues with Solr. We have a 187M-document index which is about ~200GB in size. Over a period of about a week after optimizations, etc., the count of open-but-deleted files grows very large, and the system can no longer optimize due to lack of disk space. Also, new documents that are indexed are not showing up in search results. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, March 12, 2009 7:43 PM To: solr-user@lucene.apache.org Subject: Re: Issues with stale searchers. On Thu, Mar 12, 2009 at 6:29 PM, Jeremy Carroll jeremy.carr...@networkedinsights.com wrote: I have Solr 1.3 running on Apache Tomcat 5.5.27. I'm running into an issue where searchers are opened up right away when tomcat starts, and never goes away. This is causing read locks on the Lucene index holding open deleted files during merges. Deleted files being held open can be normal - that's the current IndexSearcher serving requests (even though those files may have been deleted by the IndexWriter already). Looking at your Stats, I only see one Searcher, so things look fine there too. -Yonik http://www.lucidimagination.com This causes our server to run out of disk space in our index. Wondering what is causing this issue as I have been searching for two days without any real answers.
Thanks,

LSOF output:
java 7322 tomcat 70r REG 253,0  2569538 2883610 /opt/solr/data/index/_m5n.cfs (deleted)
java 7322 tomcat 71r REG 253,0  2338291 2883609 /opt/solr/data/index/_m5m.cfs (deleted)
java 7322 tomcat 72r REG 253,0 13398930 2883608 /opt/solr/data/index/_m5l.cfs (deleted)
java 7322 tomcat 73r REG 253,0  2692917 2883598 /opt/solr/data/index/_m5k.cfs (deleted)
java 7322 tomcat 74r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
java 7322 tomcat 75r REG 253,0  6767344 2883603 /opt/solr/data/index/_m5j.cfs (deleted)
java 7322 tomcat 76r REG 253,0 32324600 2883592 /opt/solr/data/index/_m5j.cfx (deleted)
java 7322 tomcat 77r REG 253,0 15937346 2883600 /opt/solr/data/index/_m5i.cfs (deleted)

Stats page on Solr Admin:

searc...@66952905 main
  class: org.apache.solr.search.SolrIndexSearcher
  version: 1.0
  description: index searcher
  stats:
    searcherName : searc...@66952905 main
    caching : true
    numDocs : 187169908
    maxDoc : 187169908
    readerImpl : ReadOnlyMultiSegmentReader
    readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
    indexVersion : 1224609883675
    openedAt : Thu Mar 12 17:13:15 CDT 2009
    registeredAt : Thu Mar 12 17:13:23 CDT 2009
    warmupTime : 0

name: core
  class:
  version: 1.0
  description: SolrCore
  stats:
    coreName :
    startTime : Thu Mar 12 17:13:15 CDT 2009
    refCount : 2
    aliases : []

name: searcher
  class: org.apache.solr.search.SolrIndexSearcher
  version: 1.0
  description: index searcher
  stats:
    searcherName : searc...@66952905 main
    caching : true
    numDocs : 187169908
    maxDoc : 187169908
    readerImpl : ReadOnlyMultiSegmentReader
    readerDir : org.apache.lucene.store.FSDirectory@/opt/solr/data/index
    indexVersion : 1224609883675
    openedAt : Thu Mar 12 17:13:15 CDT 2009
    registeredAt : Thu Mar 12 17:13:23 CDT 2009
    warmupTime : 0

Jeremy Carroll
Sr. Network Engineer
Networked Insights
SolrJ : EmbeddedSolrServer and database data indexing
Is it possible to index DB data directly into Solr using EmbeddedSolrServer? I tried using a data-config file and the full-import command, and it works, so I assume using CommonsHttpSolrServer will also work. But can I do it with EmbeddedSolrServer?? Thanks in advance... Ashish -- View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22488697.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issues with stale searchers.
On Thu, Mar 12, 2009 at 9:38 PM, Jeremy Carroll jeremy.carr...@networkedinsights.com wrote:
If that's the case it is causing out of disk issues with Solr. We have a 187m document count index which is about ~200Gb in size. Over a period of about a week after optimizations, etc... the open file but deleted count grows very large. Causing the system to not be able to optimize due to lack of disk space. Also new documents that are indexed are not showing up in search results.

Multiply the index size by 3 to get the max disk space:
- 1 for the index currently open for searching
- up to 1 for new segments written by the index writer (including merges)
- up to 1 when the index writer does major merges or optimizes (the index writer can't delete the old segment files until it's sure that the new index has been written successfully)

That said, what you are seeing could be normal, or could be a bug.

-Yonik
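The worst-case budget Yonik describes is simple arithmetic but easy to under-provision: during an optimize, up to three full copies of the index can coexist on disk. A tiny sketch with the numbers from this thread (the class is illustrative, not Solr code):

```java
// Sketch of the worst-case disk budget during a Solr/Lucene optimize:
// old searchable segments + newly written segments + the merge output
// can all exist at once before the old files are deleted.
public class DiskBudget {

    // Peak disk usage is up to 3x the steady-state index size
    static long worstCase(long indexSizeGb) {
        return 3 * indexSizeGb;
    }

    public static void main(String[] args) {
        long indexGb = 200; // the ~200GB index discussed in this thread
        System.out.println("Plan for up to " + worstCase(indexGb) + "GB free");
        // prints: Plan for up to 600GB free
    }
}
```

So for the ~200GB index here, the volume should have roughly 600GB available before an optimize is safe.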
Re: SolrJ : EmbeddedSolrServer and database data indexing
Is there any API in SolrJ that calls the DataImportHandler to execute commands like full-import and delta-import? Please help...

Ashish P wrote:
Is it possible to index DB data directly to solr using EmbeddedSolrServer. I tried using data-Config File and Full-import commad, it works. So assuming using CommonsHttpServer will also work. But can I do it with EmbeddedSolrServer?? Thanks in advance... Ashish

-- View this message in context: http://www.nabble.com/SolrJ-%3A-EmbeddedSolrServer-and-database-data-indexing-tp22488697p22489420.html Sent from the Solr - User mailing list archive at Nabble.com.
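As far as I know there is no dedicated DataImportHandler API in SolrJ; DIH is an ordinary request handler, so you address its registered path (assumed here to be /dataimport in solrconfig.xml) with a command parameter - with EmbeddedSolrServer you would pass the same parameters via SolrJ's request API. This self-contained sketch only shows the request being assembled; the commit parameter and the helper names are my assumptions, not confirmed by the thread:

```java
// Hedged sketch: build the parameter string a DIH full-import or
// delta-import request carries. With SolrJ you would set these same
// key/value pairs on the request rather than concatenating a URL.
import java.util.LinkedHashMap;
import java.util.Map;

public class DihCommand {

    // Assemble /dataimport?command=...&commit=true
    static String requestString(String command) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", command); // "full-import" or "delta-import"
        params.put("commit", "true");   // commit once the import finishes (assumed default behavior)
        StringBuilder sb = new StringBuilder("/dataimport?");
        params.forEach((k, v) -> sb.append(k).append('=').append(v).append('&'));
        sb.setLength(sb.length() - 1);  // drop the trailing '&'
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(requestString("full-import"));
        // prints: /dataimport?command=full-import&commit=true
    }
}
```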
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie fer...@twig.me.uk wrote:
Hello, can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works? Am I supposed to build a data-conf.xml file which contains many different alternate entities, or...

With the entity parameter you can specify the name of any root entity and import only that one. You can specify multiple entity parameters too. For example:
/dataimport?command=full-import&entity=x&entity=y

You may need to specify preImportDeleteQuery separately on each entity to make sure all documents are not deleted.

-- Regards, Shalin Shekhar Mangar.
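Since the entity parameter repeats once per root entity, a client has to append it multiple times rather than comma-joining the names. A small self-contained sketch of building that request (class and method names are illustrative only):

```java
// Sketch: build a DIH full-import request targeting specific root
// entities. Each entity gets its own repeated "entity=" parameter,
// matching the /dataimport?command=full-import&entity=x&entity=y form.
import java.util.List;

public class EntityImportUrl {

    static String url(String solrBase, List<String> entities) {
        StringBuilder sb = new StringBuilder(solrBase)
                .append("/dataimport?command=full-import");
        for (String e : entities) {
            sb.append("&entity=").append(e); // one parameter per root entity
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(url("http://localhost:8983/solr", List.of("x", "y")));
        // prints: http://localhost:8983/solr/dataimport?command=full-import&entity=x&entity=y
    }
}
```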
Re: DIH use of the ?command=full-import entity= command option
If my data-config.xml contains multiple root-level entities, what is the expected action if I call full-import without an entity=XXX sub-command? Does it process all entities one after the other, or only the first? (It would be useful IMHO if it only did the first.)

On Fri, Mar 13, 2009 at 3:17 AM, Fergus McMenemie fer...@twig.me.uk wrote:
Hello, can anybody describe the intended purpose, or provide a few examples, of how the DIH entity= command option works? Am I supposed to build a data-conf.xml file which contains many different alternate entities, or...

With the entity parameter you can specify the name of any root entity and import only that one. You can specify multiple entity parameters too. For example:
/dataimport?command=full-import&entity=x&entity=y

You may need to specify preImportDeleteQuery separately on each entity to make sure all documents are not deleted.

-- Regards, Shalin Shekhar Mangar.

--
===============================
Fergus McMenemie     Email: fer...@twig.me.uk
Techmore Ltd         Phone: (UK) 07721 376021
Unix/Mac/Intranets   Analyst Programmer
===============================
Re: CJKAnalyzer and Chinese Text sort
Thanks Hoss for your comments! I don't mind submitting it as a patch; shall I create an issue in Jira and submit the patch with that? Also, I didn't modify core Solr for locale-based sorting; I just created a jar file with the class file and copied it over to the lib folder. As part of the patch, shall I add it to the core Solr code-base (so users who want this don't need to do anything extra) or add it as a contrib module (they would need to compile it as a jar and copy it over to the lib folder)?

Thanks!

-- Original Message --
From: Chris Hostetter hossman_luc...@fucit.org
To: solr-user@lucene.apache.org
Subject: Re: CJKAnalyzer and Chinese Text sort
Date: Wed, 11 Mar 2009 15:50:40 -0700 (PDT)

First off: you can't sort on a field where any doc has more than one token - that's why sorting on a TextField doesn't work unless you use something like the KeywordTokenizer.

Second...
: I found out that reason the strings are not getting sorted is because
: there is no way to pass the locale information to StrField, I ended up
: extending StrField to take an additional attribute in schema.xml and
: then had to override the getSortString method where in I create a new
: Locale based on the schema attribute and pass it to the StrField. I put
: this newly created jar file in the lib folder and everything seems to be
: working fine after that. Since, my java knowledge is almost zilch, I was
: wondering is this the right way to solve this problem or is there any
: other recommended approach for this?

I don't remember what the state of Locale-based sorting is, but the modifications you describe sound right based on what I remember... would you be interested in submitting them back as a patch? http://wiki.apache.org/solr/HowToContribute

-Hoss
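The mechanism the Locale-aware StrField subclass would delegate to is the JDK's java.text.Collator, which replaces raw code-point comparison with locale collation rules. A minimal sketch of the difference (the exact ordering of Han characters depends on the collation data shipped with your JDK, so no particular order is claimed here):

```java
// Sketch: locale-aware ordering via java.text.Collator, the standard JDK
// facility a Locale-based sort field would build on. Binary String order
// and collator order can differ for non-ASCII text.
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class LocaleSortDemo {
    public static void main(String[] args) {
        String[] terms = {"张", "王", "李"};

        // Plain String.compareTo sorts by UTF-16 code point,
        // which is what an unmodified StrField sort gives you
        Arrays.sort(terms);
        System.out.println("code point order: " + Arrays.toString(terms));

        // Collator applies the locale's collation rules instead;
        // results vary with the JDK's collation tables for zh
        Collator zh = Collator.getInstance(Locale.CHINA);
        Arrays.sort(terms, zh);
        System.out.println("zh locale order:  " + Arrays.toString(terms));
    }
}
```

A subclass along the lines the poster describes would construct the Collator from the locale named in schema.xml and use it when producing the sort comparator.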
Re: DIH use of the ?command=full-import entity= command option
On Fri, Mar 13, 2009 at 10:44 AM, Fergus McMenemie fer...@twig.me.uk wrote:
If my data-config.xml contains multiple root level entities what is the expected action if I call full-import without an entity=XXX sub-command? Does it process all entities one after the other or only the first? (It would be useful IMHO if it only did the first.)

It processes all entities one after the other. If you want to import only one, use the entity parameter.

-- Regards, Shalin Shekhar Mangar.