Re: Solr 3.3 Sorting is not working for long fields
All, On Tue, Nov 15, 2011 at 1:21 PM, kashif.khan uplink2...@gmail.com wrote: Obviously there is some problem somewhere in the schema or other files. The default Solr demo (run via start.jar) works well with the long field. It is just that we do not know where the problem causing this error is. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-3-3-Sorting-is-not-working-for-long-fields-tp3499366p3508947.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 3.3 Sorting is not working for long fields
[...] stored="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<!-- some trie-coded dynamic fields for faster range queries -->
<dynamicField name="*_ti" type="tint" indexed="true" stored="true"/>
<dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/>
<dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/>
<dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/>
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>
<dynamicField name="*_pi" type="pint" indexed="true" stored="true"/>
<dynamicField name="ignored_*" type="ignored" multiValued="true"/>
<dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="random_*" type="random"/>
<!-- uncomment the following to ignore any fields that don't already match an existing field name or dynamic field, rather than reporting them as an error. alternately, change the type="ignored" to some other type, e.g. "text", if you want unknown fields indexed and/or stored by default -->
<!-- <dynamicField name="*" type="ignored" multiValued="true"/> -->
</fields>

<!-- Field to use to determine and enforce document uniqueness. Unless this field is marked with required="false", it will be a required field -->
<uniqueKey>ID</uniqueKey>

<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>text</defaultSearchField>

<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="OR"/>

<!-- copyField commands copy one field to another at the time a document is added to the index. It's used either to index the same field differently, or to add multiple fields to the same field for easier/faster searching. -->
<copyField source="cat" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
<copyField source="includes" dest="text"/>
<copyField source="manu" dest="manu_exact"/>

<!-- Above, multiple source fields are copied to the [text] field. Another way to map multiple source fields to the same destination field is to use the dynamic field syntax. copyField also supports a maxChars to copy setting. -->
<!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->

<!-- copy name to alphaNameSort, a field designed for sorting by name -->
<!-- <copyField source="name" dest="alphaNameSort"/> -->

<!-- Similarity is the scoring routine for each document vs. a query. A custom similarity may be specified here, but the default is fine for most applications. -->
<!-- <similarity class="org.apache.lucene.search.DefaultSimilarity"/> -->

<!-- ... OR ... Specify a SimilarityFactory class name implementation allowing parameters to be used. -->
<!-- <similarity class="com.example.solr.CustomSimilarityFactory"> <str name="paramkey">param value</str> </similarity> -->
</schema>

On Tue, Nov 15, 2011 at 2:53 PM, rajini maski rajinima...@gmail.com wrote: All, On Tue, Nov 15, 2011 at 1:21 PM, kashif.khan uplink2...@gmail.com wrote: Obviously there is some problem somewhere in the schema or other files. The default Solr demo (run via start.jar) works well with the long field. It is just that we do not know where the problem causing this error is.
Re: Solr 3.3 Sorting is not working for long fields
Thank you for the responses :) Found that the bug was in the naming convention of fields (for tlong/long): I had given a number as the name of the field. The studyid field name was "450"; I changed it to "S450" and it started working :) Thank you all. Regards, Rajani

On Tue, Nov 15, 2011 at 3:28 PM, Michael Kuhlmann k...@solarier.de wrote: Hi, Am 15.11.2011 10:25, schrieb rajini maski:
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
[...]
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
[...]
<field name="studyid" type="long" indexed="true" stored="true"/>
Hmh, why didn't you just change the field type to tlong as you mentioned before? Instead you changed the class of the long type. There's nothing against this; it's just a bit confusing, since long fields normally are of type solr.LongField, which is not sortable on its own. You specified a precisionStep of 0, which means the field would be slow in range queries, but it shouldn't harm sorting. All in all, it should work. So the only chance I see is to re-index once again (and commit after that). I don't really see an error in your config except the confusing long type. It should work after reindexing, and it can't work if it was indexed with a genuine long type. -Kuli
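For readers hitting the same issue, the resolution above can be sketched as a minimal schema fragment (the field and type names here are illustrative, not taken verbatim from the poster's schema):

```xml
<!-- A Solr field name should not be a bare number: "450" caused the sort
     failure above; "S450" works. Sketch of a sortable long field: -->
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>
<field name="S450" type="tlong" indexed="true" stored="true"/>
```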
Re: Solr 3.3 Sorting is not working for long fields
Field type is long and not multivalued. Using the Solr 3.3 war file; tried on a Solr 1.4.1 index and a Solr 3.3 index, and in both cases it is not working. Query: http://localhost:8091/Group/select?indent=on&q=studyid:120&sort=studyidasc,groupid asc,subjectid asc&start=0&rows=10 All the ID fields are long. Thanks Regards Rajani

On Sun, Nov 13, 2011 at 7:58 AM, Erick Erickson erickerick...@gmail.com wrote: Well, 3.3 has been around for quite a while; I'd suspect that something this fundamental would have been found... Is your field multi-valued? And what kind of field is studyid? You really have to provide more details, input, output, etc. to get reasonable help. It might help to review: http://wiki.apache.org/solr/UsingMailingLists Best Erick

On Fri, Nov 11, 2011 at 5:52 AM, rajini maski rajinima...@gmail.com wrote: Hi, I have upgraded my Solr from 1.4.1 to 3.3. Now I tried to sort on a long field and documents are not getting sorted on it. Sort is working when we do sorting on a facet, e.g. facet=on&facet.sort=studyid. But when we do a simple sort on documents, sort=studyid, the sort doesn't happen. Is there any bug? Regards, Rajani
Re: Solr 3.3 Sorting is not working for long fields
There is no error as such; when I do a basic sort on a *long* field, the sort doesn't happen. Query is:
http://blr-ws-195:8091/Solr3.3/select/?q=2%3A104+AND+526%3A27747&version=2.2&start=0&rows=10&indent=on&sort=469%20asc&fl=469

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3</int>
<lst name="params">
<str name="fl">studyid</str>
<str name="sort">studyid asc</str>
<str name="indent">on</str>
<str name="start">0</str>
<str name="q">*:*</str>
<str name="rows">100</str>
<str name="version">2.2</str>
</lst>
</lst>
<result name="response" numFound="216" start="0">
<doc><long name="studyid">53</long></doc>
<doc><long name="studyid">18</long></doc>
<doc><long name="studyid">14</long></doc>
<doc><long name="studyid">11</long></doc>
<doc><long name="studyid">7</long></doc>
<doc><long name="studyid">63</long></doc>
<doc><long name="studyid">35</long></doc>
<doc><long name="studyid">70</long></doc>
<doc><long name="studyid">91</long></doc>
<doc><long name="studyid">97</long></doc>
</result>
</response>

The same case works with Solr 1.4.1, but it is not working in Solr 3.3. Regards, Rajani

On Mon, Nov 14, 2011 at 2:23 PM, Michael Kuhlmann k...@solarier.de wrote: Am 14.11.2011 09:33, schrieb rajini maski: query: http://localhost:8091/Group/select?indent=on&q=studyid:120&sort=studyidasc,groupid asc,subjectid asc&start=0&rows=10 Is it a copy-and-paste error, or did you really sort on studyidasc? I don't think you have a field studyidasc, and Solr should've given an exception that either asc or desc is missing. -Kuli
Re: Solr 3.3 Sorting is not working for long fields
On Mon, Nov 14, 2011 at 7:23 PM, Ahmet Arslan iori...@yahoo.com wrote: When I do a basic sort on a *long* field, the sort doesn't happen. [...] The same case works with Solr 1.4.1 but it is not working in Solr 3.3. Can you try with the following type?
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
And studyid must be marked as indexed="true".

I tried this one.
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
It didn't work :( Sort didn't happen.
Re: Solr 3.3 Sorting is not working for long fields
Yes. On 11/14/11, Ahmet Arslan iori...@yahoo.com wrote: I tried this one. <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/> It didn't work :( Sort didn't happen. Did you restart Tomcat and perform a re-index?
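Since re-indexing after the type change is the recurring advice in this thread, here is a minimal sketch of what that looks like with Solr's XML update syntax (the delete-all query and the implied /update endpoint are illustrative; adapt to your own core and documents):

```xml
<!-- POST to the core's /update handler to drop documents indexed
     under the old field type -->
<delete><query>*:*</query></delete>
<!-- re-post your documents, then make the changes visible: -->
<commit/>
```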
Solr 3.3 Sorting is not working for long fields
Hi, I have upgraded my Solr from 1.4.1 to 3.3. Now I tried to sort on a long field and documents are not getting sorted on it. Sort is working when we do sorting on a facet, e.g. facet=on&facet.sort=studyid. But when we do a simple sort on documents, sort=studyid, the sort doesn't happen. Is there any bug? Regards, Rajani
Re: Query on multi valued field
Thank you. This logic works for me. Thanks a lot. Regards, Rajani Maski

On Wed, Aug 3, 2011 at 1:21 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : The query is to get only those documents which have multiple elements for : that multivalued field. : : I.e., docs 2 and 3 should be returned from the above set. The only way to do something like this is to add a field when you index your documents that contains the number, and then filter on that field using a range query. With an UpdateProcessor (or a ScriptTransformer in DIH) you can automate counting how many values there are -- but it has to be indexed to search/filter on it. -Hoss
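To make the count-field approach concrete, here is a hedged sketch; the field names (`multi`, `multi_count`) and the string/tint types are invented for illustration, not taken from the poster's schema:

```xml
<!-- schema: the multivalued field plus a sibling count field, populated
     at index time (e.g. by an UpdateProcessor) with the number of values -->
<field name="multi" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="multi_count" type="tint" indexed="true" stored="true"/>

<!-- with multi_count indexed, documents having more than one value
     match the range filter query:  fq=multi_count:[2 TO *] -->
```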
Query on multi valued field
Hi All, I have a specific requirement on a multivalued field type. The requirement is as follows: there is a multivalued field in each document which can have multiple elements or a single element. For example, consider that the following are the documents matched for, say, q=*:*

*DOC1*
<doc><arr name="multi"><str>1</str></arr></doc>

*DOC2*
<doc><arr name="multi"><str>1</str><str>3</str><str>4</str></arr></doc>

*DOC3*
<doc><arr name="multi"><str>1</str><str>2</str></arr></doc>

The query is: get only those documents which have multiple elements for that multivalued field, i.e., docs 2 and 3 should be returned from the above set. Is there any way to achieve this? Awaiting reply. Thanks Regards, Rajani
Token Factory attribute in filter tag
How does the tokenizer factory attribute within a filter work? In this link [click here] http://www.mail-archive.com/solr-dev@lucene.apache.org/msg05751.html there is a usage of a tokenizer factory in the synonym filter tag. There I see the whitespace tokenizer at index time, then a synonym filter followed by a whitespace tokenizer factory. What is the role of the whitespace tokenizer factory in this case? Say for example the data at index time is: "Tourist place xyz Infant Jesus church Coles park Cafe coffee Day". Synonyms: "Hang out, Outing, Tourist place" and "Cafe Coffee Day, CCD, Cafe shop". How do the two tokenizers together play the role of splitting the above data? Please can anyone explain one example: when a user searches "Hang out", how does the data get split to match the synonym in the list? (Any appropriate link related to this is also fine.) Thanks. Awaiting reply. Rajani
Re: Query on Synonyms feature in Solr
Erick: I have tried what you said. I need clarification on this; below is my doubt. Say I have field type:

<fieldType name="Synonymdata" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="org.apache.solr.orchsynonym.OrchSynonymFilter" synonyms="BODYTaxonomy.txt,PalpClinLocObsTaxo.txt,MacroscopicTaxonomy.txt,MicroscopicTaxonomy.txt,SpecimenTaxonomy.txt,ParameterTaxonomy.txt,StrainTaxonomy.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="org.apache.solr.orchsynonym.OrchSynonymFilter" synonyms="BODYTaxonomy.txt,PalpClinLocObsTaxo.txt,MacroscopicTaxonomy.txt,MicroscopicTaxonomy.txt,SpecimenTaxonomy.txt,ParameterTaxonomy.txt,StrainTaxonomy.txt" ignoreCase="true" expand="false"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
</fieldType>

The data indexed in this field is:
sentence 1: "tissue devitalization was noted in hepalocytes of liver"
sentence 2: "Necrosis not found in liver"
Synonyms: necrosis, tissue devitalization, cellular necrosis

How do the whitespace tokenizer and synonym filter behave? I am not able to understand it in the analysis page. Please let me know if it works like this; correct me if I am wrong:

sentence 1: "tissue devitalization was noted in hepalocytes of liver"
whitespace: tissue | devitalization | was | noted | in | hepalocytes | of | liver
Synonyms for token words: no synonym for "tissue", no synonym for "devitalization", and so on.

So will "tissue devitalization" not become a synonym for "Necrosis" (since it is mentioned in the synonym list)? If it is added as a synonym, then how is the sentence being split and the filter applied? Which happens first?

Sentence 2: "Necrosis not found in liver"
whitespace: Necrosis | not | found | in | liver
Synonyms for token words: synonyms for "Necrosis": tissue devitalization, cellular necrosis; no synonym for "not", no synonym for "found", and so on. Is this correct?

My main concern is when I have 3 sets of data like this:
tissue devitalization was observed in hepalocytes of liver
necrosis was observed in liver
Necrosis not found in liver
When I search "Necrosis not found" I need to get only the last sentence. I am not able to find out the list of tokenizers and analyzers that I need to apply in order to achieve this desired output. Awaiting reply. Rajani Maski

On Tue, Jun 14, 2011 at 3:13 PM, roySolr royrutten1...@gmail.com wrote: Maybe you can try to escape the synonyms so they're not tokenized by whitespace: Private\ schools,NGO\ Schools,Unaided\ schools -- View this message in context: http://lucene.472066.n3.nabble.com/Query-on-Synonyms-feature-in-Solr-tp3058197p3062392.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Query on Synonyms feature in Solr
On Wed, Jun 15, 2011 at 9:42 PM, Erick Erickson erickerick...@gmail.com wrote: Well, first, it is usually unnecessary to specify the synonym filter both at index and query time. I'd apply it only at query time to start, then perhaps switch to index time; see the discussion at: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46 for why index-time is preferable. Note you'll have to re-index. That said, essentially what happens (assuming the synonym filter is only in the query part) is that you have something like this as your search for "necrosis not found":

position 0: necrosis | tissue devitalization | cellular necrosis
position 1: not
position 2: found

Note that one of your three synonyms must appear in position 0, followed by the other two terms. So your example should just work. But as I said, it would probably be best if you put your synonym filter in at only index or only query time. An analogous process happens if you add synonyms at index time. Best Erick

On Wed, Jun 15, 2011 at 8:14 AM, rajini maski rajinima...@gmail.com wrote: Erick: I have tried what you said. I need clarification on this.
[...]
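Erick's advice (apply the synonym filter on one side only, preferably index time) can be sketched like this; the custom OrchSynonymFilter class and taxonomy file come from the thread, while the trimmed-down analyzer chains are illustrative:

```xml
<!-- sketch: synonym expansion at index time only;
     the query analyzer deliberately omits the synonym filter -->
<fieldType name="Synonymdata" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="org.apache.solr.orchsynonym.OrchSynonymFilter"
            synonyms="BODYTaxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A schema change like this requires a full re-index, as Erick notes.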
Re: Query on Synonyms feature in Solr
Ok. Thank you, I will consider this. One last doubt: how do I handle negation terms? As I mentioned in the mail above, if I have 3 sentences like this:
1. tissue devitalization was observed in hepalocytes of liver
2. necrosis was observed in liver
3. Necrosis not found in liver
When I search "Necrosis not found" I need to get only the last sentence, but now I get all 3 results. I am not able to find out the list of tokenizers and analyzers that I need to apply in order to achieve this desired output. Awaiting reply. Rajani Maski

On Wed, Jun 15, 2011 at 9:42 PM, Erick Erickson erickerick...@gmail.com wrote: Well, first, it is usually unnecessary to specify the synonym filter both at index and query time, I'd apply it only at query time to start, then perhaps switch to index time, see the discussion at: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46 for why index-time is preferable. Note you'll have to re-index. [...]
[...]
Query on Synonyms feature in Solr
I want the synonyms feature to be enabled on documents in Solr. I have one field in Solr that holds the content of a document (say field name: document_data). The data in that field is: "Tamil Nadu state private school fee determination committee headed by Justice Raviraja has submitted the private schools fees structure to the district educational officers on Monday". Synonyms for "private school" in the synonym flat file are: Private schools,NGO Schools,Unaided schools. Now when I search on this field as document_data=unaided schools, I need to get the results. What tokenizer and analyzer filters can I apply to the document_data field in order to get the results above? This is the indexed document:

<add>
<doc>
<field name="ID">SOLR200</field>
<field name="document_data">Tamil Nadu state private school fee determination committee headed by Justice Raviraja has submitted the private schools fees structure to the district educational officers on Monday</field>
</doc>
</add>

Right now I tried these two field types, and I couldn't get the above results:

<fieldType name="Synonym_document" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.SynonymFilter" synonyms="Taxonomy.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
</fieldType>

<fieldType name="Synonym_document" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilter" synonyms="Taxonomy.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
</analyzer>
</fieldType>

<field name="document_data" type="Synonym_document" indexed="true" multiValued="true"/>

Both didn't work for my query. Can anyone please guide me with the tokenizer and analyzer filters that I can apply to the document_data field in order to get the results above? Regards, Rajani
Re: Query on Synonyms feature in Solr
Karsten, I have tried both of the cases you mentioned below. With WhitespaceTokenizerFactory, which generates the two tokens "private" and "schools", I don't get results as required: it first splits "private schools" into "private" and "schools" and then tries to match in the synonym filter. This fails the match, because my synonym flat file has a list like this: Private schools,NGO Schools,Unaided schools. So after the split it is looking for a synonym for "private" and not for "Private Schools", and the match fails. In the case of KeywordTokenizerFactory, it takes the entire content of the field as one keyword, e.g. document_data = "Tamil Nadu state private school fee determination committee headed by Justice Raviraja has submitted the private schools fees structure to the district educational officers on Monday" is treated as one keyword. But note that "private school" is just a part of that field, or part of the sentence in that field, and thus this will also not match our search :( Any other suggestions to fix this? Regards, Rajani Maski

On Mon, Jun 13, 2011 at 4:54 PM, karsten-s...@gmx.de wrote: Hi rajini, multi-word synonyms like "private schools" normally make problems. See e.g. Solr 1.4 Enterprise Search Server, page 56: "For multi-word synonyms to work, the analysis must be applied at index-time and with expansion so that both the original words and the combined word get indexed." ... Your problem: the input of the synonym filter must be the exact !Token! "Private schools". So WhitespaceTokenizerFactory generates two tokens, "private" and "schools", and for KeywordTokenizerFactory the whole text is one token. Best regards, Karsten

-------- Original-Nachricht -------- Datum: Mon, 13 Jun 2011 16:07:35 +0530 Von: rajini maski rajinima...@gmail.com An: solr-user@lucene.apache.org Betreff: Query on Synonyms feature in Solr: I want the synonyms feature to be enabled on documents in Solr. I have one field in Solr that holds the content of a document (say field name: document_data).
[...]
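Following Karsten's point that multi-word synonyms need index-time analysis with expansion, a minimal sketch of such a field type (the type name and Taxonomy.txt come from the thread; the `tokenizerFactory` attribute, which makes the synonym file itself tokenized by whitespace so multi-token entries like "Private schools" can match, and the trimmed chains are illustrative assumptions):

```xml
<!-- sketch: multi-word synonyms expanded at index time so that both the
     original words and the synonym terms get indexed -->
<fieldType name="Synonym_document" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="Taxonomy.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this in place, a search for "unaided schools" can match a document containing "private schools", because the expanded synonym terms were written to the index.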
Code for getting distinct facet counts across shards(Distributed Process).
In Solr 1.4.1, for getting a distinct facet term count across shards, the piece of code added for getting the count of distinct facet terms across distributed process is as follows:

Class: FacetComponent.java
Function: finishStage(ResponseBuilder rb)

for (DistribFieldFacet dff : fi.facets.values()) {
  // just after this line of code
  else {
    // TODO: log error or throw exception?
    counts = dff.getLexSorted();
    int namedistint = 0;
    namedistint = rb.req.getParams().getFieldInt(dff.getKey().toString(),
        FacetParams.FACET_NAMEDISTINCT, 0);
    if (namedistint == 0)
      facet_fields.add(dff.getKey(), fieldCounts);
    if (namedistint == 1)
      facet_fields.add("numfacetTerms", counts.length);
    if (namedistint == 2) {
      NamedList resCount = new NamedList();
      resCount.add("numfacetTerms", counts.length);
      resCount.add("counts", fieldCounts);
      facet_fields.add(dff.getKey(), resCount);
    }

Is this flow correct? I have worked with a few test cases and it has worked fine, but I want to know if there are any bugs that can creep in here. (My concern is that this piece of code should not affect the rest of the logic.)

*Code flow with comments for reference:*

Function: finishStage(ResponseBuilder rb)

// in this for loop,
for (DistribFieldFacet dff : fi.facets.values()) {
  // just after this line of code
  else {
    // TODO: log error or throw exception?
    counts = dff.getLexSorted();
    int namedistint = 0; // default
    // get the value of facet.numterms from the input query
    namedistint = rb.req.getParams().getFieldInt(dff.getKey().toString(),
        FacetParams.FACET_NAMEDISTINCT, 0);
    // based on the value of facet.numterms == 0, 1 or 2, the conditions:
    // get only facet field counts
    if (namedistint == 0) {
      facet_fields.add(dff.getKey(), fieldCounts);
    }
    // get only the distinct facet term count
    if (namedistint == 1) {
      facet_fields.add("numfacetTerms", counts.length);
    }
    // get the facet field counts and the distinct term count
        if (namedistint == 2) {
          NamedList resCount = new NamedList();
          resCount.add("numfacetTerms", counts.length);
          resCount.add("counts", fieldCounts);
          facet_fields.add(dff.getKey(), resCount);
        }

Regards, Rajani

On Fri, May 27, 2011 at 1:14 PM, rajini maski rajinima...@gmail.com wrote: No such issues. Successfully integrated with 1.4.1, and it works across a single index. With the f.2.facet.numFacetTerms=1 parameter it gives the distinct count result; with f.2.facet.numFacetTerms=2 it gives the counts as well as the results for the facets. But this works only across a single index, not the distributed process. The conditions you added in SimpleFacets.java (the namedistinct count == 0, 1 and 2 conditions): should they also be added in the distributed-process function to enable it to work across shards? Rajani

On Fri, May 27, 2011 at 12:33 PM, Bill Bell billnb...@gmail.com wrote: I am pretty sure it does not yet support distributed shards. But the patch was written for 4.0, so there might be issues with running it on 1.4.1.

On 5/26/11 11:08 PM, rajini maski rajinima...@gmail.com wrote: The SOLR-2242 patch for getting the count of distinct facet terms doesn't work for distributedProcess (https://issues.apache.org/jira/browse/SOLR-2242). The error log says: HTTP ERROR 500 Problem accessing /solr/select.
Reason: For input string: "numFacetTerms"

java.lang.NumberFormatException: For input string: "numFacetTerms"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:403)
    at java.lang.Long.parseLong(Long.java:461)
    at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
    at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
    at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
    at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
    at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter
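As a side note on why distributed distinct counts are harder than the single-index case: each shard only knows its own terms, and the same term can appear on several shards, so per-shard distinct counts cannot simply be summed. The coordinator has to merge the per-shard term counts first and only then count the distinct terms. A standalone sketch of that merge step (plain Java; the class and method names are made up for illustration and are not part of the SOLR-2242 patch):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: merging per-shard facet counts before counting distinct terms.
public class DistinctFacetMerge {

    // Merge per-shard facet counts (term -> count) into one map.
    // The distinct-term count is then the size of the merged map.
    static Map<String, Integer> merge(Map<String, Integer>... shardCounts) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map<String, Integer> shard : shardCounts) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> shard1 = new HashMap<>();
        shard1.put("105", 3);
        shard1.put("179", 2);

        Map<String, Integer> shard2 = new HashMap<>();
        shard2.put("179", 1); // overlaps with shard1
        shard2.put("134", 4);

        Map<String, Integer> merged = merge(shard1, shard2);
        // 3 distinct terms, not 2 + 2 = 4, because "179" overlaps
        System.out.println(merged.size());
        System.out.println(merged.get("179"));
    }
}
```

This is also why, in the finishStage code above, counts.length is only meaningful after the per-shard responses have been merged into one term list.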
Applying synonyms increases the data size from MB to GB
Applying synonyms increased the data size from 28 MB to 10.3 GB. Before enabling synonyms on the field, the data size was 28 MB. Now, after applying synonyms, I see that the data folder size has increased to 10.3 GB. Attached is the schema field type for that field:

<fieldType name="textBODY" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <filter class="solr.SynonymFilterFactory" synonyms="BODYTaxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="ObsTaxo.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="MTaxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="MicTaxo.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="SpTaxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="ParameterTaxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="STaxo.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

None of the attached synonym files is more than 200 KB. What might be the reason for this? Are any config changes needed?

Regards, Rajani
Re: Applying synonyms increases the data size from MB to GB
I have the flat files (synonym text files), each up to 200 KB. Integrating all of them into one text file made it huge, and I wanted to maintain them separately. So, to apply all those synonyms to the same field type, I created that many filter tags for the respective synonym txt files. Is that not the right way to do it? Is there a way to apply all those files in the same tag, separated by some delimiter? Like this:

<fieldType name="textBODY" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <filter class="solr.SynonymFilterFactory" synonyms="BODYTaxonomy.txt , ClinicalObs.txt, MicTaxo.txt, SPTaxo.txt" ignoreCase="true" expand="true"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Rajani

On Mon, Jun 6, 2011 at 11:01 AM, Gora Mohanty g...@mimirtech.com wrote: On Mon, Jun 6, 2011 at 10:34 AM, rajini maski rajinima...@gmail.com wrote: Applying synonyms increased the data size from 28 MB to 10.3 GB. Before enabling synonyms on the field, the data size was 28 MB. Now, after applying synonyms, I see that the data folder size has increased to 10.3 GB. Attached is the schema field type for that field: <fieldType name="textBODY" class="solr.TextField" positionIncrementGap="100"> <analyzer> <filter class="solr.SynonymFilterFactory" synonyms="BODYTaxonomy.txt" ignoreCase="true" expand="true"/> <filter class="solr.SynonymFilterFactory" synonyms="ObsTaxo.txt" ignoreCase="true" expand="true"/> <filter class="solr.SynonymFilterFactory" synonyms="MTaxonomy.txt" ignoreCase="true" expand="true"/> [...] Could you explain what you are trying to do with multiple SynonymFilterFactory filters applied to the field? Regards, Gora
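For reference, SynonymFilterFactory does accept a comma-separated list of files in its synonyms attribute, so a single filter can load several files. A sketch (file names are the ones from this thread; note the tokenizer comes first in the analyzer, and the spaces around commas in the example above should be dropped):

```xml
<fieldType name="textBODY" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- one filter, several synonym files; no spaces around the commas -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="BODYTaxonomy.txt,ClinicalObs.txt,MicTaxo.txt,SPTaxo.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

Whether one filter with several files or several stacked filters is used, expand="true" on large taxonomies multiplies the number of indexed tokens, which is the likely cause of the index growth described above.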
Re: Query regarding Solr-2242 patch for getting distinct facet counts.
No such issues. Successfully integrated with 1.4.1, and it works across a single index. With the f.2.facet.numFacetTerms=1 parameter it gives the distinct count result; with f.2.facet.numFacetTerms=2 it gives the counts as well as the results for the facets. But this works only across a single index, not the distributed process. The conditions you added in SimpleFacets.java (the namedistinct count == 0, 1 and 2 conditions): should they also be added in the distributed-process function to enable it to work across shards? Rajani

On Fri, May 27, 2011 at 12:33 PM, Bill Bell billnb...@gmail.com wrote: I am pretty sure it does not yet support distributed shards. But the patch was written for 4.0, so there might be issues with running it on 1.4.1.

On 5/26/11 11:08 PM, rajini maski rajinima...@gmail.com wrote: The SOLR-2242 patch for getting the count of distinct facet terms doesn't work for distributedProcess (https://issues.apache.org/jira/browse/SOLR-2242). The error log says: HTTP ERROR 500 Problem accessing /solr/select.
Reason: For input string: "numFacetTerms"

java.lang.NumberFormatException: For input string: "numFacetTerms"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:403)
    at java.lang.Long.parseLong(Long.java:461)
    at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
    at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
    at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
    at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
    at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at
org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

The query I passed:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=2&facet.field=648&facet.mincount=1&facet.limit=-1&f.2.facet.numFacetTerms=1&rows=0&shards=localhost:8983/solr,localhost:8985/solrtwo

Can anyone suggest the changes I need to make to enable the same functionality for shards? When I do it across a single core, I get the correct results. I have applied the SOLR-2242 patch in Solr 1.4.1. Awaiting your reply.

Regards, Rajani
Query regarding Solr-2242 patch for getting distinct facet counts.
The SOLR-2242 patch for getting the count of distinct facet terms doesn't work for distributedProcess (https://issues.apache.org/jira/browse/SOLR-2242). The error log says:

HTTP ERROR 500 Problem accessing /solr/select.

Reason: For input string: "numFacetTerms"

java.lang.NumberFormatException: For input string: "numFacetTerms"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:403)
    at java.lang.Long.parseLong(Long.java:461)
    at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
    at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
    at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
    at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
    at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

The query I passed:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=2&facet.field=648&facet.mincount=1&facet.limit=-1&f.2.facet.numFacetTerms=1&rows=0&shards=localhost:8983/solr,localhost:8985/solrtwo

Can anyone suggest the changes I need to make to enable the same functionality for shards? When I do it across a single core, I get the correct results. I have applied the SOLR-2242 patch in Solr 1.4.1. Awaiting your reply.

Regards, Rajani
Re: Query on facet field's count
Sorry for the late reply to this thread. I implemented the same patch (SOLR-2242) in Solr 1.4.1. Now I am able to get the distinct facet terms count across a single index, but this does not work for the distributed process (sharding). Is there a recent patch with the same functionality for the distributed process?

It works for the query below:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=StudyID&facet.mincount=1&facet.limit=-1&f.StudyID.facet.namedistinct=1

It doesn't work for:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=StudyID&facet.mincount=1&facet.limit=-1&f.StudyID.facet.namedistinct=1&shards=localhost:8090/solr2

It gets the matched result set from both cores, but the facet results are only from the first core. Rajani

On Sat, Mar 12, 2011 at 10:35 AM, rajini maski rajinima...@gmail.com wrote: Thanks Bill Bell. This query works after applying the patch you referred to, is it? Please can you let me know how I need to update the current war (Apache Solr 1.4.1) file with this new patch? Thanks a lot. Thanks, Rajani

On Sat, Mar 12, 2011 at 8:56 AM, Bill Bell billnb...@gmail.com wrote:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=StudyID&facet.mincount=1&facet.limit=-1&f.StudyID.facet.namedistinct=1

would do what you want, I believe...

On 3/11/11 8:51 AM, Bill Bell billnb...@gmail.com wrote: There is my patch to do that: SOLR-2242. Bill Bell. Sent from mobile

On Mar 11, 2011, at 1:34 AM, rajini maski rajinima...@gmail.com wrote: Query on facet field results... When I run a facet query on some field, say facet=on&facet.field=StudyID, I get the list of distinct StudyID values with counts telling how many times each StudyID occurred in the search results. But I also need the count of this distinct StudyID list. Is there any Solr query to get that count?
Example:

<lst name="facet_fields">
  <lst name="StudyID">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>

I wanted a count attribute that returns the number of distinct StudyID values that occurred. In the above example it would be Count = 5 (105, 179, 107, 120, 134):

<lst name="facet_fields">
  <lst name="StudyID" COUNT="5">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>
Re: Out of memory on sorting
Explicit warming of sort fields: if you do a lot of field-based sorting, it is advantageous to add explicit warming queries to the newSearcher and firstSearcher event listeners in your solrconfig which sort on those fields, so the FieldCache is populated prior to any queries being executed by your users.

firstSearcher:

<lst>
  <str name="q">solr rocks</str>
  <str name="start">0</str>
  <str name="rows">10</str>
  <str name="sort">empID asc</str>
</lst>

On Thu, May 19, 2011 at 2:39 PM, Rohit ro...@in-rev.com wrote: Hi, we are moving to a multi-core Solr installation with each core having millions of documents, and documents would be added to the index on an hourly basis. Everything seems to run fine and I am getting the expected results and performance, except where sorting is concerned. I have an index of 13217121 documents; now when I want to get documents between two dates and then sort them by ID, Solr goes out of memory. This is with just me using the system, and we might also have simultaneous users. How can I improve this performance? Rohit
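For context, such warming queries live inside a QuerySenderListener in solrconfig.xml. A sketch (the query and the empID sort field are the example values from above, not from Rohit's schema):

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- sorting on empID here populates the FieldCache
         before any user query runs -->
    <lst>
      <str name="q">solr rocks</str>
      <str name="start">0</str>
      <str name="rows">10</str>
      <str name="sort">empID asc</str>
    </lst>
  </arr>
</listener>
```

The same <lst> can be repeated under a newSearcher listener so the cache is re-warmed after each commit.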
Re: Facet filter: how to specify OR expression?
The input parameter being assigned to the tint field is the string "or". Solr is trying to parse "or" as an integer for that field, which is why the exception occurs.

On Thu, May 12, 2011 at 4:10 PM, cnyee yeec...@gmail.com wrote: The exception says: java.lang.NumberFormatException: for input string "or". The field type is:

<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>

-- View this message in context: http://lucene.472066.n3.nabble.com/Facet-filter-how-to-specify-OR-expression-tp2930570p2931282.html Sent from the Solr - User mailing list archive at Nabble.com.
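To OR several values of an integer field in a filter query, the OR must stay inside the query syntax so that only numeric literals reach the field. A sketch (the field name popularity and the values are hypothetical, not from this thread):

```
fq=popularity:(6 OR 10)
fq=popularity:6 OR popularity:10
```

Both forms let the query parser handle "OR" and send only 6 and 10 to the TrieIntField, avoiding the NumberFormatException.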
Re: Searching accross Solr-Multicore
If the schema is different across cores, you can query across the cores only for those fields that are common. Querying across all cores for some query parameter and getting the result set in one output XML can be achieved with shards:

http://localhost:8090/solr1/select?indent=on&q=*:*&shards=localhost:8090/solr1,localhost:8090/solr2&rows=10&start=0

Regards, Rajani

On Mon, May 9, 2011 at 2:36 PM, Benyahya, Fahd fahd.benya...@netmoms.de wrote: Hi, sorry that I did not explain my issue so well. It is exactly as you described it (* Or, is it that queries are working on one core, and not on the other?). Regards, Fahd

On 9 May 2011 10:58, Gora Mohanty g...@mimirtech.com wrote: On Mon, May 9, 2011 at 2:10 PM, Benyahya, Fahd fahd.benya...@netmoms.de wrote: Hello everyone, I'm using a Solr multicore setup with 3 cores to index my web site. For testing, I'm using the Solr admin GUI to get responses. The problem is that I get results only from one core, but not from the others. [...] What do you mean by "get results only from one core, but not from the others"? * Are you querying one core, and expecting to get results from all? This is not possible: you have to either query each, or merge them into a single core. * Or, is it that queries are working on one core, and not on the other? Regards, Gora
Does the Solr enable Lemmatization [not the Stemming]
Does Solr support the lemmatization concept? I found documentation indicating that Solr enables lemmatization. Here is the link: http://www.basistech.com/knowledge-center/search/2010-09-language-identification-language-support-and-entity-extraction.pdf Can anyone help me find the jar specified in that document so that I can add it as a plugin? Jar: rlp.solr.RLPTokenizerFactory. Thanks and Regards, Rajani Maski
Re: Query regarding solr plugin.
Erick, Thank you. I could fix the problem. I started from scratch considering your advice and was successful. Thanks a lot. Rajani Maski

On Tue, Apr 26, 2011 at 5:28 PM, Erick Erickson erickerick...@gmail.com wrote: Sorry, but there's too much here to debug remotely. I strongly advise you to back way up. Undo (but save) all your changes. Start by doing the simplest thing you can: just get a dummy class in place and get it called. Perhaps create a really dumb logger method that opens a text file, writes a message, and closes the file. Inefficient, I know, but this is just to find the problem. Debugging by println is an ancient technique... Once you're certain the dummy class is called, gradually build it up to the complex class you eventually want. One problem here is that you've changed a bunch of moving parts and copied jars around (it's unclear whether you have two copies of solr-core in your classpath, for instance). So knowing exactly which one of those is the issue is very difficult, especially since you may have forgotten one of the things you did. I know when I've been trying to do something for days, lots of details get lost. Try to avoid changing the underlying Solr code; can you do what you want by subclassing instead and calling your new class? That would avoid a bunch of problems. If you can't subclass, copy the whole thing and rename it to something new and call *that* rather than re-use the SynonymFilterFactory. The only jar you should copy to the lib directory would be the one you put your new class in. I can't emphasize strongly enough that you'll save yourself lots of grief if you start with a fresh install and build up gradually rather than try to unravel the current code. It feels wasteful, but winds up being faster in my experience... Good luck! Erick

On Tue, Apr 26, 2011 at 12:41 AM, rajini maski rajinima...@gmail.com wrote: Thanks Erick. I have added my replies to the points you mentioned. I am going wrong somewhere.
I guess I need to club both the jars or something? If yes, how do I do that? I don't have much of an idea about Java and jar files. Please guide me here.

A couple of things to try.

1> When you do a 'jar -tfv yourjar', you should see output like:
1183 Sun Jun 06 01:31:14 EDT 2010 org/apache/lucene/analysis/sinks/TokenTypeSinkTokenizer.class
and your filter statement may need the whole path, in this example...
<filter class="org.apache.lucene.analysis.sinks.TokenTypeSink"/>
(note, this is just an example of the pathing; this class has nothing to do with your filter)...

I could see this output.

2> But I'm guessing your path is actually OK, because I'd expect to be seeing a class-not-found error. So my guess is that your class depends on other jars that aren't packaged up in your jar, and if you find which ones they are and copy them to your lib directory you'll be OK. Or your code is throwing an error on load. Or something like that...

There is a jar, apache-solr-core-1.4.1.jar; this has the BaseTokenFilterFactory class and the SynonymFilterFactory class. I made the changes in the second class file and created it as new. Then I created a jar of that java file and placed it in solr home/lib, and also placed apache-solr-core-1.4.1.jar in the lib folder of the solr home. [solr home - c:\orch\search\solr, lib path - c:\orch\search\solr\lib]

3> To try to understand what's up, I'd back up a step. Make a really stupid class that doesn't do anything except derive from BaseTokenFilterFactory and see if you can load that. If you can, then your process is OK and you need to find out which classes your new filter depends on. If you still can't, then we can see what else we can come up with.

I am perhaps doing the same. In the SynonymFilterFactory class, there is a function parseRules which takes delimiters as one of the input parameters. Here I changed the comma ',' to the '~' tilde symbol, and that's it.
Regards, Rajani On Mon, Apr 25, 2011 at 6:26 PM, Erick Erickson erickerick...@gmail.com wrote: Looking at things more carefully, it may be one of your dependent classes that's not being found. A couple of things to try. 1 when you do a 'jar -tfv yourjar, you should see output like: 1183 Sun Jun 06 01:31:14 EDT 2010 org/apache/lucene/analysis/sinks/TokenTypeSinkTokenizer.class and your filter statement may need the whole path, in this example... filter class=org.apache.lucene.analysis.sinks.TokenTypeSink/ (note, this is just an example of the pathing, this class has nothing to do with your filter)... 2 But I'm guessing your path is actually OK, because I'd expect to be seeing a class not found error. So my guess is that your class depends on other jars that aren't packaged up in your jar and if you find which ones they are and copy them to your lib directory you'll be OK. Or your code is throwing
Facing problem with white space in synonyms
Query related to Solr SynonymFilterFactory. I am using Solr 1.4.1. I have a field of the data type textSynonym:

<fieldType name="textSynonym" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="BODY" type="textSynonym" indexed="true" stored="true"/>

The steps followed are:
1) synonyms.txt has many words separated with a white space. Example: Hindclaw, Hind claw
2) Indexed a word: Hindclaw
3) In the analysis page, searched this word: BODY(field name):Hindclaw
4) The output obtained for Hindclaw is Hindclaw, Hind, and claw. It was separated based on white space as well.

Note: I have not used the whitespace tokenizer for this data type. What is the error?

Thanks and Regards, Rajani Maski
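One thing worth checking (a sketch under the assumption that the tokenizerFactory attribute is available in the Solr version in use): the synonym file itself is parsed with its own internal tokenizer, independent of the field's tokenizer, so an entry like "Hind claw" can still be split into two tokens even though the field uses KeywordTokenizerFactory. Telling the filter to parse the rules with the same tokenizer may give the expected single-token behaviour:

```xml
<fieldType name="textSynonym" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- tokenizerFactory controls how entries in synonyms.txt
         are tokenized when the rules are parsed -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The analysis page is the quickest way to confirm whether "Hind claw" now survives as one token.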
Re: Query regarding solr plugin.
Erick, Thanks. It was actually a copy mistake. Anyway, I did a redo of all the below-mentioned steps. I had given the class name as:

<filter class="pointcross.orchSynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

I did it again now, following a few different steps from this link: http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.jdt.doc.user/tasks/tasks-32.htm

1) Created a new package in the src folder: *org.apache.pointcross.synonym*. This has the class Synonym.java.
2) Did a right click on the same package and selected the export option - Java tab - JAR File - selected the path for the package - Finish.
3) This created the jar file in the specified location. Then, in cmd, ran jar tfv org.apache.pointcross.synonym; the following was the output in cmd:

:\Apps\Rajani Eclipse\Solr141_jarjar - tfv org.apache.pointcross.synonym.Synonym.jar
25 Mon Apr 25 11:32:12 GMT+05:30 2011 META-INF/MANIFEST.MF
383 Thu Apr 14 16:36:00 GMT+05:30 2011 .project
2261 Fri Apr 22 16:26:12 GMT+05:30 2011 .classpath
1017 Thu Apr 21 16:34:20 GMT+05:30 2011 jarLog.jardesc

4) Placed the same jar file in the solr home/lib folder. In solrconfig.xml enabled <lib dir="./lib" /> and in the schema:

<filter class="synonym.Synonym" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

5) Restarted Tomcat: http://localhost:8097/finding1

Error:

SEVERE: org.apache.solr.common.SolrException: Error loading class 'pointcross.synonym.Synonym'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:388)
    at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:84)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
    at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:835)
    at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)

I am basically trying to make this jar's functionality available to Solr. Please let me know the mistake here.
Rajani On Fri, Apr 22, 2011 at 6:29 PM, Erick Erickson erickerick...@gmail.comwrote: First I appreciate your writeup of the problem, it's very helpful when people take the time to put in the details I can't reconcile these two things: {{{filter class=org.apache.pco.search.orchSynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ as org.apache.solr.common.SolrException: Error loading class 'pointcross.orchSynonymFilterFactory' at}}} This seems to indicate that your config file is really looking for pointcross.orchSynonymFilterFactory rather than org.apachepco.search.orchSynonymFilterFactory. Do you perhaps have another definition in your config pointcross.orchSynonymFilterFactory? Try running jar -tfv your jar file to see what classes are actually defined in the file in the solr lib directory. Perhaps it's not what you expect (Perhaps Eclipse did something unexpected). Given the anomaly above (the error reported doesn't correspond to the class you defined) I'd also look to see if you have any old jars lying around that you somehow get to first. Finally, is there any chance that your pointcross.orchSynonymFilterFactory is a dependency of org.apachepco.search.orchSynonymFilterFactory? In which case Solr may be finding org.apachepco.search.orchSynonymFilterFactory but failing to load a dependency (that would have to be put in the lib or the jar). Hope that helps Erick On Fri, Apr 22, 2011 at 3:00 AM, rajini maski rajinima...@gmail.com wrote: One doubt regarding adding the solr plugin. I have a new java file created that includes few changes in SynonymFilterFactory.java. I want this java file to be added to solr instance. 
I created a package as: org.apache.pco.search. This includes the class

OrcSynonymFilterFactory extends BaseTokenFilterFactory implements ResourceLoaderAware { code... }

Packages included:

import org.apache.solr.analysis.*;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.solr.common.ResourceLoader;
import org.apache.solr.common.util.StrUtils;
import org.apache.solr.util.plugin.ResourceLoaderAware;
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

I exported this java file in Eclipse, selecting the File tab - Export to package org.apache.pco.search - OrchSynonymFilterFactory.java, and generated the jar file org.apache.pco.orchSynonymFilterFactory.jar. This jar file was placed in the /lib folder of the Solr home instance. Changes in solrconfig: <lib dir="./lib" />. Now I want to add this in the schema field type for the synonym filter as:

<filter class="org.apache.pco.search.orchSynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

But I am not able to do it. It has an error
Re: Query regarding solr plugin.
Thanks Erick. I have added my replies to the points you mentioned. I am going wrong somewhere. Do I need to club both the jars or something? If yes, how do I do that? I don't have much experience with java and jar files. Please guide me here. A couple of things to try. 1) When you do a 'jar -tfv yourjar', you should see output like: 1183 Sun Jun 06 01:31:14 EDT 2010 org/apache/lucene/analysis/sinks/TokenTypeSinkTokenizer.class and your filter statement may need the whole path, as in this example: <filter class="org.apache.lucene.analysis.sinks.TokenTypeSink"/> (note, this is just an example of the pathing; this class has nothing to do with your filter)... I could see this output. 2) But I'm guessing your path is actually OK, because otherwise I'd expect to see a class-not-found error. So my guess is that your class depends on other jars that aren't packaged up in your jar, and if you find which ones they are and copy them to your lib directory you'll be OK. Or your code is throwing an error on load. Or something like that... There is a jar, apache-solr-core-1.4.1.jar, which has the BaseTokenFilterFactory class and the SynonymFilterFactory class. I made changes in the second class file and created it as a new class. Then I created a jar of that java file and placed it in solr home/lib, and also placed the apache-solr-core-1.4.1.jar file in the lib folder of solr home. [solr home - c:\orch\search\solr, lib path - c:\orch\search\solr\lib] 3) To try to understand what's up, I'd back up a step. Make a really simple class that doesn't do anything except derive from BaseTokenFilterFactory and see if you can load that. If you can, then your process is OK and you need to find out what classes your new filter depends on. If you still can't, then we can see what else we can come up with. That is roughly what I am doing. In SynonymFilterFactory there is a parse-rules function that takes a delimiter as one of its input parameters. I changed the comma ',' to the tilde '~' symbol, and that's it.
Regards, Rajani On Mon, Apr 25, 2011 at 6:26 PM, Erick Erickson erickerick...@gmail.comwrote: Looking at things more carefully, it may be one of your dependent classes that's not being found. A couple of things to try. 1 when you do a 'jar -tfv yourjar, you should see output like: 1183 Sun Jun 06 01:31:14 EDT 2010 org/apache/lucene/analysis/sinks/TokenTypeSinkTokenizer.class and your filter statement may need the whole path, in this example... filter class=org.apache.lucene.analysis.sinks.TokenTypeSink/ (note, this is just an example of the pathing, this class has nothing to do with your filter)... 2 But I'm guessing your path is actually OK, because I'd expect to be seeing a class not found error. So my guess is that your class depends on other jars that aren't packaged up in your jar and if you find which ones they are and copy them to your lib directory you'll be OK. Or your code is throwing an error on load. Or something like that... 3 to try to understand what's up, I'd back up a step. Make a really stupid class that doesn't do anything except derive from BaseTokenFilterFacotry and see if you can load that. If you can, then your process is OK and you need to find out what classes your new filter depend on. If you still can't, then we can see what else we can come up with.. Best Erick On Mon, Apr 25, 2011 at 2:34 AM, rajini maski rajinima...@gmail.com wrote: Erick , * * * Thanks.* It was actually a copy mistake. Anyways i did a redo of all the below mentioned steps. I had given class name as filter class=pointcross.orchSynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ I did it again now following few different steps following this link : http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.jdt.doc.user/tasks/tasks-32.htm 1 ) Created new package in src folder . 
*org.apache.pointcross.synonym*. This has the class Synonym.java. 2) Did a right click on the same package and selected the export option > Java tab > JAR File > selected the path for the package > Finish. 3) This created the jar file in the specified location. Then ran jar tfv on it in cmd; the following was the output:

C:\Apps\Rajani Eclipse\Solr141_jar>jar -tfv org.apache.pointcross.synonym.Synonym.jar
  25 Mon Apr 25 11:32:12 GMT+05:30 2011 META-INF/MANIFEST.MF
 383 Thu Apr 14 16:36:00 GMT+05:30 2011 .project
2261 Fri Apr 22 16:26:12 GMT+05:30 2011 .classpath
1017 Thu Apr 21 16:34:20 GMT+05:30 2011 jarLog.jardesc

4) Now placed the same jar file in the solr home/lib folder. Solrconfig.xml has <lib dir="./lib" /> enabled, and in the schema: <filter class="synonym.Synonym" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> 5) Restarted tomcat: http://localhost:8097/finding1 Error: SEVERE: org.apache.solr.common.SolrException: Error loading class 'pointcross.synonym.Synonym' at org.apache.solr.core.SolrResourceLoader.findClass
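Two things stand out in the steps above: the jar tfv listing shows only Eclipse metadata (.project, .classpath, jarLog.jardesc) and no compiled .class entries, so the export appears to have packaged project files rather than compiled classes; and the schema refers to synonym.Synonym rather than a fully qualified class name. A sketch of what the schema entry would need to look like, assuming the package really is org.apache.pointcross.synonym:

```xml
<!-- schema.xml (sketch): the jar must contain
     org/apache/pointcross/synonym/Synonym.class, and the class
     attribute must match that package path exactly -->
<filter class="org.apache.pointcross.synonym.Synonym"
        synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
```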
Query regarding solr plugin.
One doubt regarding adding a Solr plugin. I have a new java file created that includes a few changes to SynonymFilterFactory.java. I want this java file to be added to the Solr instance. I created a package: org.apache.pco.search. This includes the OrcSynonymFilterFactory java class, which extends BaseTokenFilterFactory and implements ResourceLoaderAware {code.} Packages imported: import org.apache.solr.analysis.*; import org.apache.lucene.analysis.Token; import org.apache.lucene.analysis.TokenStream; import org.apache.solr.common.ResourceLoader; import org.apache.solr.common.util.StrUtils; import org.apache.solr.util.plugin.ResourceLoaderAware; import java.io.File; import java.io.IOException; import java.io.Reader; import java.io.StringReader; import java.util.ArrayList; import java.util.List; I exported this java file in Eclipse (File tab > Export, package org.apache.pco.search > OrchSynonymFilterFactory.java) and generated the jar file org.apache.pco.orchSynonymFilterFactory.jar. This jar file is placed in the /lib folder of the Solr home instance. Change in solrconfig: <lib dir="./lib" /> Now I want to add this in the schema fieldType for the synonym filter as <filter class="org.apache.pco.search.orchSynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> But I am not able to do it. It gives an error: org.apache.solr.common.SolrException: Error loading class 'pointcross.orchSynonymFilterFactory' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:388) at org.apache.solr.util.plugin.AbstractPluginLoader Please can anyone tell me what mistake I am making here and the fix for it? Rajani
How to avoid Lock file generation - solr 1.4.1
I am using Solr 1.4.1 (Windows OS) and below are the settings in my solrconfig file:

<writeLockTimeout>1000</writeLockTimeout>
<commitLockTimeout>1</commitLockTimeout>
<ramBufferSizeMB>32</ramBufferSizeMB>
<maxMergeDocs>1</maxMergeDocs>
<lockType>native</lockType>

While writing the index, I am doing the post procedure: posting the xml with a solr/update http request. I am getting the following error: SEVERE: Could not start SOLR. Check solr/home property java.nio.channels.OverlappingFileLockException at sun.nio.ch.FileChannelImpl$SharedFileLockTable.checkList(Unknown Source) at sun.nio.ch.FileChannelImpl$SharedFileLockTable.add(Unknown Source) at sun.nio.ch.FileChannelImpl.tryLock(Unknown Source) at java.nio.channels.FileChannel.tryLock(Unknown Source) at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:233) at org.apache.lucene.store.Lock.obtain(Lock.java:73) at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1545) at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1402) at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190) at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98) at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173) at org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376) at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:845) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:486) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422) at 
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:115) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4071) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4725) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601) at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:675) at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:601) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502) at org.apache.catalina.startup.HostConfig.check(HostConfig.java:1383) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:306) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142) at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1385) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1649) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1658) at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1638) at java.lang.Thread.run(Unknown Source) What are the correct settings to be made for avoiding this lock file?
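An OverlappingFileLockException at startup with native locking usually means two IndexWriters in the same JVM tried to lock the same index directory (for example, two deployed webapps or cores pointing at one dataDir), since native locks are released by the OS on a crash and should not normally go stale. A sketch of the relevant <mainIndex> settings for Solr 1.4.x, not a verified fix for this exact setup:

```xml
<mainIndex>
  <!-- keep exactly one writer per index directory; unlockOnStartup
       clears a leftover lock from a crashed or killed JVM at startup -->
  <lockType>native</lockType>
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>
```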
Error while performing facet search across shards..
An error occurs while performing a facet search across shards. The following is the query: http://localhost:8090/InstantOne/select/?indent=on&shards=localhost:8090/InstantOne,localhost:8091/InstantTwo,localhost:8093/InstantThree&q=filenumber:10&facet=on&facet.field=studyId No studyId fields are blank across any shards. I have the apache solr 1.4.1 version set up for this. The error is: common.SolrException log SEVERE: java.lang.NullPointerException at org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:331) at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:232 What might be the reason for this? Is any particular configuration or set-up needed? Awaiting reply. Rajani
Query on facet field’s count
Query on facet field results... When I run a facet query on some field, say facet=on&facet.field=StudyID, I get the list of distinct StudyID values with counts telling how many times each study occurred in the search results. But I also need the count of these distinct StudyID values. Is there any Solr query to get it? Example:

<lst name="facet_fields">
  <lst name="StudyID">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>

I want a count attribute that returns the number of different StudyID values that occurred. In the above example it would be Count = 5 (105, 179, 107, 120, 134):

<lst name="facet_fields">
  <lst name="StudyID" COUNT="5">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>
Index Defaults Section and main index section that is in solrconfig.xml
Any documentation on the indexDefaults section and mainIndex section in solrconfig.xml? (Solr 1.4.1) I want to understand the terminology of these parameters and how they are interconnected:

<mergeFactor>10</mergeFactor>
<ramBufferSizeMB>32</ramBufferSizeMB>
<maxBufferedDocs>1000</maxBufferedDocs>
<maxMergeDocs>2147483647</maxMergeDocs>

*I read the document on the Solr wiki. From this I understand that* if you set mergeFactor to 10, a new segment will be created on disk for every 1000 (or maxBufferedDocs) documents added to the index. When the 10th segment of size 1000 is added, all 10 will be merged into a single segment of size 10,000, and so on in powers of 10. *How does the maxMergeDocs parameter* act here and affect the index? And how is ramBufferSizeMB checked? Any documentation would be a great help! And what are better Solr caching parameters for the same? Currently I have: <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> What are the 512s? 512 in KB or MB? Thanks and Regards, Rajani Maski
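The logarithmic merge pattern described above can be sketched with a little arithmetic: every maxBufferedDocs documents flush one level-0 segment, and each time mergeFactor same-level segments accumulate they merge into one segment at the next level. A toy illustration in plain Java (made-up names, not Lucene's actual merge-policy code; it ignores ramBufferSizeMB, and maxMergeDocs would merely cap how large a merged segment may grow):

```java
public class MergeSim {
    // Counts on-disk segments in steady state under a logarithmic merge:
    // docsAdded/maxBufferedDocs level-0 flushes; at each level, the
    // remainder mod mergeFactor stays behind while groups of mergeFactor
    // segments are promoted (merged) to the next level.
    public static int segmentsAfter(long docsAdded, int maxBufferedDocs, int mergeFactor) {
        long segments = docsAdded / maxBufferedDocs; // level-0 segments flushed
        int total = 0;
        while (segments > 0) {
            total += segments % mergeFactor; // segments left un-merged at this level
            segments /= mergeFactor;         // segments merged up to the next level
        }
        return total;
    }

    public static void main(String[] args) {
        // 25,000 docs, flush every 1,000, mergeFactor 10:
        // 25 flushes -> two merged 10k segments + five 1k segments = 7
        System.out.println(segmentsAfter(25_000, 1_000, 10)); // prints 7
    }
}
```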
Re: Query on facet field's count
Thanks Bill Bell. This query works after applying the patch you referred to, is it? Please can you let me know how I need to update the current war file (apache solr 1.4.1) with this new patch? Thanks a lot. Thanks, Rajani On Sat, Mar 12, 2011 at 8:56 AM, Bill Bell billnb...@gmail.com wrote: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=StudyID&facet.mincount=1&facet.limit=-1&f.StudyID.facet.namedistinct=1 would do what you want, I believe... On 3/11/11 8:51 AM, Bill Bell billnb...@gmail.com wrote: There is my patch to do that: SOLR-2242 Bill Bell Sent from mobile On Mar 11, 2011, at 1:34 AM, rajini maski rajinima...@gmail.com wrote: Query on facet field results... When I run a facet query on some field, say facet=on&facet.field=StudyID, I get the list of distinct StudyID values with counts telling how many times each study occurred in the search results. But I also need the count of these distinct StudyID values. Is there any Solr query to get it? Example:

<lst name="facet_fields">
  <lst name="StudyID">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>

I want a count attribute that returns the number of different StudyID values that occurred. In the above example it would be Count = 5 (105, 179, 107, 120, 134):

<lst name="facet_fields">
  <lst name="StudyID" COUNT="5">
    <int name="105">135164</int>
    <int name="179">79820</int>
    <int name="107">70815</int>
    <int name="120">37076</int>
    <int name="134">35276</int>
  </lst>
</lst>
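Without the SOLR-2242 patch, the same number can also be derived client-side: facet_fields already lists one entry per distinct value, so the distinct count is just the number of entries with a non-zero count. A sketch in plain Java (the class and method names are made up; it assumes the facet values and counts have already been parsed out of the response):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetDistinctCount {
    // Given facet value -> occurrence count (as listed under facet_fields),
    // the distinct-value count is the number of entries whose count is > 0
    // (entries with count 0 can appear when facet.mincount=0).
    public static int distinctCount(Map<String, Integer> facetCounts) {
        int n = 0;
        for (int count : facetCounts.values()) {
            if (count > 0) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, Integer> studyId = new LinkedHashMap<>();
        studyId.put("105", 135164);
        studyId.put("179", 79820);
        studyId.put("107", 70815);
        studyId.put("120", 37076);
        studyId.put("134", 35276);
        System.out.println(distinctCount(studyId)); // prints 5
    }
}
```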
Re: Use of multiple tomcat instance and shards.
I have considered the RAM usage points on the Solr wiki, and yes, I have many facet queries fired every time; this might be one of the reasons. I did give -Xmx1024m and the error still occurred, though only 2-3 times after many search queries were fired. But then the system slows down, so I need an alternative. Tommaso, please can you share any link that explains how to enable and do load balancing on the machines you mentioned above? On Tue, Mar 8, 2011 at 4:11 PM, Jan Høydahl jan@cominvent.com wrote: Having 2GB physical memory on the box, I would allocate -Xmx1024m to Java as a starting point. The other thing you could do is try to trim your config to use less memory. Are you using many facets? String sorts? Wildcards? Fuzzy? Storing or returning more fields than needed? http://wiki.apache.org/solr/SolrPerformanceFactors#RAM_Usage_Considerations -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 8. mars 2011, at 07.40, rajini maski wrote: Regarding increasing the Java heap memory: I have only 2GB RAM, so my default memory configuration is --JvmMs 128 --JvmMx 512. I have a single Solr data index of up to 6GB. Now when I fire searches very often against this index, after some time I get a 'java heap space out of memory' error and the search does not return results. What are the possibilities to fix this error? (I cannot increase heap memory.) How about having another tomcat instance running (how does this work?), or is it by configuring shards? What might help me fix this search failure? Rajani
Re: Use of multiple tomcat instance and shards.
Thank you all . Tommaso , Thanks. I will follow the links you suggested. Erick, It is Solr 1.4.1 .. Regards, Rajani Maski On Tue, Mar 8, 2011 at 10:16 PM, Tommaso Teofili tommaso.teof...@gmail.comwrote: Just one more hint, I didn't mention it in the previous email since I imagine the scenario you explained doesn't allow it but anyways you could also check Solr Cloud and its distributed requests [1]. Cheers, Tommaso [1] : http://wiki.apache.org/solr/SolrCloud#Distributed_Requests 2011/3/8 Tommaso Teofili tommaso.teof...@gmail.com Hi Rajani, i 2011/3/8 rajini maski rajinima...@gmail.com Tommaso, Please can you share any link that explains me about how to enable and do load balancing on the machines that you did mention above..? if you're querying Solr via SolrJ [1] you could use the LBHttpSolrServer [2] otherwise, if you still want Solr to be responsible for load balancing, implement a custom handler which wraps it (see [3]). Consider also that this load balancing often gets done using a VIP [4] or an Apache HTTP server in front of Solr. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/Solrj [2] : http://wiki.apache.org/solr/LBHttpSolrServer [3] : http://markmail.org/thread/25jrko5s7wlmzjf7 [4] : http://en.wikipedia.org/wiki/Virtual_IP_address
Use of multiple tomcat instance and shards.
Regarding increasing the Java heap memory: I have only 2GB RAM, so my default memory configuration is --JvmMs 128 --JvmMx 512. I have a single Solr data index of up to 6GB. Now when I fire searches very often against this index, after some time I get a 'java heap space out of memory' error and the search does not return results. What are the possibilities to fix this error? (I cannot increase heap memory.) How about having another tomcat instance running (how does this work?), or is it by configuring shards? What might help me fix this search failure? Rajani
Re: Full Text Search with multiple index and complex requirements
I have tried to answer your many questions; I like the way you framed them. Answers are attached to the questions. Thank you Rajini, for your interest :) A) The data for every user is totally unrelated to every other user. This gives us a few advantages: 1. we can keep our indexes small in size (using cores). 2. merging/compacting a fragmented index will take less time (merging is simple, one query). 3. if some indexes become inaccessible for whatever reason (corruption?), only those users get affected. Other users are unaffected and the service is available for them. Yes, it affects only that index; others are unaffected. How many cores can we safely have on a machine? How much is too much in this case? B) Each user can have a few different types of data. So, our index hierarchy will look something like: /user1/type1/index files /user1/type2/index files /user2/type1/index files /user3/type3/index files I am not clear on this point. Example: say you have 2 users. user1 types - Name, Emailaddress, Phone number; user2 types - Name, Emailaddress, ID. So you want user1 to have 3 indexes plus user2 to have 3 indexes, Total = 6 indexes?? If user1's phone number is the only number type in the data index, then the schema will have only one number data type. I just meant to say, like this: /myself/docs/index_docs /myself/spreadsheets/index_spreads /yourself/docs/index_docs /yourself/spreadsheets/index_spreads You get the idea, right? C) Often, probably with every iteration, we'll add types of data that can be indexed. So we want an efficient/programmatic way to add schemas for different types. We would like to avoid having a fixed schema for indexing. Say you add a type DATE. Before you start indexing for this date type, you need to update your schema with this data type to enable indexing, correct? So this won't need a fixed schema defined beforehand; we can add it only when you want to add this data type. But this requires a service restart.
This won't affect the current index other than adding to it. Today I am adding only docs and spreadsheets; tomorrow I may want to add something else, something from an RDBMS for example, and then I don't want to sit tinkering with schema.xml, and I wouldn't like a service restart either... -- On Fri, Mar 4, 2011 at 7:16 PM, Shrinath M shrinat...@webyog.com wrote: We are building an application which will require us to index data for each of our users so that we can provide full text search on their data. Here are some notable things about the application: A) The data for every user is totally unrelated to every other user. This gives us a few advantages: 1. we can keep our indexes small in size. 2. merging/compacting a fragmented index will take less time. 3. if some indexes become inaccessible for whatever reason (corruption?), only those users get affected. Other users are unaffected and the service is available for them. B) Each user can have a few different types of data. We want to keep each type in separate folders, for the same reasons as above. So, our index hierarchy will look something like: /user1/type1/index files /user1/type2/index files /user2/type1/index files /user3/type3/index files C) Often, probably with every iteration, we'll add types of data that can be indexed. So we want an efficient/programmatic way to add schemas for different types. We would like to avoid having a fixed schema for indexing. I like Lucene's schema-less way of indexing stuff. D) The users can fire search queries which will search either: - Within a specific type for that user - Across all types for that user: in this case we want to fire a parallel query like Lucene has (ParallelMultiSearcher http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/search/ParallelMultiSearcher.html ) E) We require real time updates for the index. *This is a must.* F) We are planning to shard our index across multiple machines. For this also, we want: if a shard becomes inaccessible, only those users whose data resides in that shard get affected. Other users get uninterrupted service. We were considering Lucene, Sphinx and Solr to do this. This is what we found: - Sphinx: No efficient way to do A, B, C, F. Or is there? - Lucene: Everything looks possible, as it is very low level. But we have to write wrappers to do F and build a communication layer between the web server and the search server. - Solr: Not sure if we can do A, B, C easily. Can we? So, my question is: what is the best software for the above requirements? I am inclined more towards Solr, and then Lucene, if we get all the requirements. -- Regards Shrinath.M
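Requirements A-C map fairly directly onto Solr's multicore setup: one core per user/type directory, each with its own schema, and cores can be created at runtime through the CoreAdmin API instead of by editing files. A sketch of a Solr 1.4-style solr.xml for that layout (core names and paths are illustrative, not from the thread):

```xml
<!-- solr.xml: one core per user/type; each instanceDir carries its own
     conf/schema.xml, so types can have different schemas -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores">
    <core name="user1_docs" instanceDir="user1/docs"/>
    <core name="user1_spreadsheets" instanceDir="user1/spreadsheets"/>
    <core name="user2_docs" instanceDir="user2/docs"/>
  </cores>
</solr>
```

With adminPath set, new cores can then be added without a full restart, e.g. via a CoreAdmin request such as /solr/admin/cores?action=CREATE&name=...&instanceDir=... (parameters as in the CoreAdmin documentation).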
Re: Solr under Tomcat
Sai, The index directory will be under your Solr home's data directory (by default Solr_home/data). The path for this directory can be set wherever you want by changing the dataDir path in the config XML that is present in the /conf folder. You need to stop the tomcat service to delete this directory and then restart tomcat. Solr itself regenerates the data folder at the path specified in the config if the folder is not available. The folder usually has two sub-folders: index and spellchecker. Regards, Rajani Maski On Wed, Mar 2, 2011 at 7:39 PM, Thumuluri, Sai sai.thumul...@verizonwireless.com wrote: Good Morning, We have deployed Solr 1.4.1 under Tomcat and it works great; however, I cannot find where the index (directory) is created. I set solr home in web.xml under /webapps/solr/WEB-INF/, but am not sure where the data directory is. I have a need to completely re-index the site, and it would help for me to stop solr, delete the index directory, and restart solr prior to re-indexing the content. Thanks, Sai Thumuluri
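The dataDir element mentioned above lives in solrconfig.xml (in the conf folder), while the data directory itself defaults to a sibling of conf. A sketch using the stock example value, not Sai's actual setup:

```xml
<!-- solrconfig.xml: the on-disk Lucene index ends up under <dataDir>/index;
     if the element is absent, Solr defaults to ./data under the Solr home -->
<dataDir>${solr.data.dir:./solr/data}</dataDir>
```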
Create a tomcat service.
Does anybody have a script to create a tomcat service? I'm trying to set my system up to run multiple instances of tomcat at the same time (on different ports, obviously), and can't get the service to create properly. I tried to follow the steps mentioned in this link: http://doc.ittrium.com/ittrium/visit/A1x66x1y1x10ddx1x68y1x1209x1x68y1x1214x1x7d but was not successful in getting this done. The service.bat file refers to an exe that is not available in the zip. Any help or suggestions? Thanks, Rajani.
Tomcat EXE Source Code
Can anybody help me get the source code of the Tomcat exe file, i.e., the source code of the installation exe? Thanks..
Re: Tomcat EXE Source Code
I am trying to configure multiple tomcat instances, with that many services configured too. Right now the tomcat exe lets me create only one. If the same exe is run again and configured at another destination folder, it throws an exception that the service already exists. How can I fix this problem? Any suggestions? On Fri, Feb 25, 2011 at 3:18 PM, Jan Høydahl jan@cominvent.com wrote: Why do you want it? Try asking on the Tomcat list :) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 25. feb. 2011, at 09.16, rajini maski wrote: Can anybody help me get the source code of the Tomcat exe file, i.e., the source code of the installation exe? Thanks..
Re: Configure 2 or more Tomcat instances.
I created 2 tomcat instances, with respective folders tomcat0 and tomcat1, and each server.xml edited with different port numbers (all 3 ports). Now when I try to connect to http://localhost:8090/ or http://localhost:8091/, the webpage fails to open in both cases. Is there something else that I need to do? When I try to run bootstrap.jar (present in //tomcat/bin/) through the command prompt, I get an error. Run command: C:\Program Files\Apache Software Foundation\tomcat6.0\bin>java -jar bootstrap.jar Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/catalina/startup/Bootstrap (Unsupported major.minor version 49.0) at java.lang.ClassLoader.defineClass0(Native Method) at java.lang.ClassLoader.defineClass(Unknown Source) at java.security.SecureClassLoader.defineClass(Unknown Source) at java.net.URLClassLoader.defineClass(Unknown Source) at java.net.URLClassLoader.access$100(Unknown Source) at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClassInternal(Unknown Source) Any idea why this error occurs? I have jdk1.6.0_02 and tomcat 6 set up. Regards Rajani Maski On Tue, Feb 22, 2011 at 7:53 PM, Paul Libbrecht p...@hoplahup.net wrote: Rajini, you need to make the (~3) ports defined in conf/server.xml different. paul Le 22 févr. 2011 à 12:15, rajini maski a écrit : I have a tomcat6.0 instance running in my system, with connector port 8090, shutdown port 8005, AJP/1.3 port 8009 and redirect port 8443 in server.xml (path = C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\server.xml). How do I configure one more independent tomcat instance on the same system? I went through many sites.. 
but couldn't fix this. If anyone knows the proper configuration steps, please reply. Regards, Rajani Maski
Configure 2 or more Tomcat instances.
I have a tomcat6.0 instance running in my system, with connector port 8090, shutdown port 8005, AJP/1.3 port 8009 and redirect port 8443 in server.xml (path = C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\server.xml). How do I configure one more independent tomcat instance on the same system? I went through many sites but couldn't fix this. If anyone knows the proper configuration steps, please reply. Regards, Rajani Maski
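For a second, independent instance, the usual approach is a separate CATALINA_BASE directory with its own conf/server.xml in which every listening port differs from the first instance's. A sketch with example port values only (not a complete server.xml):

```xml
<!-- second instance's conf/server.xml: shutdown, HTTP connector and AJP
     ports all changed from the first instance's 8005/8090/8009 -->
<Server port="8006" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8091" protocol="HTTP/1.1" redirectPort="8444"/>
    <Connector port="8010" protocol="AJP/1.3" redirectPort="8444"/>
    <!-- Engine/Host elements as in the stock server.xml -->
  </Service>
</Server>
```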
Re: Faceting Query
I am also working on the same feature of Solr 4.0, and I have a doubt about the results I am getting. I will post the cases here. If anyone knows why it is so, please revert back... I ran a normal facet query with q=*:* and facet=on&facet.field=stock&facet.field=place&facet.field=quality&facet.mincount=1. The results I got are:

<facet_fields>
  <stock>
    <rice>10</rice> <bean>10</bean> <wheat>10</wheat> <jowar>10</jowar>
  </stock>
  <place>
    <bangalore>10</bangalore> <Kolar>10</Kolar>
  </place>
  <quality>
    <standard>10</standard> <high>10</high>
  </quality>
</facet_fields>

Now when I do the facet.pivot query with the same q parameter (q=*:*) and the same data set, query facet.pivot=stock,place,quality&facet.mincount=1, the result I get is like this:

<lst> rice bangalore high </lst>
<lst> bean bangalore standard </lst>
<lst> jowar </lst>

The point is: why am I not getting a result hierarchy for wheat when it appears in the flat faceting above? Awaiting reply Regards, Rajani Maski On Mon, Feb 14, 2011 at 4:18 PM, rajini maski rajinima...@gmail.com wrote: This feature works in the SOLR 4.0 release. You can follow this link to see how it works: http://solr.pl/en/2010/10/25/hierarchical-faceting-pivot-facets-in-trunk/ Regards Rajani Maski On Mon, Feb 14, 2011 at 4:05 PM, Isha Garg isha.g...@orkash.com wrote: On Friday 11 February 2011 11:34 PM, Gora Mohanty wrote: On Thu, Feb 10, 2011 at 12:21 PM, Isha Garg isha.g...@orkash.com wrote: What is the facet.pivot field? Please explain with an example. Does http://wiki.apache.org/solr/SimpleFacetParameters#facet.pivot not help? 
Regards, Gora No, it is not showing any pivot results in my case: http://localhost:8984/solr/worldNews/select/?q=*%3A*&version=2.2&start=0&rows=0&indent=on&facet.pivot=category,country,KeyLocation&facet.pivot=country,category&facet=true&facet.field=category&wt=json Output is:

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "facet":"true",
      "indent":"on",
      "start":"0",
      "q":"*:*",
      "facet.field":"category",
      "wt":"json",
      "facet.pivot":["category,country,KeyLocation", "country,category"],
      "version":"2.2",
      "rows":"0"}},
  "response":{"numFound":6775,"start":0,"docs":[]},
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "category":[
        "Counterfeiting and Piracy",2367,
        "Social Unrest",2143,
        "Security Measures",1064,
        "Fraud and Cheating",356,
        "Naxelites",266,
        "Terrorism",243,
        "Sex Crime",232,
        "Shiv Sena",76,
        "Major Crime",23,
        "Drug Running and Organized Crime",5]},
    "facet_dates":{}}}
Re: Solr -File Based Spell Check
Yeah.. I want to use this spell-check only. I want to create the dictionary myself and give it as input to Solr, because my indexes also have misspelled content, so I want Solr to refer to this file and not an auto-generated one. How do I get this done? I will try the spell check as suggested by Michael... One more main thing I wanted to know: how do I extract the dictionary generated by default? How do I read these .cfs files generated in the index folder? Please reply if you know anything related to this. Awaiting reply On Mon, Dec 6, 2010 at 7:33 PM, Erick Erickson erickerick...@gmail.com wrote: Are you sure you want spellcheck/autosuggest? Because what you're talking about almost sounds like synonyms. Best Erick On Mon, Dec 6, 2010 at 1:37 AM, rajini maski rajinima...@gmail.com wrote: How does the solr file based spell check work? How do we need to enter data in spelling.txt? I am not clear about its functionality. If anyone knows, please reply. I want to index a word = Wear. But while searching I search = Dress. I want to get results for Wear. How do I apply this functionality? Awaiting Reply
Re: Solr -File Based Spell Check and Read .cfs file generated
Does anyone know about it? How do I extract the dictionary generated by default? How do I read these .cfs files generated in the index folder? Awaiting reply On Mon, Dec 6, 2010 at 7:54 PM, rajini maski rajinima...@gmail.com wrote: Yeah.. I want to use this spell-check only. I want to create the dictionary myself and give it as input to Solr, because my indexes also have misspelled content, so I want Solr to refer to this file and not an auto-generated one. How do I get this done? I will try the spell check as suggested by Michael... One more main thing I wanted to know: how do I extract the dictionary generated by default? How do I read these .cfs files generated in the index folder? Please reply if you know anything related to this. Awaiting reply On Mon, Dec 6, 2010 at 7:33 PM, Erick Erickson erickerick...@gmail.com wrote: Are you sure you want spellcheck/autosuggest? Because what you're talking about almost sounds like synonyms. Best Erick On Mon, Dec 6, 2010 at 1:37 AM, rajini maski rajinima...@gmail.com wrote: How does the solr file based spell check work? How do we need to enter data in spelling.txt? I am not clear about its functionality. If anyone knows, please reply. I want to index a word = Wear. But while searching I search = Dress. I want to get results for Wear. How do I apply this functionality? Awaiting Reply
Solr -File Based Spell Check
How does the Solr file-based spell check work? How do we need to enter data in spelling.txt? I am not clear about its functionality. If anyone knows, please reply. I want to index a word = Wear, but while searching I search = Dress and I want to get results for Wear. How do I apply this functionality? Awaiting reply.
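Two configuration sketches bear on the questions in this thread. The first wires a hand-maintained dictionary file into the SpellCheckComponent via FileBasedSpellChecker; the file name and paths are illustrative, not taken from the thread. The second addresses the Wear/Dress case, which (as Erick notes) is synonym expansion rather than spell checking:

```xml
<!-- solrconfig.xml: a spellchecker built from a plain word list instead of
     the index. One word per line in spellings.txt; paths are illustrative. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>

<!-- schema.xml: synonym expansion for the Wear/Dress requirement.
     A line "dress, wear" in synonyms.txt makes a search for Dress match Wear. -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
```

Select the file-based dictionary at query time with spellcheck.dictionary=file. As for the auto-generated dictionary: it is itself a Lucene index (that is what the .cfs compound files are), so it is inspected with a low-level index browser such as Luke rather than read directly.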
Re: Spell-Check Component Functionality
If anyone knows articles or blogs on Solr spell-check component configuration, please let me know; the Solr wiki is not helping me solve this maze. On Fri, Nov 19, 2010 at 12:40 PM, rajini maski rajinima...@gmail.com wrote: And if I try http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true&q=Curst the XML output is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">Curst</str>
      <str name="spellcheck.q">Curst</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

No suggestion tags here either.

If I try http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true&q=Crust the XML output is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">Crust</str>
      <str name="spellcheck.q">Curst</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="spell">Crust</str>
    </doc>
  </result>
</response>

No suggestion tags. What is the proper configuration for this? Is there any specific article written on Solr spell check other than the Solr wiki page? I am not getting a clear idea about this component from the wiki. Awaiting replies.
Rajani Maski. On Fri, Nov 19, 2010 at 11:32 AM, rajini maski rajinima...@gmail.com wrote: Hello Peter, thanks for the reply :) I did spellcheck.q=Curst as you said. The query is http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true and I am getting this error :(

HTTP Status 500 - null
java.lang.NullPointerException
    at java.io.StringReader.<init>(Unknown Source)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:197)
    at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:78)
    at org.apache.solr.search.QParser.getQuery(QParser.java:131)
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at ...

What does this error mean? What do I need to do about it? Any mistake in the config? The solrconfig.xml and schema are attached in the mail below, FYI. Please let me know if anyone knows why this error occurs.
Awaiting reply. Rajani Maski. On Thu, Nov 18, 2010 at 8:09 PM, Peter Karich peat...@yahoo.de wrote: Hi Rajani, some notes:
* Try spellcheck.q=curst, or go completely without spellcheck.q and use only q.
* Compared to the normal q parameter, spellcheck.q can have a different analyzer/tokenizer, and it is used if present.
* Do not send spellcheck.build=true with every request (creating the spellcheck index can be very expensive).
* Once you have spellcheck working, embed the spellcheck component into your normal query handler; otherwise you need to query twice.
Regards, Peter. All, I am trying to apply the Solr spell check component functionality to our data. The configuration setup I made by updating solrconfig.xml and schema.xml is as follows; please let me know if there are any errors in it. I am not getting any suggestions in the suggestion tags of the Solr output XML. I indexed the word Crust into the textSpell field that is enabled for spell check and then searched for Curst. The queries I tried
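Peter's last note above can be sketched as a solrconfig.xml fragment; the handler name and parameter values are illustrative, and it assumes a search component named spellcheck is already defined:

```xml
<!-- Attach the spellcheck component to the normal search handler so a single
     request returns both search results and spelling suggestions -->
<requestHandler name="/select" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

A request to /select must still carry a q parameter: the handler's query component always parses q, which is consistent with the NullPointerException stack trace above (QueryComponent.prepare failing when only spellcheck.q was sent).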
Spell-Check Component Functionality
All, I am trying to apply the Solr spell check component functionality to our data. The configuration setup I made by updating solrconfig.xml and schema.xml is as follows; please let me know if there are any errors in it. I am not getting any suggestions in the suggestion tags of the Solr output XML. I indexed the word Crust into the textSpell field that is enabled for spell check and then searched for Curst. The queries I tried were:
http://localhost:8909/solr/spell?q=Curst&spellcheck=true&spellcheck.collate=true&spellcheck.build=true&spellcheck.q=true
http://localhost:8909/solr/spell?q=Cruste&spellcheck=true&spellcheck.collate=true&spellcheck.build=true&spellcheck.q=true&spellcheck.dictionary=default

The solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
  <!-- a spellchecker that uses a different distance measure -->
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker2</str>
  </lst>
  <str name="queryAnalyzerFieldType">textSpell</str>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" lazy="true">
  <lst name="defaults">
    <str name="spellcheck.dictionary">default</str>
    <!-- omp = Only More Popular -->
    <str name="spellcheck.onlyMorePopular">false</str>
    <!-- exr = Extended Results -->
    <str name="spellcheck.extendedResults">false</str>
    <!-- The number of suggestions to return -->
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

The schema.xml:

<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="spell" type="textSpell" indexed="true" stored="true"/>

If there is any error above that is preventing spell check, please let me know. The output I am getting is an empty suggestions list: <lst name="suggestions"/>. Regards, Rajani Maski
Re: Spell-Check Component Functionality
Hello Peter, thanks for the reply :) I did spellcheck.q=Curst as you said. The query is http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true and I am getting this error :(

HTTP Status 500 - null
java.lang.NullPointerException
    at java.io.StringReader.<init>(Unknown Source)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:197)
    at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:78)
    at org.apache.solr.search.QParser.getQuery(QParser.java:131)
    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at ...

What does this error mean? What do I need to do about it? Any mistake in the config? The solrconfig.xml and schema are attached in the mail below, FYI. Please let me know if anyone knows why this error occurs. Awaiting reply. Rajani Maski. On Thu, Nov 18, 2010 at 8:09 PM, Peter Karich peat...@yahoo.de wrote: Hi Rajani, some notes:
* Try spellcheck.q=curst, or go completely without spellcheck.q and use only q.
* Compared to the normal q parameter, spellcheck.q can have a different analyzer/tokenizer, and it is used if present.
* Do not send spellcheck.build=true with every request (creating the spellcheck index can be very expensive).
* Once you have spellcheck working, embed the spellcheck component into your normal query handler; otherwise you need to query twice.
Regards, Peter.
-- http://jetwick.com twitter search prototype
Re: Spell-Check Component Functionality
And if I try http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true&q=Curst the XML output is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">Curst</str>
      <str name="spellcheck.q">Curst</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

No suggestion tags here either.

If I try http://localhost:8909/solr/select/?spellcheck.q=Curst&version=2.2&start=0&rows=10&indent=on&spellcheck=true&q=Crust the XML output is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">Crust</str>
      <str name="spellcheck.q">Curst</str>
      <str name="rows">10</str>
      <str name="version">2.2</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="spell">Crust</str>
    </doc>
  </result>
</response>

No suggestion tags. What is the proper configuration for this? Is there any specific article written on Solr spell check other than the Solr wiki page? I am not getting a clear idea about this component from the wiki. Awaiting replies. Rajani Maski
Re: Looking for Developers
It would be better if we could make a solr-jobs list; if not, there is a chance this mailing list becomes less about Solr queries and more like a job forum. This mailing list is so useful to all developers for getting answers to their technical queries. On Thu, Oct 28, 2010 at 11:30 PM, Stefan Moises moi...@shoptimax.de wrote: Well, I don't see a problem sending (serious) job offers to this list... as long as nobody spams. Just my 2c, Stefan. Am 28.10.2010 19:57, schrieb Ravi Gidwani: May I suggest a new mailing list like solr-jobs (if it does not exist) or something for such emails? I think it is also important for the Solr developers to get emails about job opportunities, no? ~Ravi. On Tue, Oct 26, 2010 at 11:42 PM, Pradeep Singh pksing...@gmail.com wrote: This is the second time he has sent this. Kill his subscription. Is it possible? On Tue, Oct 26, 2010 at 10:38 PM, Yuchen Wang yuc...@trulia.com wrote: UNSUBSCRIBE On Tue, Oct 26, 2010 at 10:15 PM, Igor Chudov ichu...@gmail.com wrote: UNSUBSCRIBE On Wed, Oct 27, 2010 at 12:14 AM, ST ST stst2...@gmail.com wrote: Looking for developers experienced in Solr/Lucene and/or FAST search engines from India (Pune). We are looking for off-shore, India-based developers who are proficient in Solr/Lucene and/or the FAST search engine. Developers in the cities of Pune/Bombay in India are preferred. Development is for projects based in the US for a reputed firm. If you are proficient in Solr/Lucene/FAST and have 5 years minimum industry experience with at least 3 years in search development, please send me your resume. Thanks -- *** Stefan Moises Senior Softwareentwickler shoptimax GmbH Guntherstraße 45 a 90461 Nürnberg Amtsgericht Nürnberg HRB 21703 GF Friedrich Schreieck Tel.: 0911/25566-25 Fax: 0911/25566-29 moi...@shoptimax.de http://www.shoptimax.de ***
Logic behind Solr creating files in .../data/index path.
All, while we post data to Solr, the data gets stored under the /data/index path in multiple files with different file extensions. Not worrying about the extensions, I want to know how this number of files comes about. Does anyone know by what logic these multiple index files are created in the data/index path? If we do an optimize, the number of files gets reduced; otherwise, say, some N number of files are created. Based on what parameter are they created, and how do their sizes vary? Hope I am clear about the doubt I have...
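For background on the question above: each group of files in data/index is a Lucene segment. A new segment is written whenever buffered documents are flushed (typically at commit), and segments are periodically merged; an optimize collapses them into one, which is why the file count drops. A sketch of the solrconfig.xml knobs involved, with illustrative values:

```xml
<!-- Segment creation and merging (values illustrative):
     a larger RAM buffer means fewer, larger segments per flush;
     mergeFactor controls how many same-sized segments may accumulate
     before they are merged into a bigger one -->
<indexDefaults>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```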
Re: stream.url problem
If the connector port number on your localhost is the same as on the other system, then this error is probable. You can change the port number in server.xml on your system or the other system to make them different. If they are already different, the other probability is whether remote access is enabled or not. Rajani Maski. 2010/8/17 Tim Terlegård tim.terleg...@gmail.com: hi all, I am indexing the documents to Solr that are on my system. Now I need to index the files that are on a remote system. I enabled remote streaming (set it to true) in solrconfig.xml, and when I use stream.url it shows the error connection refused. The detail of the error: I sent the request in my browser as http://localhost:8080/solr/update/extract?stream.url=http://remotehost/home/san/Desktop/programming_erlang_armstrong.pdf&literal.id=schb2 ... You probably use the wrong port. Try 8983 instead. /Tim
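The remote-streaming switch mentioned above lives in solrconfig.xml on the request dispatcher; a minimal sketch (the upload-limit value is illustrative):

```xml
<!-- solrconfig.xml: stream.url and stream.file only work when remote
     streaming is enabled on the request dispatcher -->
<requestDispatcher handleSelect="true">
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048"/>
</requestDispatcher>
```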
Re: OutOfMemoryErrors
I am getting it while indexing data into Solr, not while querying. Though I have enough memory space, up to 40GB, and my indexing data is just 5-6 GB, that particular error is still observed now and then (SEVERE ERROR: JAVA HEAP SPACE, OUT OF MEMORY ERROR). I could see one lock file generated in the data/index path just after this error. On Tue, Aug 17, 2010 at 4:49 PM, Peter Karich peat...@yahoo.de wrote: Is there a way to verify that I have added correctly? On linux you can do ps -elf | grep Boot and see if the java command has the parameters added. @all: why and when do you get those OOMs? While querying? Which queries in detail? Regards, Peter.
Re: OutOfMemoryErrors
<mergeFactor>100</mergeFactor>
JVM initial memory pool - 256MB, maximum memory pool - 1024MB.

<add>
  <doc>
    <field>long:ID</field>
    <field>str:Body</field>
    ... 12 fields ...
  </doc>
</add>

I have a Solr instance in the solr folder (D:/Solr); free space on the disc is 24.3GB. How will I get to know what portion of memory Solr is using? On Tue, Aug 17, 2010 at 10:11 PM, Erick Erickson erickerick...@gmail.com wrote: You shouldn't be getting this error at all unless you're doing something out of the ordinary. So it'd help if you told us: what parameters you have set for merging, what parameters you have set for the JVM, and what kind of documents you are indexing. The memory you have is irrelevant if you only allocate a small portion of it for the running process... Best, Erick.
Re: Solr-HOW TO HANDLE THE LOCK FILE CREATION WHILE INDEXING AND OPERATION TIMED OUT WEB EXCEPTION ERROR
Yes, it is a networked kind, and on Windows. The Solr version is Solr 1.4.0, with Tomcat 6. The exception is a System.Net.WebException: operation has timed out; HttpRequest.GetResponse failed. For the web exception error, do I need to change the ramBufferSizeMB parameter and the merge factor parameters in solrconfig.xml? And for the lock file, is there any setting I need to make? Why and how does it get generated? If you know, please explain it; I am not able to understand it. Thanks a lot for the reply. Regards, Rajani Maski. On Tue, Aug 17, 2010 at 9:41 PM, Erick Erickson erickerick...@gmail.com wrote: It would help a lot if you included the stack trace of the exception; perhaps it'll be in your SOLR logs. Also, what is your environment? Are you using any kind of networked drive for your index? Windows? What version of SOLR are you using? Anything else you think would be useful. Best, Erick. On Tue, Aug 17, 2010 at 12:10 AM, rajini maski rajinima...@gmail.com wrote: Hello everyone, please help me understand the logic behind this lock file generation while indexing data in Solr. The trouble I am facing is as follows: the data I indexed is in the millions. At the initial level of indexing I find no errors until it crosses about 10 lakh (1,000,000) documents, but once it crosses this limit it throws a web exception error, operation timed out, and simultaneously a kind of LOCK file is generated in the /data/index folder. I found in one thread (http://www.mail-archive.com/solr-user@lucene.apache.org/msg06782.html) that it can be fixed by making some changes in Solr's config XML and also by increasing the Java memory space in Tomcat, and I did that. Still the issue is not solved, and I couldn't find any root cause for this error. Please, whoever knows the logic behind these two issues, i.e. 1) the web exception error *operation timed out*, and 2) *why lock files are created and how they actually work*, awaiting replies. Regards, Rajani Maski
Re: OutOfMemoryErrors
Yeah, sorry, I forgot to mention the others:
<mergeFactor>100</mergeFactor>
<maxBufferedDocs>1000</maxBufferedDocs>
<maxMergeDocs>10</maxMergeDocs>
<maxFieldLength>1</maxFieldLength>
Those are the values. Is this because of the values here? Initially I had the mergeFactor parameter at 10 and maxMergeDocs at 1; with the same error I changed them to the above values, yet I got the error after the index was about 2 lakh (200,000) docs. On Tue, Aug 17, 2010 at 11:04 PM, Erick Erickson erickerick...@gmail.com wrote: There are more merge parameters; what values do you have for these:
<mergeFactor>10</mergeFactor>
<maxBufferedDocs>1000</maxBufferedDocs>
<maxMergeDocs>2147483647</maxMergeDocs>
<maxFieldLength>1</maxFieldLength>
See http://wiki.apache.org/solr/SolrConfigXml. Hope that formatting comes through the various mail programs OK. Also, what else happens while you're indexing? Do you search while indexing? How often do you commit your changes?
Re: OutOfMemoryErrors
Yeah, fine, I will do that. Before, the mergeFactor was 10 itself; after finding this error I just set its value higher, assuming that could fix the error. I will change it back. The ramBufferSizeMB is 256MB; do I need to change this value to something higher? On Wed, Aug 18, 2010 at 12:27 AM, Jay Hill jayallenh...@gmail.com wrote: A merge factor of 100 is very high and out of the norm. Try starting with a value of 10. I've never seen a running system with a value anywhere near this high. Also, what is your setting for ramBufferSizeMB? -Jay
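The advice in this thread can be summarized as a solrconfig.xml sketch; the values follow Jay's and Erick's suggestions plus the stock Solr example, and are starting points rather than prescriptions:

```xml
<!-- Conservative indexing settings for the OOM scenario above:
     mergeFactor back at the default 10 with a bounded RAM buffer.
     Heap size is set on the JVM itself (e.g. -Xmx1024m), not here. -->
<indexDefaults>
  <mergeFactor>10</mergeFactor>
  <ramBufferSizeMB>256</ramBufferSizeMB>
  <maxFieldLength>10000</maxFieldLength>
</indexDefaults>
```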
Solr-HOW TO HANDLE THE LOCK FILE CREATION WHILE INDEXING AND OPERATION TIMED OUT WEB EXCEPTION ERROR
Hello everyone, please help me understand the logic behind this lock file generation while indexing data in Solr. The trouble I am facing is as follows: the data I indexed is in the millions. At the initial level of indexing I find no errors until it crosses about 10 lakh (1,000,000) documents, but once it crosses this limit it throws a web exception error, operation timed out, and simultaneously a kind of LOCK file is generated in the /data/index folder. I found in one thread (http://www.mail-archive.com/solr-user@lucene.apache.org/msg06782.html) that it can be fixed by making some changes in Solr's config XML and also by increasing the Java memory space in Tomcat, and I did that. Still the issue is not solved, and I couldn't find any root cause for this error. Please, whoever knows the logic behind these two issues, i.e. 1) the web exception error *operation timed out*, and 2) *why lock files are created and how they actually work*, awaiting replies. Regards, Rajani Maski
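On the lock-file half of the question: Lucene writes a lock file in the index directory while an IndexWriter holds the index open and removes it on a clean close, so one left behind after a crash or timeout is stale. The relevant solrconfig.xml settings in Solr 1.4, with illustrative values:

```xml
<!-- Index locking (Solr 1.4). The lock file in data/index is normal while
     a writer is open; unlockOnStartup clears a stale lock left by a crash.
     Use it with care: only one writer may ever hold the index. -->
<mainIndex>
  <lockType>native</lockType>               <!-- OS-level lock; 'simple' uses a plain lock file -->
  <writeLockTimeout>1000</writeLockTimeout> <!-- ms to wait for the write lock -->
  <unlockOnStartup>false</unlockOnStartup>  <!-- set true once to recover from a stale lock -->
</mainIndex>
```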
Re: OutOfMemoryErrors
Hello there, even I am facing the same errors. @Grijesh, where exactly do I need to make these changes to increase the JVM heap space? I mean, where do I specify them? I made changes in the Tomcat config, the Java (JVM) initial memory pool and maximum memory pool, to 256-1024MB, yet the error persists at the same frequency :( On Tue, Aug 17, 2010 at 10:42 AM, Grijesh.singh pintu.grij...@gmail.com wrote: Increase your JVM heap space by using params -Xms1024m -Xmx4096m, like this. -- View this message in context: http://lucene.472066.n3.nabble.com/OutOfMemoryErrors-tp1181731p1181892.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problems running on tomcat
I have observed this error when there is a mistake in the indexed fields, i.e. a field used while indexing but undefined in the schema; then this error is thrown. You can check for that missing field in your Catalina logs; there it will be written as unknown field. Regards, Rajani Maski. On Tue, Aug 3, 2010 at 2:09 AM, Claudio Devecchi cdevec...@gmail.com wrote: Hi Ahmet, it works with tomcat6. Tks! On Mon, Aug 2, 2010 at 3:04 PM, Claudio Devecchi cdevec...@gmail.com wrote: Hi Ahmet, I'm using tomcat7 with solr 1.4.1 =\ If it worked fine for you on tomcat6 I'll try with the same version... Tks for helping. On Mon, Aug 2, 2010 at 2:30 PM, Ahmet Arslan iori...@yahoo.com wrote: What version of solr and tomcat? I think I saw the same problem with the tomcat 7 and solr 1.4.1 combination, that's why I am asking. I just tried to replicate this problem with tomcat 6 and solr 1.4.1, but everything went fine. --- On Fri, 7/30/10, Claudio Devecchi cdevec...@gmail.com wrote: From: Claudio Devecchi cdevec...@gmail.com Subject: Problems running on tomcat To: solr-user@lucene.apache.org Date: Friday, July 30, 2010, 10:17 PM Hi, I'm new with solr and I'm doing my first installation under tomcat. I followed the documentation at http://wiki.apache.org/solr/SolrTomcat#Installing_Tomcat_6 but there are some problems. http://localhost:8080/solr/admin works fine, but in some cases, for example viewing my schema.xml from the admin console, the error below happens: HTTP Status 404 - /solr/admin/file/index.jsp. Has somebody already seen this? Is there some trick to do? Tks -- Claudio Devecchi flickr.com/cdevecchi
Re: logic required for newbie
Yes, the above solution would help :) You can specify it like http://localhost:8090/solr/select?indent=on&start=0&rows=10&q=landmark:landmark4&fl=landmark,user_id This will give you, for each result, only the landmark field and user_id. And in the Solr console, under the Full Interface option, you can try out the usage of highlighting... Regards, Rajani Maski On Thu, Jul 29, 2010 at 1:01 PM, Bastian Spitzer bspit...@magix.net wrote: You can't, really. By searching you will always find _documents_, and Solr will return all their stored fields unless you specify which exact stored fields you want Solr to return by passing the fl= parameter with your query. The only approach I can think of is (mis)using highlighting: search for highlighted text in the landmarkX fields and then remove the fields that don't contain matches. Just add hl=true&hl.fl=landmark1,landmark2,landmark3 etc. to your query; then you will find a highlighting section in your response. Hope that helps. -----Original Message----- From: Jonty Rhods [mailto:jonty.rh...@gmail.com] Sent: Thursday, 29 July 2010 08:20 To: solr-user@lucene.apache.org Subject: Re: logic required for newbie Again thanks for the reply. Actually I am getting results, but I am getting all columns of the rows, and I want to remove the unnecessary columns. In the case of q=piza hut, I want to get only <landmark4>piza hut</landmark4>. Likewise, if the search query changes to ford motor, then I want only <landmark5>ford motor</landmark5>. One more example: if the query is piza hut ford motor, then the expected result should be: <id>1</id> <name>some name</name> <user_id>user_id</user_id> <location>new york</location> <country>USA</country> <landmark4>piza hut</landmark4> <landmark5>ford motor</landmark5> In the expected result above, <landmark1>5th avenue</landmark1>, <landmark2>ms departmental store</landmark2> and <landmark3>base bakery</landmark3> have been removed because they do not carry any matched text. In more general terms, I want to filter out every unmatched column which does not carry the matched query.
Right now I am getting the proper result but the full column set. My requirement is that only the matching landmark should be returned, so I want to filter down to the columns which carry text matching the query. Hoping someone will help me clear up my concept.. regards On Thu, Jul 29, 2010 at 9:41 AM, rajini maski rajinima...@gmail.com wrote: First of all, I hope that in the schema you have marked the fields indexed=true and stored=true... If you have done so, then just search as q=landmark:piza... and you will get only one result set. Note: there is one constraint about applying analyzers and tokenizers. If you apply the whitespace tokenizer, that is, the data type text_ws, only then will you get the "piza hut" result set even when you query just for piza... If no tokenizer is applied, you will not get it... I hope this was the needed reply; if it is something else, you can easily ask ;) On Wed, Jul 28, 2010 at 8:42 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi, thanks for the reply. Actually the requirement is different (sorry if I was unable to clarify it in the first mail). Basically the following are the field names in the schema as well: 1. id 2. name 3. user_id 4. location 5. country 6. landmark1 7. landmark2 8. landmark3 9. landmark4 10. landmark5 which carry text... for example: <id>1</id> <name>some name</name> <user_id>user_id</user_id> <location>new york</location> <country>USA</country> <landmark1>5th avenue</landmark1> <landmark2>ms departmental store</landmark2> <landmark3>base bakery</landmark3> <landmark4>piza hut</landmark4> <landmark5>ford motor</landmark5> Now if a user searches for piza, then the expected result is: <id>1</id> <name>some name</name> <user_id>user_id</user_id> <location>new york</location> <country>USA</country> <landmark4>piza hut</landmark4> It means I want to ignore every other landmark which does not match. With a filter we can filter the fields, but here I don't know the field name because it depends on the text match. Is there any other solution? I am ready to change the schema or the logic. I am using SolrJ. Please help me, I am stuck here..
with regards On Wed, Jul 28, 2010 at 7:22 PM, rajini maski rajinima...@gmail.com wrote: You can index each of these fields separately... field1 - id, field2 - name, field3 - user_id, field4 - country, ... field7 - landmark. While querying you can specify q=Landmark9. This will return you results. And if you want only particular fields in the output, use the fl parameter in the query, like http://localhost:8090/solr/select?indent=on&q=landmark9&fl=ID,user_id,country,landmark This will give you the desired solution. On Wed, Jul 28, 2010 at 12:23 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi All, I am very new and learning Solr
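Bastian's highlighting trick above has to be finished on the client side: query with hl=true&hl.fl=landmark1,landmark2,landmark3,landmark4,landmark5, then drop the landmark fields for which Solr returned no highlight snippet. A minimal client-side sketch of that post-processing step in Python; the `doc` and `hl` dicts below are hand-made to mimic the shape of a Solr JSON response for this thread's example, and the helper name is made up for illustration:

```python
LANDMARK_FIELDS = ["landmark1", "landmark2", "landmark3", "landmark4", "landmark5"]

def keep_matched_landmarks(doc, highlights):
    """Return a copy of `doc` keeping non-landmark fields as-is, plus
    only those landmark fields the highlighter actually matched."""
    matched = {f for f in LANDMARK_FIELDS if f in highlights}
    return {k: v for k, v in doc.items()
            if k not in LANDMARK_FIELDS or k in matched}

# Document and highlighting section as they might come back for q=piza
doc = {"id": "1", "name": "some name", "location": "new york",
       "country": "USA", "landmark1": "5th avenue",
       "landmark4": "piza hut", "landmark5": "ford motor"}
hl = {"landmark4": ["<em>piza</em> hut"]}  # only landmark4 matched

print(keep_matched_landmarks(doc, hl))
```

With SolrJ the same idea applies: read the highlighting map from the QueryResponse and remove the landmark entries of each SolrDocument that have no highlight entry before passing the result on.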
Re: logic required for newbie
You can index each of these fields separately... field1 - id, field2 - name, field3 - user_id, field4 - country, ... field7 - landmark. While querying you can specify q=Landmark9. This will return you results. And if you want only particular fields in the output, use the fl parameter in the query, like http://localhost:8090/solr/select?indent=on&q=landmark9&fl=ID,user_id,country,landmark This will give you the desired solution. On Wed, Jul 28, 2010 at 12:23 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi All, I am very new and learning Solr. I have 10 columns like the following in a table: 1. id 2. name 3. user_id 4. location 5. country 6. landmark1 7. landmark2 8. landmark3 9. landmark4 10. landmark5 When a user searches for a landmark, I want to return only the one landmark which matches; the rest of the landmarks should be ignored. The expected result is like the following if the user searches by landmark2: 1. id 2. name 3. user_id 4. location 5. country 7. landmark2 Or if they search by landmark9: 1. id 2. name 3. user_id 4. location 5. country 9. landmark9 Please help me design the schema for this kind of requirement... thanks with regards
Re: logic required for newbie
First of all, I hope that in the schema you have marked the fields indexed=true and stored=true... If you have done so, then just search as q=landmark:piza... and you will get only one result set. Note: there is one constraint about applying analyzers and tokenizers. If you apply the whitespace tokenizer, that is, the data type text_ws, only then will you get the "piza hut" result set even when you query just for piza... If no tokenizer is applied, you will not get it... I hope this was the needed reply; if it is something else, you can easily ask ;) On Wed, Jul 28, 2010 at 8:42 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi, thanks for the reply. Actually the requirement is different (sorry if I was unable to clarify it in the first mail). Basically the following are the field names in the schema as well: 1. id 2. name 3. user_id 4. location 5. country 6. landmark1 7. landmark2 8. landmark3 9. landmark4 10. landmark5 which carry text... for example: <id>1</id> <name>some name</name> <user_id>user_id</user_id> <location>new york</location> <country>USA</country> <landmark1>5th avenue</landmark1> <landmark2>ms departmental store</landmark2> <landmark3>base bakery</landmark3> <landmark4>piza hut</landmark4> <landmark5>ford motor</landmark5> Now if a user searches for piza, then the expected result is: <id>1</id> <name>some name</name> <user_id>user_id</user_id> <location>new york</location> <country>USA</country> <landmark4>piza hut</landmark4> It means I want to ignore every other landmark which does not match. With a filter we can filter the fields, but here I don't know the field name because it depends on the text match. Is there any other solution? I am ready to change the schema or the logic. I am using SolrJ. Please help me, I am stuck here..
with regards On Wed, Jul 28, 2010 at 7:22 PM, rajini maski rajinima...@gmail.com wrote: You can index each of these fields separately... field1 - id, field2 - name, field3 - user_id, field4 - country, ... field7 - landmark. While querying you can specify q=Landmark9. This will return you results. And if you want only particular fields in the output, use the fl parameter in the query, like http://localhost:8090/solr/select?indent=on&q=landmark9&fl=ID,user_id,country,landmark This will give you the desired solution. On Wed, Jul 28, 2010 at 12:23 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi All, I am very new and learning Solr. I have 10 columns like the following in a table: 1. id 2. name 3. user_id 4. location 5. country 6. landmark1 7. landmark2 8. landmark3 9. landmark4 10. landmark5 When a user searches for a landmark, I want to return only the one landmark which matches; the rest of the landmarks should be ignored. The expected result is like the following if the user searches by landmark2: 1. id 2. name 3. user_id 4. location 5. country 7. landmark2 Or if they search by landmark9: 1. id 2. name 3. user_id 4. location 5. country 9. landmark9 Please help me design the schema for this kind of requirement... thanks with regards
Re: SolrJ Response + JSON
Yeah, right... This query will do it: http://localhost:8090/solr/select/?q=*:*&version=2.2&start=0&rows=10&indent=on&wt=json This will do your work... This is more like using the XSL transformation supported by Solr :) Regards, Rajani Maski On Wed, Jul 28, 2010 at 6:24 PM, Mark Allan mark.al...@ed.ac.uk wrote: I think you should just be able to add wt=json to the end of your query (or change whatever the existing wt parameter is in your URL). Mark On 28 Jul 2010, at 12:54 pm, MitchK wrote: Hello community, I need to transform SolrJ responses into JSON after some computing on those results by another application has finished. I cannot do those computations on the Solr side, so I really have to translate SolrJ's output into JSON. Any experience with how to do so without writing your own JSON writer? Thank you. - Mitch -- View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-Response-JSON-tp1002024p1002024.html Sent from the Solr - User mailing list archive at Nabble.com. -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
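If the computation really must happen in your own code after the response comes back (so asking Solr for wt=json up front is not enough), the usual route is to copy the needed fields out of the response objects into plain maps/lists and serialize those with any standard JSON library. A language-agnostic sketch of that idea in Python with the standard json module; the field names and the score-boost computation are invented placeholders, not anything from this thread:

```python
import json

def to_json(docs, score_boost=1.0):
    """Post-process result docs (here: a hypothetical score boost),
    then serialize the processed shape to a JSON string."""
    processed = [
        {"id": d["id"], "score": d.get("score", 0.0) * score_boost}
        for d in docs
    ]
    return json.dumps({"docs": processed}, indent=2)

# Docs as plain dicts, e.g. copied out of SolrDocument field maps
results = [{"id": "1", "score": 1.5}, {"id": "2"}]
print(to_json(results, score_boost=2.0))
```

In Java the equivalent would be copying each SolrDocument into a Map and handing the list to a JSON library such as Jackson or Gson; that avoids writing a JSON writer by hand.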
Re: Tree Faceting in Solr 1.4
I am also looking for the same feature in Solr and am very keen to know whether it supports this feature of tree faceting, or whether we are forced to index in a tree-facet format like 1/2/3/4, 1/2/3, 1/2, 1. In the case of multilevel faceting, it gives only a 2-level tree facet, is what I found. If I give a query with country India and state Karnataka and city Bangalore, all I want is a facet count 1) for the condition above, 2) the number of states in that country, 3) the number of cities in that state... Like: Country: India, State: Karnataka, City: Bangalore 1 State: Karnataka Kerala Tamilnadu Andhra Pradesh... and so on City: Mysore Hubli Mangalore Coorg and so on... If I do facet=on&facet.field={!ex=State}State&fq={!tag=State}State:Karnataka all it gives me is facets on State excluding only that filter query. But I was not able to do the same at the third level, like facet.field= give me the counts of cities also in the state Karnataka. Let me know the solution for this... Regards, Rajani Maski On Thu, Jul 22, 2010 at 10:13 PM, Eric Grobler impalah...@googlemail.com wrote: Thank you for the link. I was not aware of the multifaceting syntax; this will enable me to run one less query on the main page! However this is not a tree faceting feature. Thanks Eric On Thu, Jul 22, 2010 at 4:51 PM, SR r.steve@gmail.com wrote: Perhaps the following article can help: http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html -S On Jul 22, 2010, at 5:39 PM, Eric Grobler wrote: Hi Solr Community, If I have: COUNTRY CITY Germany Berlin Germany Hamburg Spain Madrid Can I do faceting like: Germany Berlin Hamburg Spain Madrid I tried to apply SOLR-792 to the current trunk but it does not seem to be compatible. Maybe there is a similar feature existing in the latest builds? Thanks Regards Eric
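The "index in tree format like 1/2/3" idea mentioned above usually means emitting one depth-prefixed token per ancestor path into a multivalued string field, then faceting with facet.prefix at the level you want. A small sketch of the token generation on the indexing client in Python; the depth/path encoding is a common community convention for hierarchical facets, not a built-in Solr feature, and the field name in the comment is hypothetical:

```python
def path_tokens(parts):
    """Encode ["India", "Karnataka", "Bangalore"] as depth-prefixed
    facet tokens: one token per ancestor path, deepest last."""
    return ["%d/%s" % (d, "/".join(parts[:d + 1])) for d in range(len(parts))]

# Index these into a multivalued string field (say location_hier);
# then "cities in Karnataka" becomes a query with
#   facet.field=location_hier&facet.prefix=2/India/Karnataka/
print(path_tokens(["India", "Karnataka", "Bangalore"]))
```

Each facet request with a deeper prefix drills one level further down, which gives the country/state/city counts asked for above in one query per level.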