Re: Searching for string having apostrophe
Hi, Apostrophes are not among the query parser's special characters, so you don't need to escape them. Can you give some examples?

On Tuesday, June 17, 2014 8:35 AM, Gaurav Deshpande imgaur...@yahoo.in wrote: Hi, I want to perform name searches in Solr on string and text datatypes, but the names contain apostrophes. Is there a way I can escape these apostrophes and perform the searches? Using '\' before an apostrophe results in forbidden access due to cross-site scripting protection. Any help or pointers regarding this would be appreciated. Thanks, Gaurav
Re: Warning message logs on startup after upgrading to 4.8.1
Hi, I think it would attract more attention if the title mentioned 'managed resource warn logs' or something like that. Ahmet

On Tuesday, June 17, 2014 8:12 AM, Marius Dumitru Florea mariusdumitru.flo...@xwiki.com wrote:

On Thu, Jun 12, 2014 at 11:16 AM, Marius Dumitru Florea mariusdumitru.flo...@xwiki.com wrote: Hi guys, After I upgraded to Solr 4.8.1 I got a few warning messages in the log at startup:

WARN o.a.s.c.SolrResourceLoader - Solr loaded a deprecated plugin/analysis class [solr.ThaiWordFilterFactory]. Please consult documentation how to replace it accordingly.

I fixed this with https://github.com/xwiki/xwiki-platform/commit/d41580c383f40d2aa4e4f551971418536a3f3a20#diff-44d79e64e45f3b05115aebcd714bd897L1159

WARN o.a.s.r.ManagedResource - No stored data found for /schema/analysis/stopwords/english
WARN o.a.s.r.ManagedResource - No stored data found for /schema/analysis/synonyms/english

I fixed these by commenting out the managed_en field type in my schema, see https://github.com/xwiki/xwiki-platform/commit/d41580c383f40d2aa4e4f551971418536a3f3a20#diff-44d79e64e45f3b05115aebcd714bd897L486

And now I'm left with:

WARN o.a.s.r.ManagedResource - No stored data found for /rest/managed
WARN o.a.s.r.ManagedResource - No registered observers for /rest/managed

Does nobody else get these warning messages in the logs? If there are migration/upgrade notes I haven't read, a link would be very helpful. Thanks, Marius

How can I get rid of these two? This JIRA issue looks related: https://issues.apache.org/jira/browse/SOLR-6128 . Thanks, Marius
Finding Relevant Results
Hi, I developed an application to show suggestions. Whenever I search for a query, if an exact match is found in Solr it is shown. But if there is no exact match in Solr, it shows irrelevant results. My requirement is to show relevant results to the user when no exact match is found. Also, if the query length exceeds 25 characters, irrelevant results are shown. For unmatched queries I am using a multi-request handler. For example, if the query is "a b c d", I split it into x1=a+b+c&x2=d and query Solr based on the x1 and x2 parameters. Can anybody tell me how I can show relevant results when no exact match is found? Thanks, Kumar -- View this message in context: http://lucene.472066.n3.nabble.com/Finding-Relevant-Results-tp4142221.html Sent from the Solr - User mailing list archive at Nabble.com.
Highlighting search result without using solrnet code with SOLR 4.1
Hi, I want to highlight search results without using the highlighting parameters provided by SolrNet. The following is my configuration for the highlighting parameters.

Here is my schema.xml:

<field name="guid" type="text_en" indexed="true" stored="true"/>
<field name="title" type="text_en" indexed="true" stored="true"/>
<field name="link" type="text_en" indexed="true" stored="true"/>
<field name="fulltext" type="text_en" indexed="true" stored="true"/>
<field name="scope" type="text_en" indexed="true" stored="true"/>

Following is the configuration in solrconfig.xml:

<str name="hl">on</str>
<str name="hl.fl">fulltext title</str>
<str name="hl.encoder">html</str>
<str name="hl.fragListBuilder">simple</str>
<str name="hl.simple.pre">&lt;em&gt;</str>
<str name="hl.simple.post">&lt;/em&gt;</str>
<str name="f.title.hl.fragsize">0</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.name.hl.fragsize">0</str>
<str name="f.name.hl.alternateField">title</str>
<str name="f.content.hl.snippets">20</str>
<str name="f.content.hl.fragsize">2000</str>
<str name="f.content.hl.alternateField">fulltext</str>
<str name="f.content.hl.maxAlternateFieldLength">2000</str>
<str name="hl.fragmenter">regex</str>

When I search for fulltext:What rules Apply, it gives me the following highlighting response, which is correct:

<lst name="highlighting">
  <lst name="E836D2CC-76EF-4EC2-AD00-00015074537E">
    <arr name="fulltext">
      <str>3538. <em>What</em> <em>rules</em> <em>apply</em> to correction of errors in nonqualified deferred compensation plans</str>
    </arr>
  </lst>
  <lst name="63DA3DDB-AAF1-435B-8AA4-00BB60F596A2">
    <arr name="fulltext">
      <str>3723. What is a Section 1042 election? <em>What</em> <em>rules</em> <em>apply</em> to qualified sales to an ESOP</str>
    </arr>
  </lst>
</lst>

I want to show these highlighted results in the application. I am using C# and do not want to use the SolrNet DLL. Is it possible to show the highlighting without it? Please do the needful.
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-search-result-without-using-solrnet-code-with-SOLR-4-1-tp414.html Sent from the Solr - User mailing list archive at Nabble.com.
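Since the snippets come back as plain XML, they can be extracted with any standard XML parser and no Solr client library at all; the same DOM approach works from C# with System.Xml. A minimal Java sketch for concreteness (the class name and sample input are hypothetical; it assumes the response shape shown above, where each snippet is a <str> element):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Sketch: pull highlight snippets out of a Solr XML response with a
// plain DOM parser. Not SolrNet or SolrJ; just the stdlib.
public class HighlightExtractor {
    public static List<String> snippets(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            // Each snippet in the highlighting section is a <str> element.
            NodeList strs = doc.getElementsByTagName("str");
            for (int i = 0; i < strs.getLength(); i++) {
                // getTextContent() flattens the <em> markers into plain text;
                // walk the child nodes instead if you need to keep the tags.
                out.add(strs.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<lst name=\"highlighting\"><lst name=\"id1\">"
                + "<arr name=\"fulltext\"><str><em>What</em> rules apply</str></arr>"
                + "</lst></lst>";
        System.out.println(snippets(xml)); // [What rules apply]
    }
}
```

In an application you would fetch the response with an ordinary HTTP GET against /select?...&hl=true and feed the body to a routine like this.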
Re: Warning message logs on startup after upgrading to 4.8.1
I confess we had upgraded to 4.8.1 and totally missed these warnings! I'm guessing they might be related to the ManagedIndexSchemaFactory stuff, which is commented out in the example configs. We don't use any of the REST stuff ourselves, so I can't comment any further. I think you are OK as long as they are just warnings; hopefully a subsequent release will tidy up the config to make them go away, but if they caused any real problems, I think we'd know about it by now.

On 17 June 2014 07:08, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi, I think it would attract more attention if the title mentioned 'managed resource warn logs' or something like that. Ahmet
Re: Warning message logs on startup after upgrading to 4.8.1
On Tue, Jun 17, 2014 at 9:08 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi, I think it would attract more attention if the title mentioned 'managed resource warn logs' or something like that.

Too late for that, unless you suggest opening a new thread :) Thanks for the tip, Marius
Re: Warning message logs on startup after upgrading to 4.8.1
On Tue, Jun 17, 2014 at 10:37 AM, Daniel Collins danwcoll...@gmail.com wrote: I think you are OK as long as they are just warnings; hopefully a subsequent release will tidy up the config to make them go away, but if they caused any real problems, I think we'd know about it by now.

Indeed, I haven't noticed any problems. Still, these warning messages are a bit annoying. My main concern is that the next time there is a problem with the product I'm working on (see http://jira.xwiki.org/browse/XWIKI-10379 ), someone may lose time digging in the wrong place because these warning messages are present in the logs. Thanks, Marius
Document security filtering in distributed solr (with multi shard)
Dears, Hi, I am going to apply custom security filtering per document, per user (using a custom profile for each user). I was thinking of adding user fields to the index and using a Solr join for filtering, but it seems that for distributed Solr this is not a solution. Could you please tell me what the solution would be in this case? Best regards. -- A.Nazemian
Re: Document security filtering in distributed solr (with multi shard)
Have you looked at Post Filters? I think this was one of the use cases. An old article: http://java.dzone.com/articles/custom-security-filtering-solr . A Google search should bring up a couple more. Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Tue, Jun 17, 2014 at 6:24 PM, Ali Nazemian alinazem...@gmail.com wrote: I am going to apply custom security filtering for each document per each user (using a custom profile for each user).
RE: group.ngroups is set to an incorrect value - specific field types
Hi all, Could anyone comment on my bug report? Regards, Ebisawa

-----Original Message----- From: 海老澤 志信 Sent: Friday, June 13, 2014 7:45 PM To: 'solr-user@lucene.apache.org' Subject: group.ngroups is set to an incorrect value - specific field types

Hi, I'm using Solr version 4.1. I found a bug in group.ngroups, so could anyone kindly take a look at my bug report? If I specify a Double-typed field as group.field, group.ngroups is set to an incorrect value.

[Condition]
- A Double field is specified as group.field
- Some documents do not have the field that is used as group.field

[Sample query and example]
---
solr/select?q=*:*&group=true&group.ngroups=true&group.field=Double_Field
* Double_Field is defined with the solr.TrieDoubleField type.
---
When 4 documents have the group.field and 6 documents do not, the query returns group.ngroups=10. But I think group.ngroups should rightly be 5 in this case.

[Root cause]
There seems to be a bug in the Lucene source code. The function that compares whether two groups contain the same group.field value, MutableValueDouble.compareSameType(), contains the point that seems to be the root cause:
-
if (!exists) return -1;
if (!b.exists) return 1;
-
If exists is false, it returns -1. But I think it should return 0 when exists and b.exists are equal.

[Similar problem]
There is a similar problem in MutableValueBool.compareSameType(). As a consequence, when you group on a field of type Boolean (solr.BoolField), the value of group.ngroups is always 0 or 1.

[Solution]
I propose the following modification to MutableValueDouble.compareSameType():
===
--- MutableValueDouble.java
+++ MutableValueDouble.java
@@ -54,9 +54,8 @@
     MutableValueDouble b = (MutableValueDouble)other;
     int c = Double.compare(value, b.value);
     if (c != 0) return c;
-    if (!exists) return -1;
-    if (!b.exists) return 1;
-    return 0;
+    if (exists == b.exists) return 0;
+    return exists ? 1 : -1;
   }
===
And the following modification to MutableValueBool.compareSameType():
===
--- MutableValueBool.java
+++ MutableValueBool.java
@@ -52,7 +52,7 @@
   @Override
   public int compareSameType(Object other) {
     MutableValueBool b = (MutableValueBool)other;
-    if (value != b.value) return value ? 1 : 0;
+    if (value != b.value) return value ? 1 : -1;
     if (exists == b.exists) return 0;
     return exists ? 1 : -1;
   }
===
Thanks, Ebisawa
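The contract the buggy code violates is the antisymmetry every comparator must satisfy: sgn(compare(a, b)) == -sgn(compare(b, a)), and equal inputs must compare as 0. A minimal stand-in (not the real Lucene class, just the two versions of the comparison logic) shows why documents missing the group field can never be grouped together before the patch:

```java
// Stand-in for MutableValueDouble.compareSameType() before and after the
// proposed patch. "exists" is false for documents missing the field.
public class CompareSketch {
    public static int buggy(double v1, boolean e1, double v2, boolean e2) {
        int c = Double.compare(v1, v2);
        if (c != 0) return c;
        if (!e1) return -1;  // two missing values compare as -1...
        if (!e2) return 1;   // ...from BOTH sides, so they never compare equal
        return 0;
    }

    public static int patched(double v1, boolean e1, double v2, boolean e2) {
        int c = Double.compare(v1, v2);
        if (c != 0) return c;
        if (e1 == e2) return 0;  // both missing (or both present): same group
        return e1 ? 1 : -1;      // a present value sorts after a missing one
    }

    public static void main(String[] args) {
        // Two documents that both lack the group field:
        System.out.println(buggy(0, false, 0, false));   // -1: never equal, so
                                                         // each missing doc counts
                                                         // as its own group
        System.out.println(patched(0, false, 0, false)); // 0: one shared group
    }
}
```

With the buggy version, each of the 6 field-less documents forms its own "group" (6 + 4 = 10); with the patch they collapse into one (1 + 4 = 5), matching the expected count from the report.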
Re: Searching for string having apostrophe
On 6/16/2014 11:34 PM, Gaurav Deshpande wrote: I want to perform name searches in Solr on String and text datatypes but names contain apostrophes in it. Is there a way I can escape these apostrophes and perform searches ? Using '\' before apostrophe results in forbidden access due to cross site scripting attacks. Any help or pointers regarding this would be appreciated. If I start up the Solr example from branch_4x and send the following query: text:\' Solr will not complain at all. I would bet that you have a proxy or a load balancer in front of Solr that is treating your input as a possible attack and denying it. You would need to look into the configuration of the proxy or load balancer to fix this. It is entirely possible that you have a proxy on your network that you are not aware of. Sometimes large companies will send *all* user-generated network traffic through a proxy to check for viruses, trojans, and illegal/objectionable network usage. Thanks, Shawn
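For reference, SolrJ ships a helper for client-side escaping, ClientUtils.escapeQueryChars. A rough sketch of its behavior (the character set is paraphrased from the 4.x source; verify against your SolrJ version) also illustrates Ahmet's point: the apostrophe is not in the special set, so it passes through untouched.

```java
// Sketch of client-side query escaping, modeled on SolrJ's
// ClientUtils.escapeQueryChars. The character set below is an
// approximation of the 4.x behavior, not the authoritative list.
public class QueryEscapeSketch {
    // Characters the Lucene/Solr query parser treats as special.
    // Note: no apostrophe in this set.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("O'Brien"));      // O'Brien (unchanged)
        System.out.println(escape("name:O'Brien")); // name\:O'Brien
    }
}
```

So if a literal name like O'Brien is being rejected, the rejection is happening before the query reaches Solr, consistent with the proxy/load-balancer explanation above.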
Re: Document security filtering in distributed solr (with multi shard)
Dear Alexandre, Yeah, I saw that, but what is the best way of doing it from a performance point of view? I have thought of one solution myself: suppose we have an RDBMS of users that contains the category and group of each user (it could be hierarchical), and suppose there is a field named "security" in the Solr index that contains the list of groups or categories that apply to each document. The query would then filter to only the documents whose category or group matches that user's. Does this solution work in a distributed setup? What if we are concerned about performance? Also, I was wondering how LucidWorks does this. Best regards.

On Tue, Jun 17, 2014 at 4:08 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Have you looked at Post Filters? I think this was one of the use cases. An old article: http://java.dzone.com/articles/custom-security-filtering-solr .

-- A.Nazemian
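The index-a-security-field approach sketched above amounts to attaching a filter query built from the user's groups. A minimal sketch (class name hypothetical; the "security" field name and group values come from the message above, and group values are assumed to be simple tokens):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: turn a user's group memberships into a Solr filter query
// against a multi-valued "security" field on each document. With
// SolrJ you would attach the result via SolrQuery.addFilterQuery(fq).
public class SecurityFilter {
    public static String toFilterQuery(List<String> groups) {
        StringBuilder sb = new StringBuilder("security:(");
        for (int i = 0; i < groups.size(); i++) {
            if (i > 0) sb.append(" OR ");
            sb.append('"').append(groups.get(i)).append('"');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(toFilterQuery(Arrays.asList("hr", "finance")));
        // security:("hr" OR "finance")
    }
}
```

Performance-wise this has a nice property in both single-node and distributed setups: fq clauses are cached in the filter cache independently of the main query, so users who share the same group set reuse the cached filter on each shard. Post filters become attractive when the allowed-document check cannot be expressed as an indexed field at all.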
docFreq coming to be more than 1 for unique id field
Hello All, We are using Solr 4.4.0. We have a uniqueKey of type solr.StrField. We need to extract docs in a pre-defined order if they match a certain condition. Our query is of the format uniqueField:(id1^weight1 OR id2^weight2 ... OR idN^weightN) where weight1 > weight2 > ... > weightN. But the result is not in the desired order. On debugging the query we found that for some of the documents docFreq is higher than 1, and hence their tf-idf based score is lower than the others'. What can be the reason behind a unique id field having docFreq greater than 1? How can we prevent it? -- Thanks & Regards, Apoorva
RE: docFreq coming to be more than 1 for unique id field
Hi - did you perhaps update one of those documents?
Re: docFreq coming to be more than 1 for unique id field
Hi, Just a guess: do you have deletions? What happens when you optimize and re-try?
Re: docFreq coming to be more than 1 for unique id field
Yes, we have updates on these. I didn't try optimizing; will do. But isn't the unique field supposed to be unique?

On Tue, Jun 17, 2014 at 8:37 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi, Just a guess: do you have deletions? What happens when you optimize and re-try?

-- Thanks & Regards, Apoorva
RE: docFreq coming to be more than 1 for unique id field
Yes, it is unique, but old versions of updated documents are not immediately purged; that happens only when you optimize (forceMerge) or during regular segment merges. The problem is that in the meantime they keep skewing the statistics.
Re: docFreq coming to be more than 1 for unique id field
Will try optimizing and then respond to the thread.

On Tue, Jun 17, 2014 at 8:47 PM, Markus Jelsma markus.jel...@openindex.io wrote: Yes, it is unique but they are not immediately purged, only when optimized (forceMerge) or during regular segment merges.

-- Thanks & Regards, Apoorva
Re: docFreq coming to be more than 1 for unique id field
Personally, although I understand the rationale and the performance ramifications of the current approach of including deleted documents, I agree that DF and IDF should definitely be accurate despite deletions. So, if they aren't, I'd suggest filing a JIRA bug. Granted, it might be rejected as "by design" or "won't fix", or reclassified as an improvement, but it's worth having the discussion. Maybe one theory from the old days is that the batch-update model would by definition include an optimize step. But now, with Solr considered by some to be a NoSQL database and with (near) real-time updates, that model is clearly obsolete. -- Jack Krupansky
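To see concretely why a stale copy reorders the results: Lucene's classic similarity (DefaultSimilarity in 4.x) computes idf as 1 + ln(numDocs / (docFreq + 1)), so an id whose deleted old version still counts (docFreq 2 instead of 1) gets a lower idf and thus a lower score, which can override a modest per-clause boost difference. A small sketch of the arithmetic (formula paraphrased from the 4.x similarity; check your Similarity if you've customized it):

```java
// Why a leftover (deleted-but-not-yet-purged) copy of a document changes
// the ordering of a boosted id query: the classic Lucene idf formula.
public class IdfSketch {
    public static double idf(long numDocs, long docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000;
        // A truly unique id vs. an id whose deleted old version still counts:
        System.out.println(idf(numDocs, 1)); // higher idf, higher score
        System.out.println(idf(numDocs, 2)); // lower idf, lower score
    }
}
```

This is why optimizing (which purges deleted documents and resets docFreq to 1) restores the expected ordering in the scenario discussed above.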
Re: Searching for string having apostrophe
I really have to ask why you want to search for apostrophes. Usually these are considered junk characters and are best ignored. Best, Erick

On Tue, Jun 17, 2014 at 6:03 AM, Shawn Heisey s...@elyograg.org wrote: I would bet that you have a proxy or a load balancer in front of Solr that is treating your input as a possible attack and denying it.
solrj error
Hi, I am using SolrJ 4.6 for accessing Solr 4.6. As a test case for my application, I created a servlet which holds the SolrJ connection via ZooKeeper. When I run the test, I get a weird stack trace: the test fails on not finding Java's currency data file. I believe this file used to be present in Java 1.6. Is SolrJ 4.6 somehow coupled to Java 1.6? Any other ideas?

Caused by: java.lang.InternalError
    at java.util.Currency$1.run(Currency.java:224)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.util.Currency.<clinit>(Currency.java:192)
    at java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:585)
    at java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:94)
    at java.text.DecimalFormatSymbols.getInstance(DecimalFormatSymbols.java:157)
    at java.text.NumberFormat.getInstance(NumberFormat.java:767)
    at java.text.NumberFormat.getIntegerInstance(NumberFormat.java:439)
    at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:664)
    at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:585)
    at org.apache.solr.common.util.DateUtil$ThreadLocalDateFormat.<init>(DateUtil.java:187)
    at org.apache.solr.common.util.DateUtil.<clinit>(DateUtil.java:179)
    at org.apache.solr.client.solrj.util.ClientUtils.<clinit>(ClientUtils.java:193)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:565)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:310)
    at com.qbase.gsn.SearchServlet.doGet(SearchServlet.java:121)
    ... 21 more
Caused by: java.io.FileNotFoundException: /opt/jdk1.7.0_25/lib/currency.data (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:97)
    at java.util.Currency$1.run(Currency.java:198)
    ... 37 more

Thanks, Vivek

P.S.: I tried to force /opt/jdk1.7 to be java.home, thinking the execution path would change, but the bug remained. Also, there is no Java 1.6 on the machine.
Re: solrj error
Clearly you're going to need to deposit 25 cents to make that call. :) More seriously, I'm wondering if most of the issue is environment-related, since it seems to be looking for that file on your system based on the path. I checked my machine and it doesn't have a $JAVA_HOME/lib/currency.data file either. Is it possible that you have somehow used a mismatched JAVA_HOME and tools.jar? Michael Della Bitta Applications Developer o: +1 646 532 3062 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions w: appinions.com http://www.appinions.com/

On Tue, Jun 17, 2014 at 12:03 PM, Vivek Pathak vpat...@orgmeta.com wrote: I am using SolrJ 4.6 for accessing Solr 4.6. When I run the test, I get a weird stack trace: the test fails on not finding Java's currency data file.
Re: solrj error
Thanks Michael. This indeed had something to do with the environment: mvn test was running the test case without a properly initialized environment. Once I added System.setProperty("java.home", "/opt/jdk.../jre/"), it started finding currency.data and moved on. So it is clearly not Solr. But if you have an idea of how to properly initialize the system properties in mvn test, please do point it out. Thanks...

On Tue, Jun 17, 2014 at 12:16 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote: [...]
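A follow-up note for anyone hitting the same thing: mvn test runs tests in a forked JVM, so properties set in the shell don't always reach the test process. One common way to pass them is through the Surefire plugin configuration — a minimal sketch, assuming maven-surefire-plugin and a placeholder JDK path (adjust both to your environment):

```xml
<!-- In pom.xml, under <build><plugins>. The path below is a placeholder. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <!-- Made visible to every test via System.getProperty("java.home") -->
      <java.home>/opt/jdk1.7.0_25/jre</java.home>
    </systemPropertyVariables>
  </configuration>
</plugin>
```

This keeps the workaround out of the test code itself, rather than calling System.setProperty() in a test setup method.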
Re: docFreq coming to be more than 1 for unique id field
All index-wide statistics (like the docFreq of each term) are over the entire index, which includes deleted docs -- because it's an *inverted* index, it's not feasible to update those statistics to account for deleted docs (that would basically kill all the performance advantages that come from having an inverted index).

: uniqueField:(id1 ^ weight1 OR id2 ^ weight2 ... OR idN ^ weightN)
: where weight1 > weight2 > ... > weightN
:
: But the result is not in the desired order. On debugging the query we've

If you are requesting a small number of docs, and all the docs you are requesting are returned in a single request, why do you care what order they are in? Why not just put them in the order you want on the client? That would not only make your Solr request simpler, but would almost certainly be a bit *faster*, since you could sort exactly as you wanted without needing to compute a complex score that you don't actually care about.

-Hoss
http://www.lucidworks.com/
Re: group.ngroups is set to an incorrect value - specific field types
Hi, I see a similar problem in our Solr application. Sometimes it gives the number in a group as the number of all documents. This started to happen after an upgrade from 4.6.1 to 4.8.1. Thanks. Alex.

-----Original Message-----
From: 海老澤 志信 shinobu_ebis...@waku-2.com
To: solr-user solr-user@lucene.apache.org
Sent: Tue, Jun 17, 2014 5:24 am
Subject: RE: group.ngroups is set to an incorrect value - specific field types

Hi all, Could anyone comment on my bug report? Regards, Ebisawa

-----Original Message-----
From: 海老澤 志信
Sent: Friday, June 13, 2014 7:45 PM
To: 'solr-user@lucene.apache.org'
Subject: group.ngroups is set to an incorrect value - specific field types

Hi, I'm using Solr version 4.1. I found a bug in group.ngroups, so could anyone kindly take a look at my bug report? If I specify a field of type Double as group.field, the value of group.ngroups is set to an incorrect value.

[Condition]
- A Double field is used as group.field
- Some documents do not have the field which is used as group.field

[Sample query and example]
---
solr/select?q=*:*&group=true&group.ngroups=true&group.field=Double_Field
* Double_Field is defined as solr.TrieDoubleField type.
---
When 4 documents have the group.field and 6 documents do not, the query returns a group.ngroups of 10. But I think group.ngroups should rightly be 5 in this case.

[Root cause]
It seems there is a bug in the Lucene source code. There is a function that compares whether two groups contain the same group.field value: MutableValueDouble.compareSameType(). See below the point which seems to be the root cause:
-
if (!exists) return -1;
if (!b.exists) return 1;
-
If exists is false, it returns -1. But I think it should return 0 when exists and b.exists are equal.

[Similar problem]
There is a similar problem in MutableValueBool.compareSameType(). Therefore, when you group on a field of type Boolean (solr.BoolField), the value of group.ngroups is always 0 or 1.
[Solution]
I propose the following modification to MutableValueDouble.compareSameType():
===
--- MutableValueDouble.java
+++ MutableValueDouble.java
@@ -54,9 +54,8 @@
     MutableValueDouble b = (MutableValueDouble)other;
     int c = Double.compare(value, b.value);
     if (c != 0) return c;
-    if (!exists) return -1;
-    if (!b.exists) return 1;
-    return 0;
+    if (exists == b.exists) return 0;
+    return exists ? 1 : -1;
   }
===
And the following modification to MutableValueBool.compareSameType():
===
--- MutableValueBool.java
+++ MutableValueBool.java
@@ -52,7 +52,7 @@
   @Override
   public int compareSameType(Object other) {
     MutableValueBool b = (MutableValueBool)other;
-    if (value != b.value) return value ? 1 : 0;
+    if (value != b.value) return value ? 1 : -1;
     if (exists == b.exists) return 0;
     return exists ? 1 : -1;
   }
===
Thanks, Ebisawa
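To make the reported contract violation concrete, here is a minimal standalone sketch (hypothetical classes mimicking the logic quoted in the report, not the actual Lucene sources) contrasting the current comparator with the proposed fix. When two values are equal and neither "exists", the buggy version returns -1 in both directions, so the two can never compare equal — which is why every missing value ends up counted as its own group.

```java
// Hypothetical reproduction of the compareSameType() logic, not Lucene code.
class MutableDouble {
    double value;
    boolean exists;

    MutableDouble(double value, boolean exists) {
        this.value = value;
        this.exists = exists;
    }

    // Logic as reported: when values tie and this side doesn't exist,
    // it returns -1 regardless of the other side.
    int compareSameTypeBuggy(MutableDouble b) {
        int c = Double.compare(value, b.value);
        if (c != 0) return c;
        if (!exists) return -1;
        if (!b.exists) return 1;
        return 0;
    }

    // Proposed fix: equal existence means the two compare equal.
    int compareSameTypeFixed(MutableDouble b) {
        int c = Double.compare(value, b.value);
        if (c != 0) return c;
        if (exists == b.exists) return 0;
        return exists ? 1 : -1;
    }
}

public class CompareDemo {
    public static void main(String[] args) {
        MutableDouble a = new MutableDouble(0.0, false);
        MutableDouble b = new MutableDouble(0.0, false);
        // Buggy: both directions claim "less than" -- an inconsistent ordering.
        System.out.println(a.compareSameTypeBuggy(b)); // -1
        System.out.println(b.compareSameTypeBuggy(a)); // -1
        // Fixed: two missing values collapse into a single group.
        System.out.println(a.compareSameTypeFixed(b)); // 0
    }
}
```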
Re: docFreq coming to be more than 1 for unique id field
Currently we are not using SolrJ but are simply interacting with Solr with JSON over HTTP. This will change in a couple of months, but we are not there yet. As of now we are putting all the logic in query building, using it to query Solr and then passing on the JSON returned by it to the front end. I know this is not the ideal approach, but that's what we have at the moment. Hence we need a way to deterministically order the result set, provided the documents match the other search criteria.

On Tue, Jun 17, 2014 at 10:28 PM, Chris Hostetter hossman_luc...@fucit.org wrote: [...]

--
Thanks & Regards, Apoorva
Creating new replicas, replication reports false positive success
I have a large SolrCloud collection that I'm trying to add replicas to for existing shards. I've tried the Collections API via ADDREPLICA:

curl "http://collection1-2d.i.corp:8983/solr/admin/collections?action=ADDREPLICA&collection=insights1&shard=1402358400&node=collection1-2d.i.corp:8983_solr"

I've also tried adding a replica via the Core Admin API, though I can't find documentation on the CoreAdmin call I'm using:

curl "http://collection1-2d.i.corp:8983/solr/admin/cores?action=CREATE&name=collection1_1402358400_replica1&collection=collection1&shard=1402358400"

Logs are in the gist below, as well as the clusterstate for the shard in question, which describes what I also see via the UI -- the newly created replica erroneously thinks it has fully replicated.

https://gist.github.com/ralph-tice/18796de6393f48fb0192

The logs are from after issuing a REQUESTRECOVERY call. The only message on the leader after that call is:

[qtp1728790703-173] org.apache.solr.handler.admin.CoreAdminHandler Leader collection1_1402358400_replica1 ignoring request to be in the recovering state because it is live and active.

Thanks for any help or insight! Let me know if any further information is required. --Ralph
Re: docFreq coming to be more than 1 for unique id field
: Currently we are not using SolrJ but are simply interacting with solr with
: json over http, this will change in a couple of months but currently not
: there. As of now we are putting all the logic in query building, using it
: to query solr and then passing on the json returned by it to front end. I
: know this is not the ideal approach, but that's what we have at the moment.
: Hence need a way of deterministically order the result set provided they
: match other search criteria.

Whether you are using SolrJ or not doesn't really change my point at all -- you are jumping through all sorts of hoops, and asking Solr to jump through all sorts of hoops, for a score you don't actually care about, and which isn't going to work perfectly for what you want anyway because of the fundamental nature of the inverted index stats, leading you to look for even smaller, higher hoops to try to jump through. It would be far simpler to just ask for the exact set of N documents you want from Solr in default order, re-order the resulting documents in the magic order you already know and care about, and then give that modified response to your front end.

-Hoss
http://www.lucidworks.com/
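The client-side reordering suggested above can be sketched in a few lines. This is a hypothetical helper (the method name, the id list, and the weight map are assumptions for illustration, not from the thread), assuming the client already knows the id-to-weight mapping it encoded into the boosts:

```java
import java.util.*;

public class ClientSideOrder {
    // Re-order the documents returned by Solr using a weight map the
    // client already knows, instead of encoding weights as query boosts.
    static List<String> orderByWeight(List<String> returnedIds,
                                      Map<String, Double> weights) {
        List<String> sorted = new ArrayList<>(returnedIds);
        // Highest weight first; ids without a weight sink to the end.
        sorted.sort(Comparator.comparingDouble(
                (String id) -> weights.getOrDefault(id, Double.NEGATIVE_INFINITY))
            .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new HashMap<>();
        weights.put("id1", 3.0);
        weights.put("id2", 2.0);
        weights.put("id3", 1.0);
        // Solr may return the matching docs in any order...
        List<String> fromSolr = Arrays.asList("id2", "id3", "id1");
        // ...but the client re-imposes the order it wants deterministically.
        System.out.println(orderByWeight(fromSolr, weights)); // [id1, id2, id3]
    }
}
```

The same idea works on the raw JSON response: sort the docs array by the weight map before handing it to the front end, and drop the boosts from the query entirely.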
Re: docFreq coming to be more than 1 for unique id field
OK, let's for a moment forget about this specific use case and consider a more general one. Let's say the field name is keywords and we are storing text in it, and the query is of the type keywords:(word1 OR word2 ... OR wordN). The client is relying on the default relevancy-based sort returned by Solr. Some documents can get penalised because of other documents which were deleted. Is this behavior correct?

On Wed, Jun 18, 2014 at 12:52 AM, Chris Hostetter hossman_luc...@fucit.org wrote: [...]

--
Thanks & Regards, Apoorva
Re: docFreq coming to be more than 1 for unique id field
: text in it, query is of the type keywords:(word1 OR word2 ... OR wordN).
: The client is relying on default relevancy based sort returned by solr.
: Some documents can get penalised because of some other documents which were
: deleted. Is this functionality correct?

Yes, because term stats are over the entire index, including deleted documents still in segments -- information about deletions isn't purged from the index until a segment is merged and the stats are recomputed over the docs/terms in the new segment. The only way to get those types of statistics at request time such that they were *not* affected by deleted documents would involve scanning every doc to compute them -- which would defeat the point of having the inverted index.

-Hoss
http://www.lucidworks.com/
Spell checker - limit on number of misspelt words in a search term.
Hi All, I am using the direct spell checker component and I have collate=true in my solrconfig.xml. The issue I noticed is that when I have a search term with up to two words in it and both of them are misspelled, I get a collation query as a suggestion in the spellchecker output. If I increase the search term length to three words and spell all of them incorrectly, then I do not get a collation query in the spell checker suggestions. Is there a setting in the solrconfig.xml file that controls this behavior by restricting collation suggestions to search terms with at most two misspelled words? If so, I would need to change that property. Can anyone please let me know how to do so? Thanks. Sent from my mobile.
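For anyone investigating the same behavior: the collation-related parameters usually live in the spellcheck-enabled request handler's defaults in solrconfig.xml. A hedged sketch of the parameters worth experimenting with (the values below are illustrative, and whether raising spellcheck.maxCollationTries resolves this particular three-word case is an assumption, not confirmed in the thread):

```xml
<!-- Inside the request handler's <lst name="defaults"> section. -->
<str name="spellcheck.collate">true</str>
<!-- How many correction candidates per misspelled word feed into collations;
     with more misspelled words, the combination space grows quickly. -->
<str name="spellcheck.count">10</str>
<!-- Upper bound on how many candidate collations are tested against the index. -->
<str name="spellcheck.maxCollationTries">10</str>
<!-- How many collations to return in the response. -->
<str name="spellcheck.maxCollations">5</str>
```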