Re: how to avoid duplicates in search results?
You can probably use the Grouping feature:
http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters

There is also Document Duplicate Detection at index time:
http://wiki.apache.org/solr/Deduplication

On Tue, Oct 4, 2011 at 9:55 AM, nagarjuna wrote:
> Hi everybody,
> I got the following response:
>
> -
> -
> 0
> 0
> -
> groups
> on
> 0
> participate
> 2.2
> 30
>
> -
> -
> testing group
> testing group
> name="url">http://abc.xyz.com/groups/testing-group/discussions/62
>
> -
> testing group
> testing group
> name="url">http://abc.xyz.com/groups/testing-group/discussions/62
>
> I need to remove the duplicate results.
>
> Can anyone give me suggestions?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-avoid-duplicates-in-search-results-tp3392524p3392524.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
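A minimal sketch of what a grouped request could look like, assuming the two duplicate documents share the same value in a field called url (as the quoted response suggests) and that the Solr version in use supports the group and group.field parameters described on the FieldCollapsing wiki page. The class name, host, port and query text are placeholders of mine, and the code only builds the URL; it does not contact Solr.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds the URL for a grouped (field-collapsed) select request.
final class GroupedQueryUrl {

    // Percent-encodes a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // baseUrl, query and groupField are supplied by the caller (assumed values).
    static String build(String baseUrl, String query, String groupField) {
        return baseUrl + "?q=" + encode(query)
                + "&group=true"
                + "&group.field=" + encode(groupField);
    }

    public static void main(String[] args) {
        // Collapse results that share the same url value into one group.
        System.out.println(build("http://localhost:8983/solr/select", "testing group", "url"));
    }
}
```

With grouping on, documents that share a url value come back as one group instead of as separate hits. Whether the url field is suitable for grouping depends on how it is defined in schema.xml, so check that before relying on this.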
Re: Problem with Filter Query
As far as I know, if you add multiple FQs they will always be joined with AND.
You can do something like:

fq={!q.op=OR df=supplierName}first second third ...

HTH
Edo

On Thu, Jul 14, 2011 at 3:50 PM, Kissue Kissue wrote:
> No, it's not a multivalue field. Yes, I can see that it looks like it's doing an
> AND on all the filter values, but how can I get it to do an OR?
> I just want it to return documents that have any of the supplied values as
> their supplier name.
>
> I have also tried: solrQuery.addFilterQuery(arrayOfSupplierNames) and I get
> no results too.
>
> Thanks.
>
> On Thu, Jul 14, 2011 at 3:06 PM, Edoardo Tosca wrote:
>
> > So with
> > &fq=supplierName:first&fq=supplierName:second
> > you don't get any results?
> >
> > Is this field a multivalue?
> > Multiple FQs are evaluated as AND,
> > so your document must have both "first" and "second" in supplierName.
> >
> > Edo
> >
> > On Thu, Jul 14, 2011 at 3:00 PM, Kissue Kissue wrote:
> >
> > > Thanks for your response.
> > >
> > > Actually the elements are composed as follows:
> > > &fq=first&fq=second
> > >
> > > But using the Solr admin query screen I have modified the query to:
> > > &fq=supplierName:first&fq=supplierName:second....
> > > I still get the same results.
> > >
> > > I will try to use solrQuery.addFilterQuery(arrayOfSupplierNames) like you
> > > suggested and see how it goes.
> > >
> > > Thanks.
> > >
> > > On Thu, Jul 14, 2011 at 2:49 PM, Edoardo Tosca <e.to...@sourcesense.com> wrote:
> > >
> > > > Hi,
> > > > have you tried with:
> > > > solrQuery.addFilterQuery(arrayOfSupplierNames) ?
> > > >
> > > > Other question: is every element of your array composed in this way?
> > > > supplierName:FIRST
> > > > supplierName:SECOND
> > > > etc.
> > > >
> > > > HTH
> > > > edo
> > > >
> > > > On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am using Solr 3.1 with SolrJ. I have a field called supplierName in my
> > > > > index which I am trying to do filtering on. When I select about 5 suppliers
> > > > > to filter on at the same time and use their supplier names to construct a
> > > > > filter query I do not get any results, but when I filter with each
> > > > > individual supplier name I get the required results.
> > > > >
> > > > > Here is the line of code that I used to construct the filter query:
> > > > >
> > > > > solrQuery.setParam("fq", arrayOfSupplierNames);
> > > > >
> > > > > The supplier name field is stored as a string in the index and here is the
> > > > > config for the string type from my schema.xml file:
> > > > >
> > > > > sortMissingLast="true"
> > > > > omitNorms="true"/>
> > > > >
> > > > > Any help why this is happening will be much appreciated.
> > > > >
> > > > > Thanks.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
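A rough sketch of how the single OR filter suggested in this reply could be assembled from the array of supplier names. The class and method names are made up for illustration, and the sketch assumes the names contain no whitespace or query-syntax characters; real supplier names would need quoting or escaping first.

```java
// Sketch only: builds one fq value that ORs several terms against a default
// field, using the {!q.op=OR df=...} local-params form from the reply above.
final class OrFilterQuery {

    // Assumes the values need no escaping (no spaces, quotes, colons, etc.).
    static String build(String field, String... values) {
        StringBuilder fq = new StringBuilder("{!q.op=OR df=" + field + "}");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                fq.append(' ');
            }
            fq.append(values[i]);
        }
        return fq.toString();
    }

    public static void main(String[] args) {
        // prints {!q.op=OR df=supplierName}first second third
        System.out.println(build("supplierName", "first", "second", "third"));
    }
}
```

The resulting string would be passed as one filter query, for example as the only argument to solrQuery.addFilterQuery(...), so that Solr sees a single fq rather than one fq per supplier.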
Re: Upgrading solr from 1.4 to latest version
I think that at the moment there isn't any Ubuntu package available with Solr 3.x.
My suggestion is to uninstall it (via apt-get) and "install" Solr manually in /opt or wherever you want.
After all, all you have to do is extract the zipped archive.

Edo

On Wed, Jul 13, 2011 at 1:35 AM, rvidela wrote:
> Hi,
>
> I am new to Solr. In little time, I am very much impressed with its search
> performance. I have installed Solr on Ubuntu using the "apt-get install
> solr-tomcat curl -y" command. From the admin page, I can see that the Solr version
> is 1.4.1. But I see there is a 3.x version already available. Just wondering
> if there is any easy way to upgrade it to the latest version.
>
> Tried specifying the version number in apt-get, but it does not work.
> Appreciate your help.
>
> Thanks
> Ravi
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Upgrading-solr-from-1-4-to-latest-version-tp3164312p3164312.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
Re: Problem with Filter Query
So with
&fq=supplierName:first&fq=supplierName:second
you don't get any results?

Is this field a multivalue?
Multiple FQs are evaluated as AND,
so your document must have both "first" and "second" in supplierName.

Edo

On Thu, Jul 14, 2011 at 3:00 PM, Kissue Kissue wrote:
> Thanks for your response.
>
> Actually the elements are composed as follows:
> &fq=first&fq=second
>
> But using the Solr admin query screen I have modified the query to:
> &fq=supplierName:first&fq=supplierName:second
> I still get the same results.
>
> I will try to use solrQuery.addFilterQuery(arrayOfSupplierNames) like you
> suggested and see how it goes.
>
> Thanks.
>
> On Thu, Jul 14, 2011 at 2:49 PM, Edoardo Tosca wrote:
>
> > Hi,
> > have you tried with:
> > solrQuery.addFilterQuery(arrayOfSupplierNames) ?
> >
> > Other question: is every element of your array composed in this way?
> > supplierName:FIRST
> > supplierName:SECOND
> > etc.
> >
> > HTH
> > edo
> >
> > On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue wrote:
> >
> > > Hi,
> > >
> > > I am using Solr 3.1 with SolrJ. I have a field called supplierName in my
> > > index which I am trying to do filtering on. When I select about 5 suppliers
> > > to filter on at the same time and use their supplier names to construct a
> > > filter query I do not get any results, but when I filter with each
> > > individual supplier name I get the required results.
> > >
> > > Here is the line of code that I used to construct the filter query:
> > >
> > > solrQuery.setParam("fq", arrayOfSupplierNames);
> > >
> > > The supplier name field is stored as a string in the index and here is the
> > > config for the string type from my schema.xml file:
> > >
> > > sortMissingLast="true"
> > > omitNorms="true"/>
> > >
> > > Any help why this is happening will be much appreciated.
> > >
> > > Thanks.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
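To illustrate why several fq parameters on a single-valued field return nothing, here is a toy model in plain Java. It is not Solr code: each document is reduced to its one supplierName value, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: each "document" is just its single supplierName value.
final class FilterSemantics {

    // Several fq parameters behave like AND: a document must equal every filter value.
    static List<String> matchAll(List<String> docs, List<String> filters) {
        List<String> hits = new ArrayList<String>();
        for (String doc : docs) {
            boolean matchesEvery = true;
            for (String filter : filters) {
                if (!doc.equals(filter)) {
                    matchesEvery = false;
                    break;
                }
            }
            if (matchesEvery) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // A single OR filter: a document only has to equal one of the values.
    static List<String> matchAny(List<String> docs, List<String> filters) {
        List<String> hits = new ArrayList<String>();
        for (String doc : docs) {
            if (filters.contains(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("first", "second", "third");
        List<String> filters = Arrays.asList("first", "second");
        System.out.println(matchAll(docs, filters)); // prints []
        System.out.println(matchAny(docs, filters)); // prints [first, second]
    }
}
```

A single-valued field can never equal two different names at once, so the AND case is empty as soon as two distinct suppliers are selected, which matches what was reported in the thread.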
Re: Problem with Filter Query
Hi,
have you tried with:
solrQuery.addFilterQuery(arrayOfSupplierNames) ?

Other question: is every element of your array composed in this way?
supplierName:FIRST
supplierName:SECOND
etc.

HTH
edo

On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue wrote:
> Hi,
>
> I am using Solr 3.1 with SolrJ. I have a field called supplierName in my
> index which I am trying to do filtering on. When I select about 5 suppliers
> to filter on at the same time and use their supplier names to construct a
> filter query I do not get any results, but when I filter with each
> individual supplier name I get the required results.
>
> Here is the line of code that I used to construct the filter query:
>
> solrQuery.setParam("fq", arrayOfSupplierNames);
>
> The supplier name field is stored as a string in the index and here is the
> config for the string type from my schema.xml file:
>
> omitNorms="true"/>
>
> Any help why this is happening will be much appreciated.
>
> Thanks.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
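A small sketch of the element format asked about above: turning raw supplier names into field-prefixed filter strings. The helper name is hypothetical, and wrapping each value in double quotes (with backslash escaping) is my own assumption so that names containing spaces stay a single value; the reply itself only shows the unquoted supplierName:FIRST form.

```java
// Sketch only: prefixes each raw value with the field name, e.g. supplierName:"FIRST".
final class FieldPrefixedFilters {

    // Quoting and escaping are an assumption added for names that contain spaces.
    static String[] prefix(String field, String[] values) {
        String[] filters = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            String escaped = values[i].replace("\\", "\\\\").replace("\"", "\\\"");
            filters[i] = field + ":\"" + escaped + "\"";
        }
        return filters;
    }

    public static void main(String[] args) {
        String[] filters = prefix("supplierName", new String[] {"FIRST", "Acme Ltd"});
        for (String filter : filters) {
            System.out.println(filter);
        }
    }
}
```

Each resulting string is a complete filter on its own. Passing the whole array to solrQuery.addFilterQuery(...) sends one fq per element.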
Re: about StandardAnalyzer in solr
Try changing from StandardTokenizerFactory to ClassicTokenizerFactory,
or create your own fieldType:

...

Edo

On Wed, Jul 13, 2011 at 3:40 PM, Kiwi de coder wrote:
> hi,
>
> I am using Solr 3.3, whose schema.xml contains this:
>
>
> I use the sentence "XY&Z Corporation - x...@example.com" as an example.
>
> However, when I try it on /analysis.jsp, it shows a different result compared to
> using Lucene.
>
> Using Solr I got the result below with both "text_standard" and "text_general"
> (are both the same?):
>
> XYZCorporationxyzexample.com (which all belong to
> )
>
> When using Lucene, I got this:
>
> StandardAnalyzer:
>
> 1: [xy&z:0->4:]
> 2: [corporation:5->16:]
> 3: [x...@example.com:19->34:]
>
> So my question is, how do I make the analysis work like in Lucene?
>
> regards,
> kiwi

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
Re: Multiple indexes
Try to use multiple cores:
http://wiki.apache.org/solr/CoreAdmin

On Wed, Jun 15, 2011 at 5:55 PM, shacky wrote:
> Hi.
>
> How to have multiple indexes in SOLR, with different fields and
> different types of data?
>
> Thank you very much!
> Bye.

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
Re: AlternateDistributedMLT.patch not working
Hi all,
I am currently working on this AlternateDistributedMLT patch.
I've applied it manually on Solr 1.4 and solved some NullPointerException issues.
It's now working properly.

But I'm not sure about its behaviour, so I'll ask you, list:
I saw that every MLT query for a doc in the result set runs only on its own shard (the one whose index contains the doc).
This means that you can miss documents that are probably related to the doc but are not retrieved because they belong to other shards.

Does it make sense? Is it the expected behaviour?
If it is, I can submit the patch so that at least it works on Solr 1.4.0.

Thanks,
Edo

On Wed, Feb 23, 2011 at 6:53 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Hi Isha,
>
> The patch is out of date. You need to look at the patch and rejection and
> update your local copy of the code to match the logic from the patch, if it's
> still applicable to the version of Solr source code you have.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
> - Original Message
> > From: Isha Garg
> > To: solr-user@lucene.apache.org
> > Sent: Tue, February 22, 2011 2:13:23 AM
> > Subject: AlternateDistributedMLT.patch not working
> >
> > Hello,
> >
> > I tried to use SOLR-788 with solr1.4 so that distributed MLT works well.
> > While working with this patch I got an error message like:
> >
> > 1 out of 1 hunk FAILED -- saving rejects to file
> > src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java.rej
> >
> > Can anybody help me out?
> >
> > Thanks!
> > Isha Garg

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
Re: Indexed, but cannot search
Hi,
I'm not sure if it is a typo; anyway, the second query you mentioned should be:
http://localhost:8983/solr/select/?q=type:*

HTH,
Edo

On Tue, Mar 1, 2011 at 4:06 PM, Brian Lamb wrote:
> Thank you for your reply but the searching is still not working out. For
> example, when I go to:
>
> http://localhost:8983/solr/select/?q=*%3A*<http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on>
>
> I get the following as a response:
>
> Mammal
> 1
> Canis
>
> (plus some other docs but one is enough for this example)
>
> But if I go to
> http://localhost:8983/solr/select/?q=type%3A<http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on>Mammal
>
> I only get:
>
>
> But it seems that should return at least the result I have listed above.
> What am I doing incorrectly?
>
> On Mon, Feb 28, 2011 at 6:57 PM, Upayavira wrote:
>
> > q=dog is equivalent to q=text:dog (where the default search field is
> > defined as text at the bottom of schema.xml).
> >
> > If you want to specify a different field, well, you need to tell it :-)
> >
> > Is that it?
> >
> > Upayavira
> >
> > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb" wrote:
> > > Hi all,
> > >
> > > I was able to get my installation of Solr indexed using dataimport.
> > > However, I cannot seem to get search working. I can verify that the data is
> > > there by going to:
> > >
> > > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > >
> > > This gives me the response:
> > > start="0">
> > >
> > > But when I go to
> > >
> > > http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > >
> > > I get the response:
> > >
> > > I know that dog should return some results because it is the first result
> > > when I select all the records. So what am I doing incorrectly that would
> > > prevent me from seeing results?
> >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source

--
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com
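Since the URLs in this thread mix raw and percent-encoded forms (type:* versus type%3A...), here is a small sketch that builds a field-qualified select URL with the q value encoded. The class name is invented, the base URL is the one quoted in the thread, and nothing here contacts Solr.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds a select URL whose q parameter is field:value, percent-encoded.
final class SelectUrl {

    // Percent-encodes a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // The field name and value must stay together in one q parameter.
    static String forFieldQuery(String baseUrl, String field, String value) {
        return baseUrl + "?q=" + encode(field + ":" + value);
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr/select/";
        System.out.println(forFieldQuery(base, "type", "Mammal")); // ...?q=type%3AMammal
        System.out.println(forFieldQuery(base, "type", "*"));      // ...?q=type%3A*
    }
}
```

Building the URL this way keeps the value attached to the field name, which is easy to lose when a link is pasted by hand.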
Re: Solr n00b question: writing a custom QueryComponent
Hi,
I agree with Upayavira: it's probably better to create an external app that retrieves content from a db.

Anyway, if I am not wrong, finishStage is a method called by the coordinator if you have a distributed search.
If your Solr is on a single machine, every component should implement only the prepare and process methods.

HTH.
Edo

On Tue, Feb 8, 2011 at 7:17 AM, Ishwar wrote:
> Hi all,
>
> Been a solr user for a while now, and now I need to add some functionality
> to solr for which I'm trying to write a custom QueryComponent. Couldn't get
> much help from websearch. So, turning to solr-user for help.
>
> I'm implementing search functionality for (micro)blog aggregation. We use
> solr 1.4.1. In the current solr config, the title and content fields are
> both indexed and stored in solr. Storing takes up a lot of space, even with
> compression. I'd like to store the title and description field in solr in
> mysql and retrieve these fields in results from MySQL with an id lookup.
>
> Using the DataImportHandler won't work because we store just the title and
> content fields in MySQL. The rest of the fields are in solr itself.
>
> I wrote a custom component by extending QueryComponent, and overriding only
> the finishStage(ResponseBuilder) function where I try to retrieve the
> necessary records from MySQL. This is how the new QueryComponent is
> specified in solrconfig.xml
>
> class="org.apache.solr.handler.component.TestSolr" />
>
> I see that the component is getting loaded from the solr debug output
>
> 1.0
>
> 0.0
>
> ...
>
> But the strange thing is that the finishStage() function is not being
> called before returning results. What am I missing?
>
> Secondly, functions like ResponseBuilder._responseDocs are visible only in
> the package org.apache.solr.handler.component. How do I access the results
> in my package?
>
> If you folks can give me links to a wiki or some sample custom
> QueryComponent, that'll be great.
>
> --
> Thanks in advance.
> Ishwar.
>
> Just another resurrected Neozoic Archosaur comics.
> http://www.flickr.com/photos/mojosaurus/sets/72157600257724083/
DebugComponent behaviour in a distributed environment
Hello everybody,
I have some doubts about the current behaviour of DebugComponent at coordinator level in a sharded environment.
I'm currently using Solr 1.4.

While testing our current system with debugQuery=on, I have seen that at coordinator level the timing element contains ridiculous values compared with the QTime value in the header.
It basically reports only a subset of the time spent executing the distributed query, and honestly I don't think that makes much sense.

After a quick debugging session I discovered that the timing is calculated only on the last request executed by the coordinator to every single node.
That request is the one that contains only specific docIds, and therefore the response time is usually fast.

Digging inside the code, I saw that the method called modifyRequest takes care of setting debugQuery=false during the first request from the coordinator to every node.
The question is: is there a specific reason why modifyRequest "turns off" debugQuery?

I have started changing the code of this component.
I've changed modifyRequest so that it never disables the debug.
Then I worked out how to retrieve timing values (divided per phase and component) for each node.
Every group of information is identified by the shard name.
I've set this information inside the standard timing element.

I don't know if this information can be useful to someone else (if so, I can provide a patch), but most importantly I would like to be sure that changing modifyRequest does not affect the search (it shouldn't, but I would really appreciate a confirmation).

Thank you,
Edo