Re: how to avoid duplicates in search results?

2011-10-04 Thread Edoardo Tosca
You can probably use the Grouping feature:
http://wiki.apache.org/solr/FieldCollapsing#Request_Parameters

There is also a Document Duplicate Detection at index time:
http://wiki.apache.org/solr/Deduplication
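
A minimal sketch of what a grouping request could look like, using the request parameters listed on the FieldCollapsing wiki page. The host, port, query term and the "url" field name are assumptions taken from the quoted response below, not a tested setup; as far as I know grouping needs Solr 3.3 or later.

```python
from urllib.parse import urlencode

# Assumed for illustration: a local Solr instance and a field called "url".
SOLR_SELECT = "http://localhost:8983/solr/select"


def grouped_query_url(query, group_field, base=SOLR_SELECT):
    """Build a select URL that collapses results sharing the same field value."""
    params = [
        ("q", query),
        ("group", "true"),             # turn result grouping on
        ("group.field", group_field),  # one group per distinct field value
        ("group.limit", "1"),          # keep a single document per group
    ]
    return base + "?" + urlencode(params)


print(grouped_query_url("participate", "url"))
```

With group.limit=1 each distinct url value comes back once, so the two identical "testing group" hits would collapse into a single group.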

On Tue, Oct 4, 2011 at 9:55 AM, nagarjuna wrote:

> Hi everybody
>  i got the following response
> 
> 
> - 
> - 
>  0
>  0
> - 
>  groups
>  on
>  0
>  participate
>  2.2
>  30
>  
>  
> - 
> - 
>  testing group
>  testing group
>   name="url">http://abc.xyz.com/groups/testing-group/discussions/62
>  
> - 
>  testing group
>  testing group
>   name="url">http://abc.xyz.com/groups/testing-group/discussions/62
>  
>  
>  
> 
>
> i need to remove the duplicate results
>
> can anyone give me suggestions
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-avoid-duplicates-in-search-results-tp3392524p3392524.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Problem with Filter Query

2011-07-14 Thread Edoardo Tosca
As far as I know, if you add multiple fq parameters they are always combined
with AND.
You can do something like:
fq={!q.op=OR df=supplierName}first second third ...
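
An alternative to the local-params form above is to put all the values into one fq with explicit ORs. This is only a sketch: the helper name is made up, and quoting every value is my assumption about what a string field with spaces in the names would need.

```python
def or_filter_query(field, values):
    """Return one fq value matching documents whose field equals any value."""
    if not values:
        raise ValueError("at least one value is required")
    # Quote each value so a name containing spaces stays a single term;
    # backslashes and double quotes inside a value are escaped.
    quoted = [
        '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for v in values
    ]
    return field + ":(" + " OR ".join(quoted) + ")"


print(or_filter_query("supplierName", ["first", "second", "third"]))
# supplierName:("first" OR "second" OR "third")
```

The resulting string would be passed as a single filter query, for example through one solrQuery.addFilterQuery(...) call, instead of one fq per supplier.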

HTH

Edo


On Thu, Jul 14, 2011 at 3:50 PM, Kissue Kissue  wrote:

> No it's not a multivalue field. Yes i can see that it looks like it's doing an
> AND on all the filter values but how can i get it to do an OR?
> I just want it to return documents that have any of the supplied values as
> their supplier name.
>
> I have also tried: solrQuery.addFilterQuery(arrayOfSupplierNames) and i get
> no results too.
>
> Thanks.
>
> On Thu, Jul 14, 2011 at 3:06 PM, Edoardo Tosca wrote:
>
> > So with
> > &fq=supplierName:first&fq=supplierName:second
> > you don't get any results?
> >
> > Is this field multivalued?
> > Multiple FQs are evaluated as AND,
> > so your document must have both "first" and "second" in supplierName.
> >
> > Edo
> >
> >
> > On Thu, Jul 14, 2011 at 3:00 PM, Kissue Kissue 
> > wrote:
> >
> > > Thanks for your response.
> > >
> > > Actually the elements are composed as follows:
> > > &fq=first&fq=second
> > >
> > > But using Solr admin query screen i have modified the query to:
> > > &fq=supplierName:first&fq=supplierName:second....
> > > i still get the same results.
> > >
> > > > I will try to use solrQuery.addFilterQuery(arrayOfSupplierNames) like you
> > > > suggested and see how it goes.
> > >
> > > Thanks.
> > >
> > >
> > > > On Thu, Jul 14, 2011 at 2:49 PM, Edoardo Tosca <e.to...@sourcesense.com> wrote:
> > >
> > > > Hi,
> > > > have you tried with:
> > > > solrQuery.addFilterQuery(arrayOfSupplierNames) ?
> > > >
> > > > other question, is every element of your array composed in this way:
> > > > supplierName:FIRST
> > > > supplierName:SECOND
> > > > etc..
> > > >
> > > > HTH
> > > > edo
> > > >
> > > > On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue 
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I am using Solr 3.1 with SolrJ. I have a field called supplierName
> > > > > > in my index which i am trying to do filtering on. When i select
> > > > > > about 5 suppliers to filter on at the same time and use their
> > > > > > supplier name to construct a filter query i do not get any results,
> > > > > > but when i filter with each individual supplier name i get the
> > > > > > required results.
> > > > > >
> > > > > > Here is the line of code that i used to construct the filter query:
> > > > > >
> > > > > > solrQuery.setParam("fq", arrayOfSupplierNames);
> > > > > >
> > > > > > The supplier name field is stored as a string in the index and here
> > > > > > is the config for the string type from my schema.xml file:
> > > > >
> > > > > 
> > > > > > > sortMissingLast="true"
> > > > > omitNorms="true"/>
> > > > >
> > > > > Any help why this is happening will be much appreciated.
> > > > >
> > > > > Thanks.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Edoardo Tosca
> > > > Sourcesense - making sense of Open Source:
> http://www.sourcesense.com
> > > >
> > >
> >
> >
> >
> > --
> > Edoardo Tosca
> > Sourcesense - making sense of Open Source: http://www.sourcesense.com
> >
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Upgrading solr from 1.4 to latest version

2011-07-14 Thread Edoardo Tosca
I think that at the moment there isn't any Ubuntu package available with
Solr 3.x.
My suggestion is to uninstall it (via apt-get) and "install" Solr manually
in /opt or wherever you want.
After all, all you have to do is extract the zipped archive.


Edo

On Wed, Jul 13, 2011 at 1:35 AM, rvidela  wrote:

> Hi,
>
> I am new to Solr. In little time, I am very much impressed with its search
> performance. I have installed Solr on Ubuntu using "*apt-get install
> solr-tomcat curl -y*" command. From admin page, I can see that solr version
> is 1.4.1. But i see there is 3.x version already available. Just wondering
> if there is any easy way to upgrade it to latest version.
>
> Tried specifying version number in apt-get, But it does not work.
> Appreciate
> your help.
>
> Thanks
> Ravi
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Upgrading-solr-from-1-4-to-latest-version-tp3164312p3164312.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Problem with Filter Query

2011-07-14 Thread Edoardo Tosca
So with
&fq=supplierName:first&fq=supplierName:second
you don't get any results?

Is this field multivalued?
Multiple FQs are evaluated as AND,
so your document must have both "first" and "second" in supplierName.
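
To make the AND behaviour concrete, here is a toy model in plain Python (not Solr code; the three documents are invented). Each fq is treated as the set of matching document ids, and separate fq parameters are intersected.

```python
# Hypothetical index: document id -> its single supplierName value.
docs = {1: "first", 2: "second", 3: "third"}


def matches(value):
    """Ids of documents whose supplierName equals the value (one fq)."""
    return {doc_id for doc_id, name in docs.items() if name == value}


# Two separate fq parameters: the result sets are intersected (AND).
separate = matches("first") & matches("second")

# One fq that ORs the values: the result sets are united.
combined = matches("first") | matches("second")

print(sorted(separate))  # []
print(sorted(combined))  # [1, 2]
```

With a single-valued field no document can hold both values at once, which is why the intersection comes back empty.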

Edo


On Thu, Jul 14, 2011 at 3:00 PM, Kissue Kissue  wrote:

> Thanks for your response.
>
> Actually the elements are composed as follows:
> &fq=first&fq=second
>
> But using Solr admin query screen i have modified the query to:
> &fq=supplierName:first&fq=supplierName:second
> i still get the same results.
>
> I will try to use solrQuery.addFilterQuery(arrayOfSupplierNames) like you
> suggested and see how it goes.
>
> Thanks.
>
>
> On Thu, Jul 14, 2011 at 2:49 PM, Edoardo Tosca wrote:
>
> > Hi,
> > have you tried with:
> > solrQuery.addFilterQuery(arrayOfSupplierNames) ?
> >
> > other question, is every element of your array composed in this way:
> > supplierName:FIRST
> > supplierName:SECOND
> > etc..
> >
> > HTH
> > edo
> >
> > On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue 
> > wrote:
> >
> > > Hi,
> > >
> > > I am using Solr 3.1 with SolrJ. I have a field called supplierName in
> > > my index which i am trying to do filtering on. When i select about 5
> > > suppliers to filter on at the same time and use their supplier name to
> > > construct a filter query i do not get any results, but when i filter
> > > with each individual supplier name i get the required results.
> > >
> > > Here is the line of code that i used to construct the filter query:
> > >
> > > solrQuery.setParam("fq", arrayOfSupplierNames);
> > >
> > > The supplier name field is stored as a string in the index and here is
> > > the config for the string type from my schema.xml file:
> > >
> > > 
> > > sortMissingLast="true"
> > > omitNorms="true"/>
> > >
> > > Any help why this is happening will be much appreciated.
> > >
> > > Thanks.
> > >



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Problem with Filter Query

2011-07-14 Thread Edoardo Tosca
Hi,
have you tried with:
solrQuery.addFilterQuery(arrayOfSupplierNames) ?

Another question: is every element of your array composed in this way?
supplierName:FIRST
supplierName:SECOND
etc..

HTH
edo

On Thu, Jul 14, 2011 at 2:18 PM, Kissue Kissue  wrote:

> Hi,
>
> I am using Solr 3.1 with SolrJ. I have a field called supplierName in my
> index which i am trying to do filtering on. When i select about 5 suppliers
> to filter on at the same time and use their supplier name to construct a
> filter query i do not get any results, but when i filter with each
> individual supplier name i get the required results.
>
> Here is the line of code that i used to construct the filter query:
>
> solrQuery.setParam("fq", arrayOfSupplierNames);
>
> The supplier name field is stored as a string in the index and here is the
> config for the string type from my schema.xml file:
>
> <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>
> Any help why this is happening will be much appreciated.
>
> Thanks.
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: about standardAnaylzer in solr

2011-07-13 Thread Edoardo Tosca
Try changing from StandardTokenizerFactory to ClassicTokenizerFactory, or
create your own fieldType:


  
**
...



Edo

On Wed, Jul 13, 2011 at 3:40 PM, Kiwi de coder  wrote:

> hi,
>
> I using solr 3.3 which in schema.xml contain this :
>
>
>  
>
>
> i use the sentences as example "XY&Z Corporation - x...@example.com"
>
> however, when I try on /analysis.jsp, it show difference result compare to
> using Lucene.
>
> using solr I got result below when using "text_standard" and "text_general"
> (is both the same ?)
>
> XYZCorporationxyzexample.com (which all belong to
> 
> )
>
> when using Lucene, i got this
>
>  StandardAnalyzer:
>
> 1: [xy&z:0->4:]
> 2: [corporation:5->16:]
> 3: [x...@example.com:19->34:]
>
>
> so my question is, how to make it analysis like in Lucene ?
>
> regards,
> kiwi
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Multiple indexes

2011-06-15 Thread Edoardo Tosca
Try to use multiple cores:
http://wiki.apache.org/solr/CoreAdmin
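
As a rough sketch, this is how the CoreAdmin CREATE call described on that wiki page could be assembled. The host, the /admin/cores path (it depends on adminPath in solr.xml) and the core names are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Assumed: multicore is enabled and adminPath is "/admin/cores".
CORE_ADMIN = "http://localhost:8983/solr/admin/cores"


def create_core_url(name, instance_dir, base=CORE_ADMIN):
    """Build the CoreAdmin URL that registers a new core."""
    params = [
        ("action", "CREATE"),
        ("name", name),                 # core name, used in later URLs
        ("instanceDir", instance_dir),  # directory holding conf/ and data/
    ]
    return base + "?" + urlencode(params)


# Hypothetical cores, each with its own schema.xml under <instanceDir>/conf.
for core in ("products", "articles"):
    print(create_core_url(core, core))
```

Each core keeps its own schema and index, and is then queried under its own path, e.g. /solr/products/select.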

On Wed, Jun 15, 2011 at 5:55 PM, shacky  wrote:

> Hi.
>
> How to have multiple indexes in SOLR, with different fields and
> different types of data?
>
> Thank you very much!
> Bye.
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: AlternateDistributedMLT.patch not working

2011-03-03 Thread Edoardo Tosca
Hi all,
I am currently working on this AlternateDistributedMLT patch.
I've applied it manually on Solr 1.4 and solved some NullPointerException
issues.
It's now working properly.

But I'm not sure about its behaviour, so I'll ask you, list:

I saw that every MLT query for a doc in the result set runs only on
its own shard (the one where the doc is indexed).
This means that you can miss documents that are probably related to the doc
but are not retrieved because they belong to other shards.

Does it make sense?
Is it the expected behaviour?

If it is, I can submit the patch so that at least it works on Solr 1.4.0.

Thanks,

Edo


On Wed, Feb 23, 2011 at 6:53 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi Isha,
>
> The patch is out of date.  You need to look at the patch and rejection and
> update your local copy of the code to match the logic from the patch, if it's
> still applicable to the version of Solr source code you have.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Isha Garg 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, February 22, 2011 2:13:23 AM
> > Subject: AlternateDistributedMLT.patch not working
> >
> > Hello,
> >
> >  I tried to use SOLR-788 with Solr 1.4 so that distributed MLT works well.
> > While working with this patch I got an error message like
> >
> > 1 out of 1 hunk FAILED -- saving rejects to file
> >src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java.rej
> >
> > Can  anybody help me out?
> >
> > Thanks!
> > Isha Garg
> >
> >
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Indexed, but cannot search

2011-03-01 Thread Edoardo Tosca
Hi,
I'm not sure if it is a typo; anyway, the second query you mentioned should
be:
http://localhost:8983/solr/select/?q=type:*
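
For reference, the %3A in the quoted URLs is just the URL-encoded colon, so a field query has the form field:value once decoded. A small sketch with the Python standard library (the "type" field and "Mammal" value come from the quoted message):

```python
from urllib.parse import urlencode, parse_qs

# Encode a field:value query the way a browser or client would send it.
encoded = urlencode({"q": "type:Mammal"})
print(encoded)  # q=type%3AMammal

# Decoding shows what Solr actually receives as the q parameter.
decoded = parse_qs(encoded)
print(decoded["q"][0])  # type:Mammal
```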

HTH,

Edo

On Tue, Mar 1, 2011 at 4:06 PM, Brian Lamb wrote:

> Thank you for your reply but the searching is still not working out. For
> example, when I go to:
>
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
>
> I get the following as a response:
>
> 
>  
>Mammal
>1
>Canis
>  
> 
>
> (plus some other docs but one is enough for this example)
>
> But if I go to http://localhost:8983/solr/select/?q=type%3AMammal
>
> I only get:
>
> 
>
> But it seems that should return at least the result I have listed above.
> What am I doing incorrectly?
>
> On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
>
> > q=dog is equivalent to q=text:dog (where the default search field is
> > defined as text at the bottom of schema.xml).
> >
> > If you want to specify a different field, well, you need to tell it :-)
> >
> > Is that it?
> >
> > Upayavira
> >
> > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> >  wrote:
> > > Hi all,
> > >
> > > I was able to get my installation of Solr indexed using dataimport.
> > > However,
> > > I cannot seem to get search working. I can verify that the data is
> there
> > > by
> > > going to:
> > >
> > >
> >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > >
> > > This gives me the response:  > > start="0">
> > >
> > > But when I go to
> > >
> > >
> >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > >
> > > I get the response: 
> > >
> > > I know that dog should return some results because it is the first
> result
> > > when I select all the records. So what am I doing incorrectly that
> would
> > > prevent me from seeing results?
> > >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
> >
> >
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Solr n00b question: writing a custom QueryComponent

2011-02-08 Thread Edoardo Tosca
Hi,

I agree with Upayavira: it's probably better to create an external app that
retrieves the content from the db.
Anyway, if I am not wrong,
finishStage is a method called by the coordinator when you run a distributed
search.

If your Solr runs on a single machine, every component only needs to implement
the prepare and process methods.
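
A toy model of that call order, written in Python rather than against the real Solr classes. The class and function names are invented, and the distributed branch is heavily simplified (the real coordinator loops over several stages and sends shard requests in between).

```python
class LoggingComponent:
    """Stand-in for a search component; it only records which hooks ran."""

    def __init__(self):
        self.calls = []

    def prepare(self):
        self.calls.append("prepare")

    def process(self):
        self.calls.append("process")

    def finish_stage(self):
        self.calls.append("finish_stage")


def handle_request(component, distributed):
    """Mimic, very roughly, which hooks a request handler invokes."""
    component.prepare()
    if distributed:
        # Coordinator: shard responses are merged, then finish_stage runs.
        component.finish_stage()
    else:
        # Single node: the component does its work in process.
        component.process()
    return component.calls


print(handle_request(LoggingComponent(), distributed=False))
# ['prepare', 'process']
print(handle_request(LoggingComponent(), distributed=True))
# ['prepare', 'finish_stage']
```

In this model a component that only overrides the finish-stage hook never does anything on a single-node request, which matches what Ishwar is seeing.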

HTH.

Edo



On Tue, Feb 8, 2011 at 7:17 AM, Ishwar  wrote:

> Hi all,
>
> Been a solr user for a while now, and now I need to add some functionality
> to solr for which I'm trying to write a custom QueryComponent. Couldn't get
> much help from websearch. So, turning to solr-user for help.
>
> I'm implementing search functionality for  (micro)blog aggregation. We use
> solr 1.4.1. In the current solr config, the title and content fields are
> both indexed and stored in solr. Storing takes up a lot of space, even with
> compression. I'd like to store the title and description field in solr in
> mysql and retrieve these fields in results from MySQL with an id lookup.
>
> Using the DataImportHandler won't work because we store just the title and
> content fields in MySQL. The rest of the fields are in solr itself.
>
> I wrote a custom component by extending QueryComponent, and overriding only
> the finishStage(ResponseBuilder) function where I try to retrieve the
> necessary records from MySQL. This is how the new QueryComponent is
> specified in solrconfig.xml
>
>  class="org.apache.solr.handler.component.TestSolr" />
>
>
> I see that the component is getting loaded from the solr debug output
> 
> 1.0
> 
> 0.0
> 
> ...
>
> But the strange thing is that the finishStage() function is not being
> called before returning results. What am I missing?
>
> Secondly, functions like ResponseBuilder._responseDocs are visible only in
> the package org.apache.solr.handler.component. How do I access the results
> in my package?
>
> If you folks can give me links to a wiki or some sample custom
> QueryComponent, that'll be great.
>
> --
> Thanks in advance.
> Ishwar.
>
>
> Just another resurrected Neozoic Archosaur comics.
> http://www.flickr.com/photos/mojosaurus/sets/72157600257724083/


DebugComponent behavour in a distributed environment

2010-10-01 Thread Edoardo Tosca
Hello everybody,
I have some doubts about the current behaviour of DebugComponent at
coordinator level in a sharded environment.
I'm currently using Solr 1.4.
While testing our system with debugQuery=on, I have seen that at
coordinator level the timing element contains ridiculous values when
compared with the QTime value in the response header.
It basically reports only a subset of the time spent executing the
distributed query, and honestly I don't think that makes much sense.

After a quick debugging session I discovered that the timing is
calculated only on the last request sent by the coordinator to each
node.
That request contains only specific docIds, and therefore its response
time is usually fast.
Digging into the code, I saw that the modifyRequest method sets
debugQuery=false on the first request from the coordinator to every
node.

The question is:
is there a specific reason why modifyRequest "turns off" debugQuery?

I have started changing the code of this component.
I've changed modifyRequest so that it never disables debugging.
Then I worked out how to retrieve the timing values (split by phase and
component) for each node. Every group of information is identified by the
shard name.
I've put this information inside the standard timing element.
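
To give an idea of the shape of the data, here is a sketch in Python of timings grouped by shard name. It is only an illustration: the shard addresses, component names and numbers are invented, and this is not the actual patch or Solr's response format.

```python
def merge_shard_timings(shard_timings):
    """Group timing values by shard and add a total per shard.

    shard_timings maps shard name -> phase -> component -> milliseconds.
    """
    merged = {}
    for shard, phases in shard_timings.items():
        total = sum(ms for comps in phases.values() for ms in comps.values())
        merged[shard] = {"time": total, **phases}
    return merged


# Hypothetical values for two shards.
timings = {
    "localhost:8983/solr": {
        "prepare": {"QueryComponent": 1.0},
        "process": {"QueryComponent": 40.0, "DebugComponent": 2.0},
    },
    "localhost:7574/solr": {
        "prepare": {"QueryComponent": 0.0},
        "process": {"QueryComponent": 55.0},
    },
}

for shard, info in merge_shard_timings(timings).items():
    print(shard, info["time"])
```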

I don't know if this information can be useful to someone else; if so, I
can provide a patch.
More importantly, I would like to be sure that changing modifyRequest does
not affect the search (it shouldn't, but I would really appreciate a confirmation).

Thank you,

Edo