Re: Near Duplicate Documents

2007-11-18 Thread rishabh9
Can anyone help me? Rishabh rishabh9 wrote: Hi, I am evaluating Solr 1.2 for my project and wanted to know if it can return near duplicate documents (near dups) and how do I go about it? I am not sure, but is MoreLikeThisHandler the implementation for near dups? Rishabh --

Re: Query multiple fields

2007-11-18 Thread Stuart Sierra
On Nov 18, 2007 1:50 AM, Dave C. [EMAIL PROTECTED] wrote: Maybe you can help me with this related problem I am having. My query is: q=description:(test)&&!(type:10)&&!(type:14). However, my results are not as expected (55 results instead of the expected 23). The response header shows:

Re: Near Duplicate Documents

2007-11-18 Thread Stuart Sierra
On Nov 18, 2007 10:50 AM, Eswar K [EMAIL PROTECTED] wrote: We have a scenario where we want to find documents which are similar in content. To elaborate a little more on what we mean here, let's take an example. The example of this email chain in which we are interacting can be best

Re: Near Duplicate Documents

2007-11-18 Thread Eswar K
We have a scenario where we want to find documents which are similar in content. To elaborate a little more on what we mean here, let's take an example. The example of this email chain in which we are interacting can be best used for illustrating the concept of near dupes (We are not

Re: Near Duplicate Documents

2007-11-18 Thread Eswar K
Is there any plan to implement that feature in the upcoming releases? Regards, Eswar On Nov 18, 2007 9:35 PM, Stuart Sierra [EMAIL PROTECTED] wrote: On Nov 18, 2007 10:50 AM, Eswar K [EMAIL PROTECTED] wrote: We have a scenario where we want to find documents which are similar in

Performance of Solr on different Platforms

2007-11-18 Thread Eswar K
Hi, I understand that Solr can be used on different Linux flavors. Is there any preferred flavor (like Red Hat, Ubuntu, etc.)? Also, what kind of hardware configuration (processors, RAM, etc.) would be best suited for the install? We expect to load it with millions of documents (varying from 2 -

Re: Near Duplicate Documents

2007-11-18 Thread Ryan McKinley
Eswar K wrote: We have a scenario where we want to find documents which are similar in content. To elaborate a little more on what we mean here, let's take an example. The example of this email chain in which we are interacting can be best used for illustrating the concept of near dupes

Finding all possible synonyms for a word

2007-11-18 Thread Kishore AVK. Veleti
Hi All, I am new to Lucene / SOLR and developing a POC as part of research. Check below my requirement and problem statement. Need help on how I can index the data such that I have a very good search functionality in my POC. --

RE: Query multiple fields

2007-11-18 Thread Stu Hood
q=description:(test)&&!(type:10)&&!(type:14) You can't use an '&' symbol in your query (without escaping it). The boolean operator for 'and' in Lucene is 'AND', and it is case sensitive. Your query should probably look like: q=description:test AND -type:10 AND -type:14 See the Lucene query
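
A minimal sketch of preparing the suggested query for a request URL, assuming a Python client and only the standard library. The query text is the one proposed in this reply; the variable names are illustrative, and no particular Solr host or handler path is assumed.

```python
from urllib.parse import urlencode

# Query suggested in the reply: AND is upper-case, exclusions use '-'.
lucene_query = "description:test AND -type:10 AND -type:14"

# urlencode percent-escapes reserved characters, so the whole expression
# travels as the value of a single 'q' parameter.
query_string = urlencode({"q": lucene_query})
print(query_string)
```

The resulting string can be appended after the `?` of whatever select URL your install exposes.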

Re: Payloads in Solr

2007-11-18 Thread Tricia Williams
Thanks for your comments, Yonik! All for it... depending on what one means by payload functionality of course. We should probably hold off on adding a new lucene version to Solr until the Payload API has stabilized (it will most likely be changing very soon). It sounds like Lucene 2.3 is

solrj users -- API feedback, suggestions, etc

2007-11-18 Thread Ryan McKinley
Hello- Solrj has been out there for a while, but is not yet baked into an official release. If there is anything major to change just so it feels better, now is the time. Here are a few things I'm thinking about: 1. The setFields() behavior Currently: query.setFields( name,id );

Re: Payloads, Tokenizers, and Filters. Oh My!

2007-11-18 Thread Tricia Williams
I apologize for cross-posting but I believe both Solr and Lucene users and developers should be concerned with this. I am not aware of a better way to reach both communities. In this email I'm looking for comments on: * Do TokenFilters belong in the Solr code base at all? * How to

Re: Query multiple fields

2007-11-18 Thread Yonik Seeley
On Nov 18, 2007 9:58 PM, Dave C. [EMAIL PROTECTED] wrote: According to the Lucene query syntax: The symbol && can be used in place of the word AND. So, I shouldn't have to use 'AND'. Yes, but before the query parser can even get the query string, the servlet container parses query args and
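
The effect described here can be sketched with Python's standard `urllib.parse`, used only as a stand-in for the servlet container's argument parsing (an assumption; the container's exact behavior may differ in detail). An unescaped `&` acts as a parameter separator, so the part of the expression after it never reaches the query parser as part of `q`.

```python
from urllib.parse import parse_qs, quote

raw = "description:(test)&&!(type:10)&&!(type:14)"

# Unescaped: '&' separates parameters, so 'q' is cut at the first '&'
# and the remaining pieces are not part of the query at all.
unescaped = parse_qs("q=" + raw)

# Escaped: '&' becomes %26, so the full expression survives as one value.
escaped = parse_qs("q=" + quote(raw, safe=""))

print(unescaped["q"][0])
print(escaped["q"][0])
```

With the exclusions dropped, only the `description` clause is applied, which is consistent with getting more results than expected.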

RE: Query multiple fields

2007-11-18 Thread Dave C .
okay thanks for the details - David Date: Sun, 18 Nov 2007 22:14:23 -0500 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: Re: Query multiple fields On Nov 18, 2007 9:58 PM, Dave C. [EMAIL PROTECTED] wrote: According to the Lucene query syntax: The symbol && can be

Re: Near Duplicate Documents

2007-11-18 Thread Mike Klaas
On 18-Nov-07, at 8:17 AM, Eswar K wrote: Is there any plan to implement that feature in the upcoming releases? Not currently. Feel free to contribute something if you find a good solution <g>. -Mike On Nov 18, 2007 9:35 PM, Stuart Sierra [EMAIL PROTECTED] wrote: On Nov 18, 2007 10:50

Re: Finding all possible synonyms for a word

2007-11-18 Thread Eswar K
Kishore, Solr has a SynonymFilterFactory which might be of use to you ( http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46) Regards, Eswar On Nov 18, 2007 10:39 PM, Kishore AVK. Veleti [EMAIL PROTECTED] wrote: Hi All, I am new to

Re: multiple delete by id in one delete command?

2007-11-18 Thread climbingrose
The easiest solution I know is: <delete><query>id:1 OR id:2 OR ...</query></delete> If you know that all of these ids can be found by issuing a query, you can do delete by query: <delete><query>YOUR_DELETE_QUERY_HERE</query></delete> Cheers On Nov 19, 2007 4:18 PM, Norberto Meijome [EMAIL PROTECTED] wrote: Hi
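
A minimal sketch of building the delete-by-query message described in this reply, with its XML tags written out, assuming a Python client and the standard library only. `build_delete_by_ids` is a hypothetical helper name, and the sketch assumes simple ids that need no query-syntax escaping.

```python
import xml.etree.ElementTree as ET


def build_delete_by_ids(ids):
    """Hypothetical helper: one <delete><query> message matching all ids."""
    delete = ET.Element("delete")
    query = ET.SubElement(delete, "query")
    # Join the ids into a single boolean query: id:1 OR id:2 OR ...
    query.text = " OR ".join(f"id:{doc_id}" for doc_id in ids)
    return ET.tostring(delete, encoding="unicode")


message = build_delete_by_ids([1, 2, 3])
print(message)
```

The returned string is the body that would be posted to the update handler; building it with an XML library rather than string concatenation keeps characters such as `<` or `&` in a query properly escaped.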

RE: I18N with SOLR?

2007-11-18 Thread Dilip.TS
Hello, Does SOLR support searching for a keyword which has a combination of more than 1 language within the same search page? -Original Message- From: Guglielmo Celata [mailto:[EMAIL PROTECTED] Sent: Thursday, November 15, 2007 7:39 PM To:

RE: I18N with SOLR?

2007-11-18 Thread Dilip.TS
Hello, Also can we have something like this? i.e. having multiple defaultSearchField entries in the schema.xml while searching for a keyword which has a combination of more than 1 language: <defaultSearchField>text</defaultSearchField>

Re: multiple delete by id in one delete command?

2007-11-18 Thread Norberto Meijome
On Mon, 19 Nov 2007 16:53:17 +1100 climbingrose [EMAIL PROTECTED] wrote: The easiest solution I know is: <delete><query>id:1 OR id:2 OR ...</query></delete> If you know that all of these ids can be found by issuing a query, you can do delete by query: <delete><query>YOUR_DELETE_QUERY_HERE</query></delete>