lucene search options

2008-06-23 Thread Aditi Goyal
Hi All, I am using Lucene to create indexes. One field, email, stores the email ID. I have a few questions about searching: 1. I want to find all the records whose domain is gmail.com. Is there a way to do a partial search on the email field such that I get

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
One way of doing it is to check each email address while parsing and, if it contains gmail.com, also add it to a separate field. Warm Regards, Allahbaksh Allahbaksh Mohammedali Asadullah, Software Engineering Technology Labs, Infosys Technologies Limited, Electronics City, Hosur Road, Bangalore 560 100, India. *Board:

Re: lucene search options

2008-06-23 Thread Aditi Goyal
Thanks, Allahbaksh, but this was just an example. I want to search many more fields like this.

Re: lucene search options

2008-06-23 Thread Daniel Noll
On Monday 23 June 2008 16:21:17 Aditi Goyal wrote:
> I think wildcard (*) cannot be used in the beginning :(
Wrong: http://lucene.apache.org/java/2_3_0/api/core/org/apache/lucene/queryParser/QueryParser.html#setAllowLeadingWildcard(boolean)
Daniel

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
Usually, if you negate a keyword, any document containing it is skipped. So what else do you want to search for? The design of your fields is very important. Warm Regards, Allahbaksh

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
Hi Daniel, You are right. Earlier, leading wildcard characters were not supported, but now they are. Regards, Allahbaksh

Re: lucene search options

2008-06-23 Thread Aditi Goyal
Thanks a lot, Daniel. I will try this option. :) Is there a way to run a "not equal to" query on its own?

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
Hi Aditi, You can search using the NOT operator. See the link below for query syntax details: http://lucene.apache.org/java/docs/queryparsersyntax.html Warm Regards, Allahbaksh

Re: lucene search options

2008-06-23 Thread Aditi Goyal
Thank you, Allahbaksh, for taking such pains. The link says it cannot be used alone. Note: The NOT operator cannot be used with just one term. For example, the following search will return no results: NOT "jakarta apache" Thanks, Aditi

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
Yes, Aditi, you can use it alone. Warm regards, Allahbaksh

RE: lucene search options

2008-06-23 Thread Allahbaksh Mohammedali Asadullah
Hi Aditi, Sorry for the typo. You cannot use it alone. Warm Regards, Allahbaksh

Re: lucene search options

2008-06-23 Thread Aditi Goyal
Oh. For a moment I was elated to hear the news. :( Is there any way around this? Thanks, Aditi

Re: lucene search options

2008-06-23 Thread Aditi Goyal
I am using MultiFieldQueryParser. Can I use setAllowLeadingWildcard with MultiFieldQueryParser? I am doing the following:
    parser = lucene.MultiFieldQueryParser(fields, analyzer)
    parser.setAllowLeadingWildcard(True)
    query = parser.parse(command)
And I am

Re: lucene search options

2008-06-23 Thread rohit saini
Hi Aditi, You can add a separate field for the domain. It will solve your problem. Bye, Rohit, Impetus Technologies, Noida

Re: lucene search options

2008-06-23 Thread saikrishna venkata pendyala
Hi Aditi, As Rohit said, the best way to solve this problem is to index each email ID into two fields, one for the username and another for the domain name. Having two fields and using a BooleanQuery with MUST clauses should solve your problem. You can also perform restricted search with the
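Editor's sketch of the two-field suggestion above, in plain Java with no Lucene dependency. The class name EmailParts and its fields are hypothetical, and lower-casing the domain is an assumption made here so that a later exact-match lookup on the domain field is case-insensitive. How the two values are then added to a document depends on the Lucene version in use.

```java
import java.util.Locale;

/** Hypothetical helper: splits an address into the two values to index separately. */
final class EmailParts {
    final String user;
    final String domain;

    private EmailParts(String user, String domain) {
        this.user = user;
        this.domain = domain;
    }

    /** Splits at the last '@'; the domain is lower-cased (an assumption, see above). */
    static EmailParts parse(String email) {
        int at = email.lastIndexOf('@');
        // Reject input with no '@', an empty user part, or an empty domain part.
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + email);
        }
        return new EmailParts(email.substring(0, at),
                              email.substring(at + 1).toLowerCase(Locale.ROOT));
    }
}
```

With the domain stored in its own untokenized field, the gmail.com search becomes an exact term match on that field rather than a leading-wildcard query on the whole address.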

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread László Monda
On Wed, 2008-06-18 at 21:10 +0200, Daniel Naber wrote: On Wednesday, 18 June 2008, László Monda wrote: Additional info: Lucene seems to do the right thing when only a few documents are present, but goes crazy when there are about 1.5 million documents in the index. Lucene works well with

Re: creating Array of IndexReaders

2008-06-23 Thread saikrishna venkata pendyala
Hi Sebastin, Why don't you index all the data into a single index, or some fixed number of indexes, and search over them? You can always restrict your search using a range query based on user-supplied input. --Sai Krishna.

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread László Monda
Hi Mark, On Wed, 2008-06-18 at 21:09 +0100, markharw00d wrote: This looks like it is related to an issue I first raised here: http://markmail.org/message/37ywsemfudpos6uh At the time I identified 2 issues with FuzzyQuery - that the usual coord and idf scoring factors shouldn't be

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread László Monda
Hi Daniel, On Wed, 2008-06-18 at 20:37 +0200, Daniel Naber wrote: On Wednesday, 18 June 2008, László Monda wrote: Since fuzzy searching is based on the Levenshtein distance, the distance between coldplay and coldplay is 0 and the distance between coldplay and downplay is 3 so how on
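Editor's sketch: the distances quoted above can be checked with a textbook dynamic-programming Levenshtein implementation. This is plain Java written for illustration (the class name EditDistance is hypothetical) and is not Lucene's own code; it only confirms the raw edit distances, not how FuzzyQuery turns them into scores.

```java
/** Textbook Levenshtein distance using two rolling rows. */
final class EditDistance {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Swap rows so prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Here levenshtein("coldplay", "coldplay") is 0 and levenshtein("coldplay", "downplay") is 3 (three substitutions: c to d, l to w, d to n), matching the figures in the message.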

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread mark harwood
I do have serious problems with the relevance of the results with fuzzy queries. Please take the time to read my response here: http://www.gossamer-threads.com/lists/lucene/java-user/62050#62050 I had a work colleague come up with exactly the same problem this week and the solution is

Help unsubscribing

2008-06-23 Thread William Thimbleby
I've tried everything I can think of and I still can't unsubscribe from java-user@lucene.apache.org. None of my unsubscribe requests or emails to java-user-unsubscribe or java-user-help seem to do anything. Any help would be appreciated. thanks -- Will

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread László Monda
Thanks for your reply, Mark. This was my original code for constructing my query using FuzzyQuery:
    BooleanQuery query = new BooleanQuery();
    if (artist.length() > 0) {
        FuzzyQuery artist_query = new FuzzyQuery(new Term("artist", artist));
        query.add(artist_query, BooleanClause.Occur.MUST);
    }

Requesting MultipleIndeces

2008-06-23 Thread Sascha Fahl
Hi, I have around 10 different index files to query. Is it better to do this via one request to one MultiReader, or is it better to query the 10 indices one after another? Especially for doing some filtering on the result with a HitCollector it might be easier to use the multi request

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread mark harwood
Could you tell me what's wrong here, please? There are potentially a number of factors at play here. Your use of FuzzyLikeThis is fine - just tried the code on my single-term Paul query and as I outlined before it is doing a much better job of matching (Paul~= results Paul,Paul,PaulPhul

Re: Indexing and searching txt files

2008-06-23 Thread jnance
Thanks! Lucene in Action is very helpful. -James

Re: Termdocs question

2008-06-23 Thread Vinicius Carvalho
I'm sorry, the problem was with the way the id was being indexed. It was marked as tokenized, so when I searched for its untokenized form I was not getting the doc. Now everything works fine :) Regards

Auto Completion search

2008-06-23 Thread Lukas Öesterreicher
Hello. I am trying to implement a search that takes a search text, runs it against an index containing Track Title, Album Name or Artist Name information, and delivers a list of results suited for auto completion, to make searching easier for the user. This search is very performance critical. The

Re: Auto Completion search

2008-06-23 Thread Daniel Rosher
Hello, Basically I think you need to use NGramFilter. This will be a lot faster than the searches you list, but will make your index much larger too. In Solr this can be achieved with something like:
    <fieldType name="acomplete" class="solr.TextField">
      <analyzer type="index">
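Editor's sketch of why n-gram indexing trades index size for lookup speed. The code below generates prefix ("edge") n-grams, a narrower variant of the n-gram filtering suggested above that is commonly used for auto completion. It is plain Java for illustration only; the class name EdgeNGrams and its parameters are hypothetical and this is not the Lucene or Solr filter itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: the prefixes that would be indexed for one term. */
final class EdgeNGrams {
    /** Returns the prefixes of term with lengths minLen..maxLen (minLen >= 1 assumed). */
    static List<String> of(String term, int minLen, int maxLen) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxLen, term.length());
        for (int len = minLen; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }
}
```

Each indexed term expands into several extra terms, which is where the larger index comes from. In return, a partially typed query such as "col" becomes an exact term lookup instead of a wildcard scan.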

Re: Getting irrelevant results using fuzzy query

2008-06-23 Thread Daniel Naber
On Monday, 23 June 2008, László Monda wrote: According to the current Lucene documentation at http://lucene.apache.org/java/2_3_2/api/index.html it seems to me that the Query class doesn't have any explain() methods. It's in the IndexSearcher and it takes a query and a document number as its

uniqueWords, and termDocs

2008-06-23 Thread Cam Bazz
Hello, I need to be able to select a random word out of all the words in my index. How can I do this through termDocs()? Also, I need to get a list of unique words as well. Is there a way to ask Lucene for this? Best Regards, -C.B.
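Editor's sketch of one way to approach the random-word question, assuming the unique terms can be walked once as a stream of strings (for example from the term enumeration that an index reader exposes). Reservoir sampling picks a uniformly random element in a single pass without knowing the count up front. The class name RandomTerm is hypothetical and the code is plain Java, independent of any Lucene API.

```java
import java.util.Iterator;
import java.util.Random;

/** Single-pass uniform selection (reservoir sampling with a reservoir of one). */
final class RandomTerm {
    /** Returns a uniformly chosen element, or null if the iterator is empty. */
    static String pick(Iterator<String> terms, Random rng) {
        String chosen = null;
        int seen = 0;
        while (terms.hasNext()) {
            String term = terms.next();
            seen++;
            // Replace the current choice with probability 1/seen.
            if (rng.nextInt(seen) == 0) {
                chosen = term;
            }
        }
        return chosen;
    }
}
```

Because an index's term dictionary already holds each term once per field, walking it also yields the list of unique words that the message asks for.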

Re: Token payload attribute partitioning

2008-06-23 Thread Grant Ingersoll
On Jun 21, 2008, at 12:56 PM, Karl Wettin wrote: How do you handle token payload that represent multiple values? I simply don't do it even though there are cases where I would like to see it. I also find that my token filters that update payload feels sort of quick and dirty, that it use

Re: java.io.Ioexception cannot overwrite fdt

2008-06-23 Thread Michael McCandless
Can you describe how you are using Lucene, and provide a full traceback?
Mike
Sebastin wrote:
> Hi All, I am facing this error while indexing text files. Can anyone guide me on how to resolve this issue?

Re: java.io.Ioexception cannot overwrite fdt

2008-06-23 Thread Sebastin
Hi, I am using Lucene to index text-based files:
    File file = new File("C:/index");
    if (!file.exists()) {
        IndexWriter writer = new IndexWriter(file, new StandardAnalyzer(), true);
    }

Re: lucene search options

2008-06-23 Thread Daniel Noll
On Monday 23 June 2008 18:08:29 Aditi Goyal wrote:
> Oh. For one moment I was elated to hear the news. :( Is there any way out?
*:* -"jakarta apache"
Or subclass QueryParser and override the getBooleanQuery() method to do this behind the scenes using MatchAllDocsQuery.
Daniel
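Editor's sketch of the first suggestion above: a purely negative query is made answerable by pairing it with the match-all clause *:*. The helper below only builds the query string; the class name NegativeQueries is hypothetical, and the escaping of backslashes and double quotes inside the phrase is an assumption about what the query parser expects. The resulting string would still need to be passed to a QueryParser.

```java
/** Illustration only: builds a "everything except this phrase" query string. */
final class NegativeQueries {
    /** Returns a query of the form *:* -"phrase". */
    static String excluding(String phrase) {
        // Escape backslashes first, then double quotes, so the phrase stays one quoted unit.
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        return "*:* -\"" + escaped + "\"";
    }
}
```

For example, excluding("jakarta apache") produces the same query text as in the message. The second suggestion, overriding getBooleanQuery(), applies the same idea inside the parser so that users never have to type the *:* clause themselves.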