RE: frequent terms - Re: combining open office spellchecker with Lucene

2004-09-16 Thread Aad Nales
Also, you can use an alternative spellchecker for the 'checking part' and use the Ngram algorithm for the 'suggestion' part. Only if the spell 'check' declares a word illegal would the 'suggestion' part perform its magic. cheers, Aad Doug Cutting wrote: David Spencer wrote: [1] The
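The check-then-suggest split described in this message can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name CheckThenSuggest, the Set-based dictionary, and the pluggable suggester function are all hypothetical and do not come from the thread or from any spellchecker's API.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical gate: consult the suggester only when the checker rejects a word. */
final class CheckThenSuggest {
    // Stand-in for the 'checking part' (assumed to hold lowercase legal words).
    private final Set<String> dictionary;
    // Stand-in for the 'suggestion part', e.g. an ngram-based lookup.
    private final Function<String, List<String>> suggester;

    CheckThenSuggest(Set<String> dictionary, Function<String, List<String>> suggester) {
        this.dictionary = dictionary;
        this.suggester = suggester;
    }

    /** Returns no suggestions for a legal word; otherwise delegates to the suggester. */
    List<String> suggestionsFor(String word) {
        if (dictionary.contains(word.toLowerCase(Locale.ROOT))) {
            return Collections.emptyList(); // word is legal, skip the suggestion step
        }
        return suggester.apply(word);
    }
}
```

The point of the split is that the (possibly expensive) suggestion lookup never runs for words the checker already accepts.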

Re: Lucene docs

2004-09-16 Thread Daniel Ménard
I also found these two articles http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html from Otis Gospodnetic very useful for me. DM - Original Message - From: Ian McDonnell [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent:

problem with locks when updating the data of a previous stored document

2004-09-16 Thread Paul Williams
Hi, Using lucene-1.4.1.jar on WinXP I am having trouble with locking and updating an existing Lucene document. I delete the old document from the index and then add the new document to the index writer. I am using the minMerge docs set to 100 (much quicker!!) and close the writer once the

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-16 Thread Morus Walter
Hi David, Based on this mail I wrote a ngram speller for Lucene. It runs in 2 phases. First you build a fast lookup index as mentioned above. Then to correct a word you do a query in this index based on the ngrams in the misspelled word. Let's see. [1] Source is attached and I'd
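The two phases mentioned here (build a lookup index keyed by ngrams, then query it with the ngrams of the misspelled word) can be illustrated with a small in-memory sketch. It is not the attached NGramSpeller source and does not use Lucene at all: the class NgramSuggester, the HashMap standing in for the index, and the ranking by count of shared ngrams are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for an ngram lookup index. */
final class NgramSuggester {
    private final int n;
    // Phase 1 data: ngram -> known words that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    NgramSuggester(int n) {
        this.n = n;
    }

    /** Splits a word into overlapping character ngrams of length n. */
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    /** Phase 1: add a known-good word to the lookup index. */
    void add(String word) {
        for (String gram : ngrams(word, n)) {
            index.computeIfAbsent(gram, k -> new LinkedHashSet<>()).add(word);
        }
    }

    /** Phase 2: rank indexed words by how many distinct ngrams they share with the input. */
    List<String> suggest(String misspelled, int max) {
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : new LinkedHashSet<>(ngrams(misspelled, n))) {
            for (String candidate : index.getOrDefault(gram, Collections.emptySet())) {
                shared.merge(candidate, 1, Integer::sum);
            }
        }
        List<String> ranked = new ArrayList<>(shared.keySet());
        // Most shared ngrams first; ties broken alphabetically.
        ranked.sort((a, b) -> {
            int byCount = shared.get(b) - shared.get(a);
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(max, ranked.size())));
    }
}
```

With bigrams (n = 2) and the words "lucene", "line" and "search" indexed, the input "lucine" shares three bigrams with "lucene" and two with "line", so "lucene" ranks first. A real implementation would presumably store the ngrams as fields in a Lucene index and let query scoring do the ranking.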

Re: Term highlighting and Term vector patch

2004-09-16 Thread Terry Steichen
Christoph, Just curious - how are you currently using Term Vectors? They seem to be a neat feature with lots of future promise, but I'm not sure how to best use them now. Regards, Terry - Original Message - From: Christoph Goller To: Lucene Developers List Sent: Thursday,

RE: problem with locks when updating the data of a previous stored document

2004-09-16 Thread Crump, Michael
You have to close the IndexReader after doing the delete, before opening the IndexWriter for the addition. See information at this link: http://wiki.apache.org/jakarta-lucene/UpdatingAnIndex Regards, Michael -Original Message- From: Paul Williams [mailto:[EMAIL PROTECTED] Sent:

SortField[] in search method (newbie)

2004-09-16 Thread Wermus Fernando
Why can I search fields that aren't indexed without any problem, but if I ask for ordering by one of these fields, I get a runtime exception? I can't understand this behavior. I mean, if I don't have any document indexed with these fields, why can I search by these fields and I can't ask for

SortField[] solved. Warning

2004-09-16 Thread Wermus Fernando
Luceners, I can order without having any document indexed. The problem was that I created an instance of Sort(String name) instead of Sort(String name, int type). According to the API, Sort(String name) will automatically detect the type of the field. But it couldn't detect a field type

IndexReader.close() semantics and optimize -- Re: problem with locks when updating the data of a previous stored document

2004-09-16 Thread David Spencer
Crump, Michael wrote: You have to close the IndexReader after doing the delete, before opening the IndexWriter for the addition. See information at this link: http://wiki.apache.org/jakarta-lucene/UpdatingAnIndex Recently I thought I observed that if I use this batch update idiom (1st delete

RE: QueryParser.parse() and Lucene1.4.1

2004-09-16 Thread Polina Litvak
Hi Daniel, I just downloaded the latest version of Lucene and tried the whole thing again: I ran my code first with lucene-1.3-final.jar, getting the query Field:(A AND -(B)) parsed into +Field:A -Field:B, and then I ran exactly the same code with lucene-1.4.1.jar and got the output parsed into

Re: LUCENE + PHP ???

2004-09-16 Thread Erik Hatcher
On Sep 15, 2004, at 1:45 PM, Karthik N S wrote: Hi Erik, Doug, Otis This is a general forum - no need to address individuals. 1) Is there a PHP version of Lucene implementation available? If so, where? Using the Java version of Lucene from PHP is my recommendation. There is not a PHP

Re: QueryParser.parse() and Lucene1.4.1

2004-09-16 Thread Daniel Naber
On Thursday 16 September 2004 19:38, Polina Litvak wrote: I ran my code first with lucene-1.3-final.jar, getting the query Field:(A AND -(B)) parsed into +Field:A -Field:B This code: Query query = QueryParser.parse("Field:(AAA AND -(BBB))", field, new StandardAnalyzer());

Re: NGramSpeller contribution -- Re: combining open office spellchecker with Lucene

2004-09-16 Thread David Spencer
Morus Walter wrote: Hi David, Based on this mail I wrote a ngram speller for Lucene. It runs in 2 phases. First you build a fast lookup index as mentioned above. Then to correct a word you do a query in this index based on the ngrams in the misspelled word. Let's see. [1] Source is attached

Re: PHP and Lucene

2004-09-16 Thread Erik Hatcher
On Sep 15, 2004, at 1:45 PM, Karthik N S wrote: Hi Erik, Doug, Otis This is a general forum - no need to address individuals. 1) Is there a PHP version of Lucene implementation available? If so, where? Using the Java version of Lucene from PHP is my recommendation. There is not a PHP

Re: Running OutOfMemory while optimizing and searching

2004-09-16 Thread John Z
Hi We are trying to get the memory footprint on our searchers. We have indexes of around 1 million docs and around 25 searchable fields. We noticed that without any searches performed on the indexes, on startup, the memory taken up by the searcher is roughly 7 times the .tii file size. The

FAO Otis Gospodnetic

2004-09-16 Thread Ian McDonnell
Were there any more articles like this written? I found them extremely useful for getting an in-depth understanding of how Lucene's indexing system worked. Is there a corresponding set for Lucene's search functionality? The recommendations from other Lucene users were useful, but didn't go to the

Re: FAO Otis Gospodnetic

2004-09-16 Thread Otis Gospodnetic
Ian, There are links to a few more articles on Lucene Wiki pages. They don't go into a lot of detail, as far as I know, and mostly deal with the basic search functionality. However, a book about Lucene is going to be published soon, and it will have a lot of detailed information about search

Re: Concurrent operations with Lucene

2004-09-16 Thread Otis Gospodnetic
Nutch is a robust, multi-threaded Java web crawler and a (distributed) search engine. Nutch uses Lucene to index web pages and search the resulting indices. Doug Cutting is the padre of both Nutch and Lucene. Otis --- Terry Steichen [EMAIL PROTECTED] wrote: Otis, What's the relationship

Nutch vs Lucene

2004-09-16 Thread Terry Steichen
So Nutch uses (but doesn't enhance) Lucene? Or, does it enhance Lucene in its ability to operate in a distributed fashion? Regards, Terry PS: I'm aware of Doug's involvement in both - which is partly why I'm puzzled. - Original Message - From: Otis Gospodnetic To: Lucene

Re: Nutch vs Lucene

2004-09-16 Thread Otis Gospodnetic
Nutch is built on top of and around Lucene. It lets you run a query against a cluster of 'search servers', for instance, using custom code that you can see in Nutch's CVS. Otis --- Terry Steichen [EMAIL PROTECTED] wrote: So Nutch uses (but doesn't enhance) Lucene? Or, does it enhance Lucene in