Re: Tool for analyzing analyzers

2004-06-02 Thread Zilverline
Hi Erik, Thanks for your reply. Have you tried it on a collection yet? I'd love to get some of your feedback. I have limited knowledge of the underlying capabilities of the Lucene library, which is a compliment to you, since it was extremely easy to integrate Lucene. But I'd like to get more

Re: a list of matching search term

2004-06-02 Thread Erik Hatcher
On Jun 1, 2004, at 9:19 PM, Anson Lau wrote: Further to my previous email: The highlighter package should be able to pick up the matching search terms. Can some experienced highlighter package users tell me if I should look down that line? Yes, Highlighter (available in the sandbox) picks out

Range Query Sombody HELP please

2004-06-02 Thread Karthik N S
Hey Ype/Erick Thx in advance for helping me with the Range of Queries. Finally I was able to trace the wrong process within my code and closed them. I still have 3 small Questions. 1) While creating the Range Query, is it possible for Lucene to do something similar.. +(button AND shirt)

Re: optimize() is not merging into single file? !!!!!!

2004-06-02 Thread iouli . golovatyi
I rechecked the results. Here they are: IndexWriter compiled with v.1.4-rc2 generates after optimization _36d.cfs 3779 kb. IndexWriter compiled with v.1.4-rc3 generates after optimization _36d.cfs 3778 kb, _36c.cfs 31 kb, _35z.cfs 14 kb, _35o.cfs 14 kb, etc. In both cases

Re: Range Query Sombody HELP please

2004-06-02 Thread Erik Hatcher
On Jun 2, 2004, at 6:20 AM, Karthik N S wrote: Hey Ype/Erick If you're gonna ask for help, the least ya could do is spell my name correctly :) I still have 3 small Questions. 1) While creating the Range Query, is it possible for Lucene to do something similar.. +(button AND shirt)

RE : optimize() is not merging into single file? !!!!!!

2004-06-02 Thread Rasik Pandey
Hello, I am running a two-week-old version of Lucene from the CVS HEAD and seeing the same behavior. Regards, RBP -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 2, 2004 13:53 To: Lucene Users List Subject: Re: optimize() is not

Re: Tool for analyzing analyzers

2004-06-02 Thread Leo Galambos
Zilverline [EMAIL PROTECTED] wrote: __ get more out of lucene, such as incremental indexing, to name one. On Hello, as far as I know, the incremental indexing could be a real bottleneck if you implemented your system without some knowledge about Lucene internals. The respective

indexing french text with Lucene

2004-06-02 Thread uddam chukmol
Hi all, Lucene is a very powerful tool for English document indexing. I really wonder if it's that powerful to index French text. In fact, I need to compute the similarity between 2 French texts. So, if somebody has already had the experience of indexing French text, your ideas and

Re: similarity of two texts

2004-06-02 Thread Terry Steichen
Erik, Could you expand on this just a wee bit, perhaps with an example of how to compute this vector angle? TIA, Terry - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Tuesday, June 01, 2004 9:39 AM Subject: Re: similarity of two

Can I prevent Sort fields from influencing score?

2004-06-02 Thread Andy Goodell
I have been using the new Lucene 1.4 SortField implementation with some custom fields added to old indexes so that the results can be sorted by them. My problem here is that some of the String fields that I add to the index come up in the search terms, so my results in sort by score order are

Re: similarity of two texts

2004-06-02 Thread David Spencer
Terry Steichen wrote: Erik, Could you expand on this just a wee bit, perhaps with an example of how to compute this vector angle? I'm tempted to write the code to see how it works, but FYI this doc seems to nicely explain the concepts:

Re: similarity of two texts

2004-06-02 Thread Erik Hatcher
On Jun 2, 2004, at 1:39 PM, David Spencer wrote: Erik, Could you expand on this just a wee bit, perhaps with an example of how to compute this vector angle? I'm tempted to write the code to see how it works, but FYI this doc seems to nicely explain the concepts:
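
Editor's sketch of the "vector angle" asked about in this thread. It is not code from any of the posters and does not use Lucene; it assumes each text is reduced to a map of raw term frequencies and that tokens are split on whitespace. The class and method names (CosineSketch, termFrequencies, cosine) are hypothetical. The cosine of the angle between two frequency vectors is their dot product divided by the product of their lengths.

```java
import java.util.HashMap;
import java.util.Map;

final class CosineSketch {
    // Count how often each lower-cased, whitespace-separated token occurs.
    // (Assumption: a real setup would use an analyzer instead of split().)
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // cos(angle) = (a . b) / (|a| * |b|); returns 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Euclidean length of a frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int f : v.values()) {
            sum += (double) f * f;
        }
        return Math.sqrt(sum);
    }
}
```

For example, cosine(termFrequencies("a b"), termFrequencies("a c")) is approximately 0.5, since the two texts share one of their two terms. Texts with no terms in common give 0.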

Re: similarity of two texts - another question

2004-06-02 Thread Gerard Sychay
Hmm, the term vector does not have to consist of only term frequencies, does it? To give weight to rare terms, could you create a term vector of (TF*IDF) values for each term? Then, a distance function would measure how many terms two vectors have in common, giving weight to how many rare terms

Re: similarity of two texts - another question

2004-06-02 Thread David Spencer
Gerard Sychay wrote: Hmm, the term vector does not have to consist of only term frequencies, does it? To give weight to rare terms, could you create a term vector of (TF*IDF) values for each term? Then, a distance function would measure how many terms two vectors have in common, giving weight to
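
Editor's sketch of the TF*IDF weighting suggested above, again in plain Java rather than Lucene. It assumes idf = ln(N / df), which is one common textbook formulation and not necessarily the formula Lucene's own scoring uses. TfIdfSketch and weigh are hypothetical names. The weighted vectors it returns could then be fed into a distance or angle measure in place of raw frequencies.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TfIdfSketch {
    // docs: each document as a term -> raw frequency map.
    // Returns a term -> tf * idf map for the document at index i,
    // with idf = ln(N / df) (an assumed, textbook definition).
    static Map<String, Double> weigh(List<Map<String, Integer>> docs, int i) {
        // Document frequency: in how many documents does each term occur?
        Map<String, Integer> df = new HashMap<>();
        for (Map<String, Integer> doc : docs) {
            for (String term : doc.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }
        int n = docs.size();
        Map<String, Double> weighted = new HashMap<>();
        for (Map.Entry<String, Integer> e : docs.get(i).entrySet()) {
            double idf = Math.log((double) n / df.get(e.getKey()));
            weighted.put(e.getKey(), e.getValue() * idf);
        }
        return weighted;
    }
}
```

With this definition a term that occurs in every document gets weight 0, so only the rarer terms contribute to the comparison, which is the effect the question is after.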

Re: Range Query Sombody HELP please

2004-06-02 Thread Ype Kingma
On Wednesday 02 June 2004 14:46, Erik Hatcher wrote: On Jun 2, 2004, at 6:20 AM, Karthik N S wrote: ... I still have 3 small Questions. 1) While creating the Range Query, is it possible for Lucene to do something similar.. +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
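
Editor's sketch relating to the range in this thread. A range over string terms such as [b10181_p100 TO b10181_p200] is matched by lexicographic term order, not numeric order. The snippet below only illustrates that ordering with plain String.compareTo; it is not Lucene code, and TermRangeSketch and inRange are hypothetical names.

```java
final class TermRangeSketch {
    // True if term lies in [lower, upper] under String.compareTo,
    // i.e. character-by-character (lexicographic) order.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }
}
```

Under this ordering b10181_p150 falls inside the range, but so does b10181_p1000, while b10181_p99 falls outside it. If the numeric part can vary in length, zero-padding it to a fixed width when indexing is a common way to make lexicographic and numeric order agree.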

Re: Can I prevent Sort fields from influencing score?

2004-06-02 Thread Tim Jones
This seems like it would be determined by how you generate your query - if your query doesn't search in the sorted fields, they shouldn't affect the scoring of your documents ... -Original Message- From: Andy Goodell [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 02, 2004 12:22 PM

Re: Can I prevent Sort fields from influencing score?

2004-06-02 Thread Andy Goodell
thanks that was my problem, i had code extending the search out to all the fields, now it only extends the search out to the fields i'm interested in. - andy g On Wed, 2 Jun 2004 14:21:24 -0500 , Tim Jones [EMAIL PROTECTED] wrote: This seems like it would be determined by how you generate

RE: Can I prevent Sort fields from influencing score?

2004-06-02 Thread Gus Kormeier
Just curious, Are you building your query or using a particular Query Parser? which one? Are you using MultiFieldQueryParser? I had problems with MFQP before and was looking for other solutions besides dumping fields into a massive content field. TIA, -Gus -Original Message-

help needed in starting lucene

2004-06-02 Thread milind honrao
Hi, I am just a beginner. I installed Lucene according to the instructions provided. I made all the changes to the environment variables. When I try to run the test program for building indexes using the following command: java org.apache.lucene.demo.IndexFiles test/Doc I am getting the

RE: help needed in starting lucene

2004-06-02 Thread wallen
It sounds to me like you need a newer version of Java. -Original Message- From: milind honrao [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 02, 2004 5:36 PM To: [EMAIL PROTECTED] Subject: help needed in starting lucene Hi, I am just a beginner. I installed lucene according to the

Re: Can I prevent Sort fields from influencing score?

2004-06-02 Thread Andy Goodell
I build the query myself, its really easy, I just use the normal query parser with IndexReader.getFieldNames(true) and loop through all of them to search everything at once. You can either make a really big BooleanQuery or make a bunch of small queries and merge the results, depending on what
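
Editor's sketch of the "loop through the field names" approach described above, restricted to a chosen list of fields so that sort-only fields stay out of the query. It only builds a query string; it assumes the query parser in use accepts the field:(...) grouping form, it does no escaping of special characters, and FieldExpansionSketch and expand are hypothetical names. In a real setup the field list might come from IndexReader.getFieldNames(true), as the post mentions, filtered down to the fields of interest.

```java
import java.util.List;

final class FieldExpansionSketch {
    // Builds "f1:(query) OR f2:(query) ..." for the given fields.
    // Assumption: userQuery is already valid query syntax (no escaping is done here).
    static String expand(String userQuery, List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            if (sb.length() > 0) {
                sb.append(" OR ");
            }
            sb.append(field).append(":(").append(userQuery).append(")");
        }
        return sb.toString();
    }
}
```

The resulting string would then be handed to the normal query parser. Leaving the sort fields out of the list keeps them from matching search terms and so from affecting the score.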

Marten Senkel/IS/EUROPE/SIALEUROPE is out of the office.

2004-06-02 Thread Marten Senkel
I will be out of the office starting 2004-06-02 and will not return until 2004-06-04. Please contact Nicolas Guala-Molino for any request. Thanks!

building custom-stemmer

2004-06-02 Thread Musku, Anil (LA)
Hi, I have a fairly decent idea of using Lucene. I need to use it with some non-European, Indian and CJK languages. There are some languages among these that do not currently have a stemmer (I've looked in Snowball). I was wondering how I could write my own stemmer, say for e.g. for Hindi.

RE: a list of matching search term

2004-06-02 Thread Anson Lau
Thanks Erik I'll give that a try. Anson -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 02, 2004 7:28 PM To: Lucene Users List Subject: Re: a list of matching search term On Jun 1, 2004, at 9:19 PM, Anson Lau wrote: Further to my previous email:

RE: help needed in starting lucene

2004-06-02 Thread Karthik N S
Hey, I think you have a file path problem there. Try giving the full path: java org.apache.lucene.demo.IndexFiles e:/lucene/../test/Doc Also set the classpath for lucene1.3-final.jar or lucene-1.4-rc2.jar before you start indexing. With regards, Karthik -Original Message- From: milind

problems with lucene in multithreaded environment

2004-06-02 Thread Jayant Kumar
We recently tested lucene with an index size of 2 GB which has about 1,500,000 documents, each document having about 25 fields. The frequency of search was about 20 queries per second. This resulted in an average response time of about 20 seconds approx per search. What we observed was that lucene

Re: problems with lucene in multithreaded environment

2004-06-02 Thread Doug Cutting
Jayant Kumar wrote: We recently tested lucene with an index size of 2 GB which has about 1,500,000 documents, each document having about 25 fields. The frequency of search was about 20 queries per second. This resulted in an average response time of about 20 seconds approx per search. That sounds

Range Query Sombody HELP please

2004-06-02 Thread Karthik N S
Hey Ype, the range query +button +shirt +filename:[b10181_p100 TO b10181_p200] did not work for me, but the other way around, +(button OR shirt) +filename:[b10181_p100 TO b10181_p200], gave me 2 hits with either one of the terms button / shirt in each page, but not both of them