Re: Patchs for RussianAnalyzer

2004-03-30 Thread Vladimir Yuryev
Erik, Please look at my second letter, the one without an attachment. It has the texts in the body of the letter. Vladimir. On Mon, 29 Mar 2004 12:06:45 -0500 Erik Hatcher [EMAIL PROTECTED] wrote: Vladimir, I have just taken a look at your submitted patches. I have no objections to making Cp1251 the default charset

Re: Patchs for RussianAnalyzer

2004-03-30 Thread Erik Hatcher
On Mar 30, 2004, at 3:38 AM, Vladimir Yuryev wrote: Erik, Please look at my second letter, the one without an attachment. It has the texts in the body of the letter. Vladimir. I don't have the e-mail you refer to. Please use the standard Jakarta Bugzilla issue tracking system, though. You can place an attachment to

Re: Re: UNIX command-line indexing script?

2004-03-30 Thread Linto Joseph Mathew
Charlie, I wrote this in Java. Of course I am ready to share. But I have some problems when indexing large volumes of data. I am still testing. Linto On Fri, 26 Mar 2004 Charlie Smith wrote: So, Linto, did you write this in Perl or Java? Would you be willing to part with a copy of the source?

Re: Patchs for RussianAnalyzer

2004-03-30 Thread Vladimir Yuryev
Erik, I filed bug #28050. Vladimir On Tue, 30 Mar 2004 06:19:04 -0500 Erik Hatcher [EMAIL PROTECTED] wrote: On Mar 30, 2004, at 3:38 AM, Vladimir Yuryev wrote: Erik, Please look at my second letter, the one without an attachment. It has the texts in the body of the letter. Vladimir. I don't have the e-mail you refer

Re: Lucene optimization with one large index and numerous small indexes.

2004-03-30 Thread Doug Cutting
Esmond Pitt wrote: Don't want to start a buffer size war, but these have always seemed too small to me. I'd recommend upping both InputStream and OutputStream buffer sizes to at least 4k, as this is the cluster size on most disks these days, and also a common VM page size. Okay. Reading and

Re: too many files open error

2004-03-30 Thread Charlie Smith
Thanks for the information. I downloaded 1.3-rc2 and put an IndexReader.close() at the end of the search routine. This seems to have cleared up the problems. Also changed the demo source code for results.jsp to return a reference to the IndexReader so that it could be closed at the end of the search. I.e.
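
The fix described above (closing the reader at the end of every search) can be sketched as follows. This is a minimal illustration, not code from the thread: SearchSketch, Search and searchAndClose are hypothetical names, and any java.io.Closeable stands in for the real reader. The point is only that the close call sits in a finally block, so the file handles are released even when the search throws.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper, not part of Lucene: runs a search and always
// closes the reader afterwards, so file handles are not leaked.
final class SearchSketch {

    // Hypothetical callback type standing in for "the search routine".
    interface Search<R, T> {
        T run(R reader) throws IOException;
    }

    static <R extends Closeable, T> T searchAndClose(R reader, Search<R, T> search)
            throws IOException {
        try {
            return search.run(reader);
        } finally {
            // Runs on both the normal and the exceptional path.
            reader.close();
        }
    }
}
```

A caller would pass the reader and the search logic together, for example SearchSketch.searchAndClose(reader, r -> runQuery(r)), where runQuery is whatever the page already does.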

Re: Lucene 1.4 - lobby for final release

2004-03-30 Thread Charlie Smith
That is your opinion, of course, on the issue of too many open files not being a bug. I found it to be otherwise. Thanks for the info on popular elections. Being a newbie to this list, I am finding most others on the list a bit more pleasant. But then, you are not up for a popular election, are

The Filter got called more than one time

2004-03-30 Thread Ching-Pei Hsing
Hi, We implemented a Filter that performs filtering based on some internal pricing logic. While testing we discovered that this filter got called several times, not like the FAQ says, exactly one time. And the number of calls made was based on how big the result set was. I printed out the calling

Re: The Filter got called more than one time

2004-03-30 Thread Erik Hatcher
Use a caching mechanism for your filter, so the bitset is not regenerated. CachingWrapperFilter is your friend :) Erik On Mar 30, 2004, at 2:28 PM, Ching-Pei Hsing wrote: Hi, We implemented a Filter that performs filtering based on some internal pricing logic. While testing we discovered
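
The caching idea in the reply above can be sketched without any Lucene classes. This is an illustration under stated assumptions, not the library's implementation: BitsFilter and CachingBitsFilter are hypothetical names, and the reader is any object used as a cache key. The wrapper computes the bit set once per reader and hands back the stored result on later calls, so the expensive filtering logic is not repeated.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a filter that computes one bit set per reader.
interface BitsFilter<R> {
    BitSet bits(R reader);
}

// Hypothetical caching wrapper: delegates once per reader, then reuses the result.
final class CachingBitsFilter<R> implements BitsFilter<R> {
    private final BitsFilter<R> delegate;
    // Weak keys let an entry be dropped once its reader is no longer referenced.
    private final Map<R, BitSet> cache = new WeakHashMap<>();

    CachingBitsFilter(BitsFilter<R> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized BitSet bits(R reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = delegate.bits(reader); // the expensive call happens only here
            cache.put(reader, cached);
        }
        // The same BitSet instance is returned each time; callers must not modify it.
        return cached;
    }
}
```

The cache is only valid while the underlying data is unchanged; in this sketch a new reader object means a fresh computation.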

[patch] MultiSearcher should support getSearchables()

2004-03-30 Thread Kevin A. Burton
Seems to only make sense to allow a caller to find the searchables a MultiSearcher was created with: 'diff' -uN MultiSearcher.java.bak MultiSearcher.java --- MultiSearcher.java.bak 2004-03-30 14:57:41.660109642 -0800 +++ MultiSearcher.java 2004-03-30 14:57:46.530330183 -0800 @@ -208,4

Near performance question

2004-03-30 Thread Joe Paulsen
Based on the nature of our documents, we sometimes experience extremely long response times when executing NEAR operations against a document (sometimes well over minutes - even though the operation is restricted to a single document). Our analysis of the code indicates (we think): It looks up

Performance of hit highlighting and finding term positions for a specific document

2004-03-30 Thread Kevin A. Burton
I'm playing with this package: http://home.clara.net/markharwood/lucene/highlight.htm Trying to do hit highlighting. This implementation uses another Analyzer to find the positions for the result terms. This seems very inefficient since Lucene already knows the frequency and

Re: [patch] MultiSearcher should support getSearchables()

2004-03-30 Thread Erik Hatcher
On Mar 30, 2004, at 5:59 PM, Kevin A. Burton wrote: Seems to only make sense to allow a caller to find the searchables a MultiSearcher was created with: Could you elaborate on why it makes sense? What if the caller changed a Searchable in the array? Would anything bad happen? (I don't know,
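
The concern raised above (a caller changing an element of the returned array) is commonly handled with a defensive copy. The sketch below is an assumption-laden illustration, not the submitted patch: SearchableHolder is a hypothetical class and plain strings stand in for the searchables. The getter returns a clone, so writes to the returned array never reach the internal one.

```java
// Hypothetical holder, not the actual MultiSearcher: shows a getter that
// returns a copy of its internal array instead of the array itself.
final class SearchableHolder {
    private final String[] searchables;

    SearchableHolder(String[] searchables) {
        // Copy on the way in, so later changes by the creator are not seen.
        this.searchables = searchables.clone();
    }

    String[] getSearchables() {
        // Copy on the way out, so callers cannot replace internal elements.
        return searchables.clone();
    }
}
```

Cloning protects only the array slots; if the elements themselves are mutable objects, a caller can still change their state through the copy.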

Re: Performance of hit highlighting and finding term positions for a specific document

2004-03-30 Thread Erik Hatcher
On Mar 30, 2004, at 7:56 PM, Kevin A. Burton wrote: Trying to do hit highlighting. This implementation uses another Analyzer to find the positions for the result terms. This seems very inefficient since Lucene already knows the frequency and position of given terms in the index. What

Re: Performance of hit highlighting and finding term positions for a specific document

2004-03-30 Thread Stephane James Vaucher
I agree with you that a highlight package should be available directly from the lucene website. To offer this much-desired feature, having a dependency on a personal web site seems a little weird to me. It would also force the community to support this functionality, which would seem

Re: Performance of hit highlighting and finding term positions for a specific document

2004-03-30 Thread Kevin A. Burton
Erik Hatcher wrote: On Mar 30, 2004, at 7:56 PM, Kevin A. Burton wrote: Trying to do hit highlighting. This implementation uses another Analyzer to find the positions for the result terms. This seems very inefficient since Lucene already knows the frequency and position of given

Re: [patch] MultiSearcher should support getSearchables()

2004-03-30 Thread Kevin A. Burton
Erik Hatcher wrote: On Mar 30, 2004, at 5:59 PM, Kevin A. Burton wrote: Seems to only make sense to allow a caller to find the searchables a MultiSearcher was created with: Could you elaborate on why it makes sense? What if the caller changed a Searchable in the array? Would anything bad

Re: Performance of hit highlighting and finding term positions for a specific document

2004-03-30 Thread Bruce Ritchie
Kevin A. Burton wrote: I'm playing with this package: http://home.clara.net/markharwood/lucene/highlight.htm Trying to do hit highlighting. This implementation uses another Analyzer to find the positions for the result terms. This seems very inefficient since Lucene already knows

Re: [patch] MultiSearcher should support getSearchables()

2004-03-30 Thread Erik Hatcher
On Mar 30, 2004, at 8:52 PM, Kevin A. Burton wrote: Erik Hatcher wrote: On Mar 30, 2004, at 5:59 PM, Kevin A. Burton wrote: Seems to only make sense to allow a caller to find the searchables a MultiSearcher was created with: Could you elaborate on why it makes sense? What if the caller