Re: Uneffective writeBytes and readBytes [FIX]

2005-09-21 Thread Lukas Zapletal
On Thu, 08 Sep 2005 14:17:57 -0700, Doug Cutting <[EMAIL PROTECTED]> wrote: > This forces a flush() each time a byte array of any size is written. Oh, I didn't scroll down the message; I only read this today. I see, I thought Lucene used writeByte instead. I will reimplement the patch, sure.
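The concern quoted above (a flush() on every writeBytes() call) can be illustrated with a minimal sketch. This is not the LUCENE-435 patch and not Lucene's BufferedIndexOutput; the class BufferedOutputSketch, its in-memory sink, and the flush counter are hypothetical names introduced only for illustration. It assumes a positive buffer size and copies into the buffer in chunks, flushing only when the buffer is actually full.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not Lucene's BufferedIndexOutput.
class BufferedOutputSketch {
    private final byte[] buffer;
    private int position = 0;
    private int flushCount = 0;
    // Stand-in for the underlying file; an assumption made for this sketch.
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    BufferedOutputSketch(int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        buffer = new byte[bufferSize];
    }

    void writeByte(byte b) {
        if (position == buffer.length) {
            flush();
        }
        buffer[position++] = b;
    }

    // Copies the array into the buffer chunk by chunk and flushes only
    // when the buffer is full, rather than once per call.
    void writeBytes(byte[] src, int length) {
        int offset = 0;
        while (offset < length) {
            if (position == buffer.length) {
                flush();
            }
            int chunk = Math.min(length - offset, buffer.length - position);
            System.arraycopy(src, offset, buffer, position, chunk);
            position += chunk;
            offset += chunk;
        }
    }

    void flush() {
        sink.write(buffer, 0, position);
        position = 0;
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }

    // Flushes whatever is still buffered and returns everything written.
    byte[] toByteArray() {
        flush();
        return sink.toByteArray();
    }
}
```

With an 8-byte buffer, three 3-byte writes trigger a single flush in this sketch instead of three.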

[jira] Commented: (LUCENE-435) [PATCH] BufferedIndexOutput - optimized writeBytes() method

2005-09-21 Thread Lukas Zapletal (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-435?page=comments#action_12330074 ] Lukas Zapletal commented on LUCENE-435: --- Will fix this. > [PATCH] BufferedIndexOutput - optimized writeBytes() method > -

Re: Lucene JIRA e-mail notifications

2005-09-21 Thread Erik Hatcher
On Sep 20, 2005, at 4:09 AM, Lukas Zapletal wrote: On Sun, 18 Sep 2005 05:56:24 -0400, Erik Hatcher <[EMAIL PROTECTED]> wrote: Lucene was recently converted from Bugzilla to JIRA. I'm unfamiliar with JIRA's administrative options, but I see that I have administrative capability on our i

[jira] Created: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2005-09-21 Thread kieran (JIRA)
[PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception Key: LUCENE-436 URL: http://issues.apache.org/jira/browse/LUCENE-436 Project: Lucene - Java Type: Improvement Components: Index V

searching problem

2005-09-21 Thread haipeng du
How can I search Lucene for documents that contain words with a specified suffix? For example, how do I get all documents whose "filename" ends with ".pdf"? Thanks a lot. -- Haipeng Du Software Engineer Comphealth, Salt Lake City
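One general technique for suffix matching, offered here as a sketch and not as an answer given in this thread, is to store each value reversed so that a suffix lookup becomes a prefix lookup over sorted terms. The example below uses only the Java standard library; SuffixSearchSketch and its methods are hypothetical names, and a TreeMap stands in for a sorted term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; uses no Lucene classes.
class SuffixSearchSketch {
    // Maps the reversed filename to the original filename.
    private final TreeMap<String, String> reversedIndex = new TreeMap<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void add(String filename) {
        reversedIndex.put(reverse(filename), filename);
    }

    // A suffix of the original is a prefix of the reversed form, so the
    // matches form one contiguous key range in the sorted map.
    List<String> endingWith(String suffix) {
        String prefix = reverse(suffix);
        // Upper bound: the prefix followed by the largest char value.
        // Keys containing '\uffff' right after the prefix are not covered.
        SortedMap<String, String> range =
                reversedIndex.subMap(prefix, prefix + Character.MAX_VALUE);
        return new ArrayList<>(range.values());
    }
}
```

After adding "report.pdf", "notes.txt" and "a.pdf", endingWith(".pdf") returns the two PDF names without scanning every entry.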

Re: Lucene and UTF-8

2005-09-21 Thread Marvin Humphrey
On Sep 20, 2005, at 11:53 PM, Chris Lamprecht wrote: import java.util.Arrays; ... Arrays.equals(array1, array2); Great, thank you, Chris. The patch for IndexOutput.java is done. It will now write valid UTF-8. Older versions of Lucene will not be able to read indexes written using this

Re: Lucene and UTF-8

2005-09-21 Thread Yonik Seeley
How does this patch work w.r.t. the length vint? It looks like the length is still the number of 16 bit java chars, but the encoding is now correct UTF-8? -Yonik Now hiring -- http://tinyurl.com/7m67g On 9/21/05, Marvin Humphrey <[EMAIL PROTECTED]> wrote: > > On Sep 20, 2005, at 11:53 PM, Chris

Re: Lucene and UTF-8

2005-09-21 Thread Marvin Humphrey
On Sep 21, 2005, at 12:25 PM, Yonik Seeley wrote: How does this patch work w.r.t. the length vint? It looks like the length is still the number of 16 bit java chars, but the encoding is now correct UTF-8? Yes. As Ken Krugler pointed out to me, the issues can be separated. The length VInt
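The distinction discussed above, between a length counted in 16-bit Java chars and the number of bytes produced by UTF-8 encoding, can be seen with the standard library alone. This sketch does not reproduce the IndexOutput patch; Utf8LengthSketch and its helper methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the IndexOutput patch.
class Utf8LengthSketch {
    // Number of 16-bit Java chars (UTF-16 code units).
    static int javaChars(String s) {
        return s.length();
    }

    // Number of bytes when the string is encoded as standard UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // ASCII text, U+00E9, U+20AC, and U+1D11E (a surrogate pair in Java).
        String[] samples = { "abc", "\u00e9", "\u20ac", "\uD834\uDD1E" };
        for (String s : samples) {
            System.out.println(javaChars(s) + " chars, "
                    + utf8Bytes(s) + " UTF-8 bytes, "
                    + codePoints(s) + " code points");
        }
    }
}
```

The three counts agree only for ASCII text, so a reader must know which unit a stored length refers to before it can decode the bytes that follow.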

[jira] Created: (LUCENE-437) SnowballFilter loses token position offset

2005-09-21 Thread Yonik Seeley (JIRA)
SnowballFilter loses token position offset -- Key: LUCENE-437 URL: http://issues.apache.org/jira/browse/LUCENE-437 Project: Lucene - Java Type: Bug Components: Analysis Versions: CVS Nightly - Specify date in submission

[jira] Updated: (LUCENE-437) SnowballFilter loses token position offset

2005-09-21 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-437?page=all ] Yonik Seeley updated LUCENE-437: Attachment: yonik_snowballfix.txt "svn diff" for the testcase and patch. Q: for future patches, should the diffs be in one file or multiple files? > Snowball

incorrect score normalization in hits

2005-09-21 Thread Yonik Seeley
Hits does normalization based on the score of the first document "scoreDocs[0].score" This is a problem if sort is on anything other than score, since the first document won't necessarily be the highest scoring. I propose fixing this by adding a field to TopDocs called "maxScore", and using that i
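The problem described above can be shown with a simplified model that uses no Lucene classes. It assumes normalization is a plain division by a reference score, which may differ from what Hits does in detail; ScoreNormalizationSketch and its methods are hypothetical names.

```java
// Hypothetical illustration only; a simplified model, not Lucene's Hits.
class ScoreNormalizationSketch {
    // Divides every score by the first one. This is only safe when the
    // results are sorted by descending score.
    static float[] normalizeByFirst(float[] scores) {
        float[] out = new float[scores.length];
        if (scores.length == 0 || scores[0] <= 0f) {
            return out;
        }
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] / scores[0];
        }
        return out;
    }

    static float maxScore(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        return max;
    }

    // Divides every score by the maximum, so no result exceeds 1.0
    // regardless of the sort order.
    static float[] normalizeByMax(float[] scores) {
        float[] out = new float[scores.length];
        float max = maxScore(scores);
        if (max <= 0f) {
            return out;
        }
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] / max;
        }
        return out;
    }
}
```

For results sorted by something other than score, say raw scores {0.5, 2.0, 1.0}, dividing by the first score yields values above 1.0, while dividing by the maximum keeps every value in the range 0 to 1.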

[jira] Commented: (LUCENE-432) Make FieldSortedHitQueue public

2005-09-21 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-432?page=comments#action_12330155 ] Yonik Seeley commented on LUCENE-432: - I'd like to see this patch applied. Right now I have to work around this by having my own org.apache.lucene.search package with the

[jira] Assigned: (LUCENE-437) SnowballFilter loses token position offset

2005-09-21 Thread Erik Hatcher (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-437?page=all ] Erik Hatcher reassigned LUCENE-437: --- Assign To: Erik Hatcher > SnowballFilter loses token position offset > -- > > Key: LUCENE-437 >

UTF-8 and unit test failure for org.apache.analysis.ru.RussianStem in build with Kaffe

2005-09-21 Thread Barry Hawkins
Guys, Hello, it's those pesky Debian Lucene package maintainers again :-). Lucene currently builds and passes all but one unit test against Kaffe[0] 1.1.6. In debugging the failure of the unit test for org.apache.analysis.ru.RussianStem, I enable

Can any one help

2005-09-21 Thread santosh
Hi, Below is the problem I am facing when I am trying to get the index file from my production box to local. I got this Exception: org.apache.lucene.search.BooleanQuery$TooManyClauses at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:78) This index file contains near