Re: Lucene introduction in Chinese

2002-09-12 Thread Che Dong
Otis Gospodnetic wrote: I think we should add this to the contribution page or some other place on the Lucene site (I'll take a look in a bit). I would like to just add a link to it. I think we should add this directly to the Lucene site. Lucene strives to be an

DO NOT REPLY [Bug 12569] New: - Uppercase/lowercase distinction in GermanStemmer not sustainable

2002-09-12 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12569. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

Re: Query Rewriting

2002-09-12 Thread Clemens Marschner
Please submit diffs. Yeah, I'll do that on the weekend, have to get the latest from CVS (no fast Internet connection today). My IDE converts tabs to spaces when saving. Severe? @@ -151,7 +151,7 @@ Term term = enum.term(); if (term != null && term.field()

Re: fixed url and How to contribute code to lucene sandbox?

2002-09-12 Thread Alex Murzaku
I don't know any Asian languages, but from earlier experiments I remember that bigram tokenization can sometimes hurt matching, e.g.: w1w2w3 == tokenized as == w1w2 w2w3 (or even _w1 w1w2 w2w3 w3_) would miss a search for w2. w1 w2 w3 would work better. --- Doug Cutting [EMAIL PROTECTED]
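
The matching problem described in this message can be sketched in a few lines. This is only an illustration under stated assumptions: it uses plain Python rather than Lucene's analyzer API, the helper names (bigrams, unigrams, build_index, search) are hypothetical, and the letters A, B, C stand in for the characters w1, w2, w3.

```python
def bigrams(text):
    """Overlapping character bigrams, e.g. 'ABC' -> ['AB', 'BC']."""
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def unigrams(text):
    """One token per character, e.g. 'ABC' -> ['A', 'B', 'C']."""
    return list(text)


def build_index(docs, tokenize):
    """Toy inverted index: token -> set of document ids (hypothetical helper)."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, term):
    """Exact single-term lookup; returns a sorted list of matching ids."""
    return sorted(index.get(term, set()))


# 'A', 'B', 'C' stand in for w1, w2, w3 from the message above.
docs = {1: "ABC"}
bigram_index = build_index(docs, bigrams)
unigram_index = build_index(docs, unigrams)

print(search(bigram_index, "B"))   # single character is not a bigram token
print(search(unigram_index, "B"))  # single character is indexed directly
```

With exact term lookup, the bigram index only contains AB and BC, so a query for the single character B finds nothing, while the per-character index finds the document. A real system could work around this in other ways (for example by tokenizing the query the same way as the text), which this sketch does not attempt.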

Re: Query Rewriting

2002-09-12 Thread Otis Gospodnetic
Good IDE :) Spaces are healthier. We should standardize and require that, actually. Otis --- Clemens Marschner [EMAIL PROTECTED] wrote: Please submit diffs. Yeah, I'll do that on the weekend, have to get the latest from CVS (no fast Internet connection today). My IDE converts tabs to

Fwd: Possible Bug with MultiSearcher?

2002-09-12 Thread Rasik Pandey
Developers, Ok this is the latest test program to reproduce the original problem reported. When I submitted the first test program, I was unable to reproduce the original, however this new test program reproduces both the original error and the second error which I reported in my last mail. I

Re: Nullpointer in code

2002-09-12 Thread Rasik Pandey
Otis, Here is the test program that will generate the null pointer and the requested diff. Hope they are helpful. Thanks, Rasik Pandey -----Original Message----- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]] Sent: Thursday, 12 September 2002 18:23 To: Lucene Developers List Cc: [EMAIL

DO NOT REPLY [Bug 12588] New: - Delete failed after new Term is indexed

2002-09-12 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12588. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.

DO NOT REPLY [Bug 12588] - Delete failed after new Term is indexed

2002-09-12 Thread bugzilla

about bigram based word segment

2002-09-12 Thread Che Dong
I don't know any Asian languages, but from earlier experiments I remember that bigram tokenization can sometimes hurt matching, e.g.: w1w2w3 == tokenized as == w1w2 w2w3 (or even _w1 w1w2 w2w3 w3_) would miss a search for w2. w1 w2 w3 would work better. if Chinese segment

Re: Lucene introduction in Chinese

2002-09-12 Thread Otis Gospodnetic
--- Doug Cutting [EMAIL PROTECTED] wrote: Otis Gospodnetic wrote: I think we should add this to the contribution page or some other place on the Lucene site (I'll take a look in a bit). I would like to just add a link to it. I think we should add this directly to the Lucene site.