Thank you all for your replies, and sorry for my late response. You can
see that I posted my message twice: I didn't see it in the list after
posting the first time and assumed it hadn't gone through, so I didn't
check my Gmail inbox these past few days. I have figured out the
problem: the index was a mixture of two index formats, compound files
(.cfs) and multi-file segments (.tis, .tii, and so on).
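
For anyone who hits the same mixture: whether a segment is written as a
single compound .cfs file or as separate .tis/.tii/etc. files is
controlled on the writer. A minimal sketch against the Lucene 1.9/2.0
API (the index path is a placeholder):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class RebuildAsCompound {
        public static void main(String[] args) throws Exception {
            // true = create a fresh index at this placeholder path
            IndexWriter writer =
                new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
            writer.setUseCompoundFile(true); // each segment becomes one .cfs
            // ... add the documents here ...
            writer.optimize(); // merge everything into a single segment
            writer.close();
        }
    }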

 >I'll second Otis' request about the special segmentation API.  If it
is open source, I'd love to tinker with it.  Chinese is not too hard.  :)

The API is open source, so I can send it to you, and I'm glad you
understand Chinese. How should I deliver it? The API includes a Chinese
lexicon that is nearly 10 MB in size, so perhaps I can email it to you.
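
In case it helps others, this is roughly how a segmenter plugs into the
analysis API of Lucene 1.4 through 2.0. The one-character-per-token
rule below is only a placeholder for the real segmentation library:

    import java.io.IOException;
    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    public class SegmentingAnalyzer extends Analyzer {
        public TokenStream tokenStream(String fieldName, final Reader reader) {
            return new TokenStream() {
                private int offset = 0;
                // A real implementation would call the segmentation API
                // here; emitting single characters is just a stand-in.
                public Token next() throws IOException {
                    int c;
                    while ((c = reader.read()) != -1) {
                        offset++;
                        if (!Character.isWhitespace((char) c)) {
                            return new Token(String.valueOf((char) c),
                                             offset - 1, offset);
                        }
                    }
                    return null; // end of input
                }
            };
        }
    }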



2006/5/30, Erik Hatcher <[EMAIL PROTECTED]>:


On May 29, 2006, at 6:34 AM, hu andy wrote:
> I indexed a collection of Chinese documents. I use a special
> segmentation api to do the analysis, because the segmentation of
> Chinese is different from English.

I'll second Otis' request about the special segmentation API.  If it
is open source, I'd love to tinker with it.  Chinese is not too hard.  :)

> A strange thing happened. With Lucene 1.4 or Lucene 2.0, it works
> fine to retrieve the corresponding documents given terms that exist
> in the index's *.tis file (I wrote a program to pick the terms from
> the .tis file and search for them). But with 1.9, for some terms
> that existed in the index, I couldn't retrieve the corresponding
> documents.
>
> Can anybody give me some advice about this? Thank you in advance.

If you can share an example that demonstrates the issue, we'd love to
have it, incorporate it into our test suite, and fix the
implementation if a bug exists.  A working example of a bug gets fixed
much more easily than hunting for needles in a haystack.
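
For what it's worth, a minimal sketch of such a term-checking program,
assuming the Lucene 1.9 API and a placeholder index path:

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermEnum;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;

    public class TermCheck {
        public static void main(String[] args) throws Exception {
            IndexReader reader = IndexReader.open("/path/to/index");
            IndexSearcher searcher = new IndexSearcher(reader);
            TermEnum terms = reader.terms(); // walk the term dictionary (.tis)
            while (terms.next()) {
                Term t = terms.term();
                Hits hits = searcher.search(new TermQuery(t));
                if (hits.length() == 0) {
                    // a term the dictionary knows but search cannot find
                    System.out.println("no hits for " + t);
                }
            }
            terms.close();
            searcher.close();
            reader.close();
        }
    }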

       Erik

