Oh, I've got it. The cause was an encoding issue.

I must pass the utf-8 encoding parameter. If this parameter is omitted, the
index writer indexes only a segment of the input text.
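
For anyone who hits the same problem, here is a minimal sketch of what the fixed indexing loop might look like. It is not my original CreateIndex.php. It assumes Zend Framework 1.x is on the include_path and that the text files really are UTF-8; the index path, the docs folder, and the field names 'path' and 'contents' are made up for illustration. It also follows the two recommendations in the quoted message below (create() instead of 'new', and a single commit at the end).

```php
<?php
// Sketch only: assumes Zend Framework 1.x is on the include_path.
// The index path, the docs folder, and the field names are hypothetical.
require_once 'Zend/Search/Lucene.php';

// Create a new index with the static factory method.
$index = Zend_Search_Lucene::create('/tmp/test-index');

foreach (glob('/path/to/docs/*.txt') as $file) {
    $doc = new Zend_Search_Lucene_Document();

    // Store the file path so a hit can be traced back to its file.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('path', $file, 'utf-8'));

    // The third argument declares the encoding of the input text.
    // Assumption: the files are UTF-8 encoded.
    $doc->addField(Zend_Search_Lucene_Field::UnStored(
        'contents',
        file_get_contents($file),
        'utf-8'
    ));

    $index->addDocument($doc);
}

// Commit once at the end rather than after every addDocument() call.
$index->commit();
```

If the files are in some other encoding, that encoding's name would go in the third argument instead of 'utf-8'.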

I hope I never run into this trouble again.

Many thanks to you, Alexander.

Best regards,
David
----- Original Message ----- 
From: "Alexander Veremyev" <[EMAIL PROTECTED]>
To: "Partout" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Sunday, June 24, 2007 6:57 AM
Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?


> Hi David,
> 
> Could you give some additional information?
> 
> 1. How many documents does the index contain, for both Java Lucene
> and Zend_Search_Lucene?
> 
> 2. Which analyzer do you use for Java Lucene indexing script?
> 
> 3. Could you give an example of a document that is not included in
> your "110 hits" result set?
> 
> 
> PS Two recommendations:
> 1. You use the 'new Zend_Search_Lucene(...)' syntax for index creation. The
> new syntax (Zend_Search_Lucene::open(...) and
> Zend_Search_Lucene::create(...)) is preferable.
> 
> 2. A commit is not mandatory after each addDocument() call. It makes your
> indexing script slower.
> 
> 
> With best regards,
>    Alexander Veremyev.
> 
> 
> Partout wrote:
>> I ran a test.
>> 
>> I set up a folder with about 600 text files and searched them with Windows
>> Explorer Finder (and other tools, such as UEStudio) for the word "shanghai".
>> I got 298 files.
>> 
>> But when I use Zend_Search_Lucene to index them and then search the index, I
>> ONLY get about 110 hits! Opening the Zend index in the Luke toolbox also
>> gives 110 hits.
>> 
>> I used the demo index script from the Zend Framework download, with both
>> versions 0.9.3 and 1.0.0, but got the same WRONG result. I don't know why.
>> 
>> I also tested the files with the Java version of Lucene and found that it
>> gets the right result, 298 hits.
>> 
>> Below is my PHP test script for creating the index:
>> 
>> http://www.nabble.com/file/p11267602/CreateIndex.php CreateIndex.php 
>> 
>> BTW, I am confused by the Zend_Search package. Compared to Java Lucene,
>> is it really SO difficult to use?
>> 
>> I hope someone can help me find the solution. Thanks in advance.
>> 
>> Best Regards,
>> David
>
