Oh, I've got it. The cause was an encoding issue: I have to pass the 'utf-8' encoding parameter when adding fields. If this parameter is omitted, the index writer only indexes part of the input text.
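
For anyone who finds this thread in the archive, here is a minimal sketch of what the fix looks like. It is not my exact CreateIndex.php. It assumes Zend Framework 1.x is on the include_path and that the text files really are UTF-8 encoded; the index path, folder and field names are made up for illustration.

```php
<?php
// Sketch only: assumes Zend Framework 1.x is on the include_path and
// that the input files are UTF-8 encoded. Paths and field names below
// are hypothetical.
require_once 'Zend/Search/Lucene.php';

// Factory method, as Alexander recommends, instead of "new Zend_Search_Lucene(...)".
$index = Zend_Search_Lucene::create('/tmp/test-index');

foreach (glob('/tmp/docs/*.txt') as $file) {
    $doc = new Zend_Search_Lucene_Document();

    // The third argument names the encoding of the value being passed in.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('path', $file, 'utf-8'));
    $doc->addField(Zend_Search_Lucene_Field::UnStored(
        'contents',
        file_get_contents($file),
        'utf-8'
    ));

    $index->addDocument($doc);
}

// A single commit after the loop rather than one per addDocument() call.
$index->commit();
```

If the files are in some other encoding, that encoding name would presumably go in place of 'utf-8'; I have only tested the UTF-8 case.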
I hope I never run into this trouble again. Many thanks to you, Alexander.

Best regards,
David

----- Original Message -----
From: "Alexander Veremyev" <[EMAIL PROTECTED]>
To: "Partout" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Sunday, June 24, 2007 6:57 AM
Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?

> Hi David,
>
> Could you give some additional information?
>
> 1. How many documents does the index contain? For the case of Java Lucene
> and Zend_Search Lucene.
>
> 2. Which analyzer do you use for the Java Lucene indexing script?
>
> 3. Could you give an example of a document which is not included in
> your "110 hits" result set?
>
>
> PS Two recommendations:
> 1. You use 'new Zend_Search_Lucene(...)' syntax for index creation. New
> syntax (Zend_Search_Lucene::open(...) and
> Zend_Search_Lucene::create(...)) is preferable.
>
> 2. Commit is not mandatory after each addDocument() call. It makes your
> indexing script slower.
>
>
> With best regards,
> Alexander Veremyev.
>
>
> Partout wrote:
>> I ran a test.
>>
>> I set up a folder with about 600 text files, and I searched them with Windows
>> Explorer Finder (and other tools, such as UEStudio) for the word "shanghai", and
>> I got 298 files.
>>
>> But when I use zend_search_lucene to index them and then search, I ONLY get
>> about 110 hits! The Luke toolbox run against the Zend index also gets 110 hits.
>>
>> I use the demo index script from the Zend Framework download. Both the 0.9.3
>> and 1.0.0 versions were used, with the same WRONG result. I don't know why.
>>
>> I also tested with the Java version of Lucene, and found that it gets the
>> right result, 298 hits.
>>
>> Below is my test script in PHP for creating the index file:
>>
>> http://www.nabble.com/file/p11267602/CreateIndex.php CreateIndex.php
>>
>> BTW, I am confused by the Zend_Search package. Compared to Java Lucene,
>> is it SO difficult for us to use?
>>
>> I hope someone can help me find the solution. Thanks in advance.
>>
>> Best Regards,
>> David
>
