Hi Alexander,

I think providing the documentation is a good way to promote the use of Zend Framework.
In fact, Zend_Search_Lucene needs more documentation. Last weekend I indexed about 12,000 files, and with the default index settings it took about 3 hours. When I set both MaxBufferedDocs and MaxMergeDocs to 500, the indexing time dropped to about 40 minutes, which is only a little more than Java Lucene's indexing time.

Thank you for your work.

Best regards,
David

----- Original Message -----
From: "Alexander Veremyev" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Tuesday, June 26, 2007 2:11 AM
Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?

> Hi David,
>
> Good.
>
> I am working on the Zend_Search_Lucene "Best practices" documentation
> section now. I want to add an encoding section there to describe problems
> like this.
>
>
> With best regards,
>    Alexander Veremyev.
>
> [EMAIL PROTECTED] wrote:
>> Oh, I've got it. The cause is an encoding issue.
>>
>> I must pass the utf-8 encoding parameter. If this parameter is omitted,
>> the index writer indexes only a segment of the input text.
>>
>> I hope I never run into this trouble again.
>>
>> Many thanks to you, Alexander.
>>
>> Best regards,
>> David
>> ----- Original Message -----
>> From: "Alexander Veremyev" <[EMAIL PROTECTED]>
>> To: "Partout" <[EMAIL PROTECTED]>
>> Cc: <[email protected]>
>> Sent: Sunday, June 24, 2007 6:57 AM
>> Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?
>>
>>
>>> Hi David,
>>>
>>> Could you give some additional information?
>>>
>>> 1. How many documents does the index contain? For both Java Lucene
>>> and Zend_Search_Lucene.
>>>
>>> 2. Which analyzer do you use in the Java Lucene indexing script?
>>>
>>> 3. Could you give an example of a document which is not included in
>>> your "110 hits" result set?
>>>
>>>
>>> PS Two recommendations:
>>> 1. You use the 'new Zend_Search_Lucene(...)' syntax for index creation.
>>> The new syntax (Zend_Search_Lucene::open(...) and
>>> Zend_Search_Lucene::create(...)) is preferable.
>>>
>>> 2. A commit is not mandatory after each addDocument() call. It makes
>>> your indexing script slower.
>>>
>>>
>>> With best regards,
>>>    Alexander Veremyev.
>>>
>>>
>>> Partout wrote:
>>>> I ran a test.
>>>>
>>>> I set up a folder with about 600 text files and searched them with
>>>> Windows Explorer's Find (and other tools, such as UEStudio) for the
>>>> word "shanghai". I get 298 files.
>>>>
>>>> But when I index them with Zend_Search_Lucene and then search the
>>>> index, I ONLY get about 110 hits! The Luke toolbox run against the
>>>> Zend index also gets 110 hits.
>>>>
>>>> I use the demo index script from the Zend Framework download. I tried
>>>> both the 0.9.3 and 1.0.0 versions, with the same WRONG result. I don't
>>>> know why.
>>>>
>>>> I also tested the files with the Java version of Lucene, and it gets
>>>> the right result, 298 hits.
>>>>
>>>> Below is my PHP test script for creating the index:
>>>>
>>>> http://www.nabble.com/file/p11267602/CreateIndex.php CreateIndex.php
>>>>
>>>> BTW, I am confused by the Zend_Search package. Compared to Java Lucene,
>>>> is it SO difficult for us to use?
>>>>
>>>> I hope someone can help me find the solution. Thanks in advance.
>>>>
>>>> Best Regards,
>>>> David
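For readers of the archive, the points raised in this thread (the static create() call, the MaxBufferedDocs and MaxMergeDocs settings, the explicit utf-8 encoding parameter, and a single commit at the end) can be pulled together in a rough PHP sketch. It is not the CreateIndex.php script linked in the thread. It assumes Zend Framework 1.x is on the include path and that the source files are UTF-8; the paths and field names are hypothetical, and 500 is simply the value reported in this thread, not a general recommendation.

```php
<?php
// Sketch only: assumes Zend Framework 1.x is on the include path.
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Document.php';
require_once 'Zend/Search/Lucene/Field.php';

// Hypothetical locations, for illustration only.
$indexPath = '/tmp/demo-index';
$sourceDir = '/path/to/text/files';

// Static factory call instead of 'new Zend_Search_Lucene(...)'.
$index = Zend_Search_Lucene::create($indexPath);

// Values reported in this thread; tune them for your own data and memory.
$index->setMaxBufferedDocs(500);
$index->setMaxMergeDocs(500);

$files = glob($sourceDir . '/*.txt');
if ($files === false) {
    $files = array();
}

foreach ($files as $file) {
    $contents = file_get_contents($file);
    if ($contents === false) {
        continue; // skip unreadable files
    }

    $doc = new Zend_Search_Lucene_Document();
    // The third argument is the encoding of the value.
    // Assumption: the source files are UTF-8.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('path', $file, 'utf-8'));
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $contents, 'utf-8'));

    // No commit() inside the loop.
    $index->addDocument($doc);
}

// A single commit once all documents have been added.
$index->commit();
```

If the files are in another encoding, pass that encoding name instead of 'utf-8', or convert the text first with iconv().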
