Hi Alexander,

I think providing this documentation is a good way to promote the use of 
Zend Framework.

In fact, Zend_Search_Lucene needs more documentation. Last weekend I indexed 
about 12,000 files, and with the default index settings it took about 
3 hours. When I set both MaxBufferedDocs and MaxMergeDocs to 500, the 
indexing time dropped to about 40 minutes, which is just a little 
more than Java Lucene's indexing time.
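
For anyone following along, a minimal sketch of that settings change might look like the code below. It assumes Zend Framework 1.x is on the include_path; the index path, the source directory, the field names and the 'utf-8' encoding are placeholders for illustration, not my actual script.

```php
<?php
// Sketch only: assumes Zend Framework 1.x is available on the include_path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical locations, chosen for illustration.
$indexPath = '/tmp/example-index';
$sourceDir = '/path/to/text/files';

// Create a new index (use Zend_Search_Lucene::open() for an existing one).
$index = Zend_Search_Lucene::create($indexPath);

// The two settings changed from their defaults, both set to 500.
$index->setMaxBufferedDocs(500);
$index->setMaxMergeDocs(500);

foreach (glob($sourceDir . '/*.txt') as $file) {
    $doc = new Zend_Search_Lucene_Document();
    // Store the path so a hit can be traced back to its file.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('path', $file));
    // Index the contents without storing them; the encoding is passed
    // explicitly (assumption: the files are UTF-8).
    $doc->addField(Zend_Search_Lucene_Field::UnStored(
        'contents',
        file_get_contents($file),
        'utf-8'
    ));
    $index->addDocument($doc);
}

// Commit once at the end rather than after every addDocument() call.
$index->commit();
```

The right values depend on available memory and on the size of the documents, so 500 is only what worked in this case.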

Thank you for your work.

Best regards,
David

----- Original Message ----- 
From: "Alexander Veremyev" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Tuesday, June 26, 2007 2:11 AM
Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?


> Hi David,
> 
> Good.
> 
> I am working on the Zend_Search_Lucene "Best practices" documentation 
> section now. I want to add an encoding section there to describe problems 
> like this.
> 
> 
> With best regards,
>    Alexander Veremyev.
> 
> [EMAIL PROTECTED] wrote:
>> Oh, I've got it. The cause is the encoding issue. 
>> 
>> I must pass the utf-8 encoding parameter. If this parameter is omitted, 
>> the index writer indexes only a segment of the input text. 
>> 
>> I hope I never run into this trouble again.
>> 
>> Many thanks to you, Alexander.
>> 
>> Best regards,
>> David
>> ----- Original Message ----- 
>> From: "Alexander Veremyev" <[EMAIL PROTECTED]>
>> To: "Partout" <[EMAIL PROTECTED]>
>> Cc: <[email protected]>
>> Sent: Sunday, June 24, 2007 6:57 AM
>> Subject: Re: [fw-general] Zend_search_lucene indexes not RIGHT?
>> 
>> 
>>> Hi David,
>>>
>>> Could you give some additional information?
>>>
>>> 1. How many documents does each index contain, for Java Lucene and 
>>> for Zend_Search_Lucene?
>>>
>>> 2. Which analyzer do you use in your Java Lucene indexing script?
>>>
>>> 3. Could you give an example of a document that is not included in 
>>> your "110 hits" result set?
>>>
>>>
>>> PS Two recommendations:
>>> 1. You use the 'new Zend_Search_Lucene(...)' syntax for index creation. 
>>> The new syntax (Zend_Search_Lucene::open(...) and 
>>> Zend_Search_Lucene::create(...)) is preferable.
>>>
>>> 2. A commit is not mandatory after each addDocument() call. It makes 
>>> your indexing script slower.
>>>
>>>
>>> With best regards,
>>>    Alexander Veremyev.
>>>
>>>
>>> Partout wrote:
>>>> I ran a test.
>>>>
>>>> I set up a folder with about 600 text files and searched them for the
>>>> word "shanghai" with the Windows Explorer finder (and other tools, such
>>>> as UEStudio). I got 298 files.
>>>>
>>>> But when I index them with Zend_Search_Lucene and then search the index,
>>>> I get ONLY about 110 hits! Opening the Zend index with the Luke toolbox
>>>> also gives 110 hits.
>>>>
>>>> I use the demo index script from the Zend Framework download. I tried
>>>> both versions 0.9.3 and 1.0.0, with the same WRONG result. I don't know
>>>> why.
>>>>
>>>> I also tested the files with the Java version of Lucene, and it gets the
>>>> right result, 298 hits.
>>>>
>>>> Below is my PHP test script for creating the index:
>>>>
>>>> http://www.nabble.com/file/p11267602/CreateIndex.php CreateIndex.php 
>>>>
>>>> BTW, I am confused by the Zend_Search package. Compared to Java Lucene,
>>>> is it really SO difficult to use?
>>>>
>>>> I hope someone can help me find the solution. Thanks in advance.
>>>>
>>>> Best Regards,
>>>> David
>>>
>
