Hello guys,
I read your messages and I'm sorry you couldn't find a solution. I had the
same problems when I tried to index my documents. You can see some of these
errors here:
http://www.nabble.com/Zend_Search_Lucene-errors-tf3205213s16154.html#a8900427.
I had this problem only on a WinXP system. I want to make the following
observations:
- When antivirus software is running, the errors appear more often.
- When I place a wait time between consecutive function calls that
access the index (delete(), find(), addDocument()), the probability of indexing
all documents increases. I actually succeeded in indexing 8000 documents
using a wait time between consecutive function calls.
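
A minimal sketch of the wait-time workaround described above. The function name, the parameter names, and the 0.2 second default are assumptions chosen for illustration, not values from this thread; the only index method it relies on is addDocument(), and the right pause length would have to be found by experiment.

```php
<?php
/**
 * Hypothetical helper: add documents one by one, sleeping after each call
 * so that other processes touching the index files (for example an
 * antivirus scanner) have time to release them.
 *
 * $index             any object exposing addDocument(), such as an
 *                    already opened Zend_Search_Lucene index
 * $documents         array of documents to add
 * $pauseMicroseconds pause after each call (assumed default: 0.2 s)
 *
 * Returns the number of documents added.
 */
function addDocumentsWithPause($index, array $documents, $pauseMicroseconds = 200000)
{
    $added = 0;
    foreach ($documents as $document) {
        $index->addDocument($document);
        $added++;
        usleep($pauseMicroseconds); // wait before the next index access
    }
    return $added;
}
```

The same pause could be placed after delete() and find() calls; how the index object is opened depends on the Zend Framework version in use and is left out here.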
Alexander, I have a small question regarding search optimization. Why does the
second query have such poor search times compared with the first one?
1. +(((titleSrch:arte)) ((descriptionSrch:arte)) ((tagsSrch:arte)))
   runs in 0.077 sec
2. +(((titleSrch:arte)) ((descriptionSrch:arte)) ((tagsSrch:arte))) +(countryID:1)
   runs in 0.50 sec
The difference is huge. What do you think?
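
For anyone who wants to reproduce the comparison, here is a rough sketch of how the two queries could be timed in the same way. The helper name and the number of runs are assumptions for illustration; it only relies on find() and on PHP's microtime(). Repeating each query several times helps separate a one-off warm-up cost from a consistently slow query.

```php
<?php
/**
 * Hypothetical helper: run $query against $index several times and
 * return the elapsed wall-clock time of each run, in seconds.
 *
 * $index  any object exposing find(), such as an already opened
 *         Zend_Search_Lucene index
 * $query  query string, e.g. '+(titleSrch:arte) +(countryID:1)'
 * $runs   number of repetitions (assumed default: 5)
 */
function timeQuery($index, $query, $runs = 5)
{
    $timings = array();
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $index->find($query);
        $timings[] = microtime(true) - $start;
    }
    return $timings;
}
```

Calling timeQuery() once with the first query and once with the second, on the same index and in the same process, gives two lists of timings that can be compared run by run.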
Alexander Veremyev wrote:
>
> Hi,
>
> As I've checked, you don't need ...Common_Text class. All you need is
> Zend_Search_Lucene_Analysis_Analyzer:
>
> -- myCoolAnalyzer.php ----------------------------------
> <?php
> /** Zend_Search_Lucene_Analysis_Analyzer */
> require_once 'Zend/Search/Lucene/Analysis/Analyzer.php';
>
> class myCoolAnalyzer extends Zend_Search_Lucene_Analysis_Analyzer
> {
>     /**
>      * Word list
>      *
>      * @var array
>      */
>     private $_wordList;
>
>     /**
>      * Reset token stream
>      */
>     public function reset()
>     {
>         $this->_wordList = explode(' ', $this->_input);
>         reset($this->_wordList);
>     }
>
>     /**
>      * Tokenization stream API
>      * Get next token
>      * Returns null at the end of stream
>      *
>      * Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)
>      *
>      * @return Zend_Search_Lucene_Analysis_Token|null
>      */
>     public function nextToken()
>     {
>         if (($word = current($this->_wordList)) === false) {
>             return null;
>         }
>         next($this->_wordList);
>
>         return new Zend_Search_Lucene_Analysis_Token(
>             iconv($this->_encoding, 'UTF-8', $word),
>             -1, -1 /* dummy start/end positions */
>         );
>     }
> }
> --------------------------------------------------------
>
> -- myCode.php ----------------------------------
> <?php
> /** myCoolAnalyzer */
> require_once 'myCoolAnalyzer.php';
>
> Zend_Search_Lucene_Analysis_Analyzer::setDefault(new myCoolAnalyzer());
> --------------------------------------------------------
>
>
> PS One of the standard analyzers has to be used while searching because this
> functionality (explode()) may not work for user-entered queries.
>
> With best regards,
> Alexander Veremyev.
>
>
>
> bongobongo wrote:
>>
>>
>> Alexander Veremyev wrote:
>>>
>>> Because you have words in the db records already prepared for indexing
>>> :)
>>> The default analyzer walks through the string char by char, checks if it's
>>> a letter/digit or not and so on. Then it applies a lower case filter.
>>>
>>> You don't need these. Take a look at the
>>> Zend_Search_Lucene_Analysis_Analyzer_Common_Text class. You may make it
>>> much more effective with the explode() function for your case.
>>>
>>
>> Sounds like a good idea.
>>
>> Could you please show some example code on how to do that?
>>
>> How to use the ...Common_Text class together with the
>> explode() function....?
>>
>> Regards
>>
>>
>>
>
>
>
--
View this message in context:
http://www.nabble.com/zend_search_lucene%3A-Is-this-a-bug--Error-after-adding-x-number-of-documents--%28ver.-0.8.0%29-tf3309996s16154.html#a9318427
Sent from the Zend Framework mailing list archive at Nabble.com.