bongobongo wrote:
Please also see my previous post in this thread!
Have done some more testing:
If I do this:
$index->setMaxBufferedDocs(10000);
This keeps up to 10000 documents in memory; then a segment (containing
those 10000 documents) is dumped to disk.
Once the segment file has been written, it is added to the segment list
(the 'segments' file).
The updated segments file is written as 'segments.new' and then renamed to
'segments'.
Under Windows, rename() requires the target file to be deleted first;
otherwise it reports 'permission denied'.
It looks as if unlink() is delayed and executed _after_ rename() in
some cases...
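
The write/unlink/rename sequence described above can be sketched roughly as
follows. This is not the actual Zend_Search_Lucene code: the function name
replaceSegmentsFile() is hypothetical, and the short retry loop is only an
assumed way of coping with a delete that has not yet taken effect.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): write the new
// contents to "<target>.new", remove the old target, then rename.
function replaceSegmentsFile($target, $contents)
{
    $tmp = $target . '.new';
    if (file_put_contents($tmp, $contents) === false) {
        return false;
    }

    // As described in the post, rename() on Windows may fail with
    // 'permission denied' if the target still exists, so delete it first.
    if (file_exists($target) && !@unlink($target)) {
        return false;
    }

    // Assumption: if the delete is delayed, a few short retries may help.
    for ($attempt = 0; $attempt < 5; $attempt++) {
        clearstatcache();
        if (@rename($tmp, $target)) {
            return true;
        }
        usleep(100000); // wait 0.1 s before trying again
    }
    return false;
}
```

Note that between unlink() and rename() there is a short window in which
no 'segments' file exists, so this sequence is not atomic.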
Then all my 200 000 test records were indexed.
Tried it three times... and no errors.
With MaxBufferedDocs set to 10000, only 20 segments are written. If
MergeFactor is unchanged (the default value is 10), auto-optimization
merges 10 segments into one twice.
So there are only 22 rename operations.
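
The arithmetic behind that count, as a small sketch. It assumes a
simplified model in which every flushed segment and every merge causes
exactly one rewrite (and so one rename) of the 'segments' file; the
function name estimateRenames() is made up for illustration.

```php
<?php
// Simplified model: one rename per flushed segment plus one per merge.
function estimateRenames($totalDocs, $maxBufferedDocs, $mergeFactor = 10)
{
    // Number of segments flushed from memory to disk.
    $flushed = (int) ceil($totalDocs / $maxBufferedDocs);

    // Each time $mergeFactor segments exist at one level, they are
    // merged into a single segment at the next level.
    $merges  = 0;
    $atLevel = $flushed;
    while ($atLevel >= $mergeFactor) {
        $atLevel = (int) floor($atLevel / $mergeFactor);
        $merges += $atLevel;
    }

    return array(
        'segments' => $flushed,
        'merges'   => $merges,
        'renames'  => $flushed + $merges,
    );
}

$r = estimateRenames(200000, 10000);
echo $r['segments'], ' + ', $r['merges'], ' = ', $r['renames'], "\n";
```

For 200 000 records and MaxBufferedDocs = 10000 this gives 20 segments,
2 merges and 22 renames.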
Takes a looong time though to get the records indexed.....
Each record has an average of 30 words and it takes approx 26 minutes to get
all indexed.
Anything I can do to make this go faster?
Write your own analyzer that splits text into words as fast as possible.
Sometimes split()/explode()/... can be used; sometimes you already have the
word list as an array (extend the Zend_Search_Lucene_Field class for such
fields).
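
A minimal sketch of the kind of fast word splitting meant here. It is a
standalone function only; fastTokenize() is a hypothetical name, and how to
plug it into a Zend_Search_Lucene analyzer class is not shown because that
depends on the framework version in use. It also assumes plain ASCII text.

```php
<?php
// Hypothetical standalone tokenizer: lowercases the text and splits it
// on every run of characters that is not an ASCII letter or digit.
function fastTokenize($text)
{
    return preg_split(
        '/[^a-z0-9]+/',
        strtolower($text),
        -1,
        PREG_SPLIT_NO_EMPTY   // drop empty strings at the edges
    );
}

// If the input is already a clean, space-separated word list,
// explode() avoids the regular expression entirely.
function splitOnSpaces($text)
{
    return explode(' ', $text);
}
```

For example, fastTokenize('Hello, World foo') yields the three words
'hello', 'world' and 'foo'.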
My comp. is pretty fast and with fast disks as well.
I also tried to set MaxBufferedDocs to 2000, and then it normally bailed out
after approx 4000 records,
So this time the problem arises after the second rename operation.
got a little bit further sometimes... but each one ended with the FATAL
error.
The default setting always gives me the error.
At which record?
Is this just a Windows bug, or have other people seen this on Linux as well?
I think it's the same bug that is described as ZF-561
(http://framework.zend.com/issues/browse/ZF-561).
The platform mentioned there is Linux.
Sebi wrote that the error appeared when he moved his system to Linux.
I think it's a system-independent but very intermittent bug.
The problem is that I still can't reproduce it on my systems (both
Windows and Linux).
With best regards,
Alexander Veremyev.