Hi Bob,

Many thanks for the quick reply; it looks like we will have to beef up the machine a bit. Currently the largest index we have successfully built is 2G, so still a long way below your figures. I notice there is a feature to search multiple indexes simultaneously (Lucy::Search::PolySearcher). Could this be a way around our resource issue, i.e. splitting the index into several smaller ones and then doing a PolySearch across them all, or is there a noticeable performance hit?
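For reference, something like the following is what I had in mind, based on the PolySearcher synopsis (the index paths and schema class name are just placeholders for our own):

    use Lucy::Search::IndexSearcher;
    use Lucy::Search::PolySearcher;
    use MySchema;    # our Lucy::Plan::Schema subclass (placeholder name)

    # One IndexSearcher per partial index, wrapped in a single PolySearcher.
    my $poly_searcher = Lucy::Search::PolySearcher->new(
        schema    => MySchema->new,
        searchers => [
            Lucy::Search::IndexSearcher->new( index => '/path/to/index1' ),
            Lucy::Search::IndexSearcher->new( index => '/path/to/index2' ),
        ],
    );

    # Queries then go through the PolySearcher just as they would through a
    # single IndexSearcher ($query built elsewhere, e.g. via
    # Lucy::Search::QueryParser).
    my $hits = $poly_searcher->hits( query => $query, num_wanted => 10 );

So the searching side looks straightforward enough; my question is really whether merging hits across many small indexes costs much compared with one big index.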

Regards
Edwin

On 25/04/2013 13:16, Bob Bruen wrote:

Hi,

I have indexed millions of files, ending up with a 127G index file, which works fine. There are enough resources for this.

I also tried to do the same with tens of millions of files, but the indexing process could never finish, even with enough resources (index ~400G). It kept updating one file a tiny bit every few minutes. I think I could do a better job in the code, but I have not been able to get back to it yet.

            -bob


On Thu, 25 Apr 2013, Edwin Crockford wrote:

I have recently started to use Lucy (with Perl) and everything went well until I tried to index a large file store (>300,000 files). The indexer process grew to more than 8 Gbytes and the machine ran out of resources. My questions are:

a) Is this the normal resource requirement?

b) Is there a way to avoid swamping the machine? (A rough sketch of our indexing loop is below.)
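For context, our indexing loop is roughly the following (heavily simplified, with placeholder paths, field names and helpers). I did wonder whether committing every few thousand documents and re-opening the Indexer would keep the process size down, so I have sketched that in as well, though I don't know whether it is the right approach:

    use Lucy::Index::Indexer;
    use MySchema;    # our Lucy::Plan::Schema subclass (placeholder name)

    my $schema     = MySchema->new;
    my $index_path = '/path/to/index';    # placeholder path
    my $batch_size = 5_000;               # untuned guess at a batch size

    my $indexer = Lucy::Index::Indexer->new(
        index  => $index_path,
        schema => $schema,
        create => 1,
    );

    my $count = 0;
    for my $file (@files) {    # @files: our list of files to index
        $indexer->add_doc({
            path    => $file,
            content => read_file($file),    # our own helper; name is a placeholder
        });

        # Would committing every few thousand docs and re-opening the Indexer
        # keep the process size bounded, or just slow things down?
        if ( ++$count % $batch_size == 0 ) {
            $indexer->commit;
            $indexer = Lucy::Index::Indexer->new(
                index  => $index_path,
                schema => $schema,
            );
        }
    }
    $indexer->commit;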

I also found that the searcher becomes very large for large indexes, and as ours runs as part of a FastCGI process it exceeded the process's ulimit. Upping the ulimit fixed this, but diagnosing the issue was difficult, as the query would just return 0 results rather than indicating that it had run out of process space.

Many thanks

Edwin Crockford


