Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
Hello Ivan! It's so sad to me that you had bad results with that patch. :) The discussion in the ticket is out of date - the patch was initially spread across several classes and used WeakHashMap, but then it evolved into what it is now - one StoredFieldSortFactory class. I use it in my sharehound app in pretty

Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
. Regards, Artem On 4/24/07, Artem Vasiliev [EMAIL PROTECTED] wrote: Hello Ivan! It's so sad to me that you had bad results with that patch. :) The discussion in the ticket is out of date - the patch was initially spread across several classes and used WeakHashMap, but then it evolved into what it is now - one

Re: Out of memory exception for big indexes

2007-04-24 Thread Artem Vasiliev
Hi Ivan! Btw, maybe forbidding the sorted search when there are too many results is an option? I did it this way in my case. Regards, Artem. On 4/24/07, Artem Vasiliev [EMAIL PROTECTED] wrote: Ahhh, you said in your original post that your search matches _all_ the results.. Yup my patch

Re: Large index question

2006-10-13 Thread Artem Vasiliev
Hello Scott! I think your index is not really large. My Sharehound index of my corporate LAN is about 10G / 10 million (really small) documents now, and queries take very little time: less than a second for non-sorted queries and somewhat more for sorted ones. The machine is a P4 with 1G RAM. I

Fwd: Re[2]: Fwd: Re[2]: 30 milllion+ docs on a single server

2006-10-04 Thread Artem Vasiliev
, I'm interested in it :) OG I'm also interested in a patch or contribution for Lucene itself. OG Please follow-up on java-user or java-dev. OG Thanks, OG Otis OG - Original Message OG From: Artem Vasiliev [EMAIL PROTECTED] OG To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED

Re[2]: 30 milllion+ docs on a single server

2006-08-21 Thread Artem Vasiliev
Hi guys! I have noticed many questions on the list concerning Lucene sorting memory consumption and hope my solution can help someone. I faced a memory/time consumption problem with sorting in Lucene back in April. With the help of this list's experts I came to a solution which I like: documents from

Re[4]: OutOfMemory with search(Query, Sort)

2006-04-04 Thread Artem Vasiliev
Hello Hoss, Thanks for your answer, you're right, file paths are pretty much unique. Anyway, I don't want this total-field-cache-loading situation to occur under any circumstances - it's too expensive. My app usually crawls while user searches are performed. A crawl involves additions and deletions so

Re[4]: OutOfMemory with search(Query, Sort)

2006-04-04 Thread Artem Vasiliev
I tried to sort by filePath field which can be 100 bytes on average meaning 400M RAM for the cache YS For string sorting, a FieldCache.StringIndex is used. YS It contains a sorted String[num_unique_terms_in_field], and an int[maxDoc] YS So if 10 documents share a large string field value, that
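
The layout described in the quoted reply (a sorted array of unique terms plus one int per document) can be illustrated with a minimal Java sketch. This is not Lucene's FieldCache code: the class name `StringIndexSketch` and its `build` and `compare` methods are hypothetical, and the sketch assumes every document has a non-null value. It only shows why a value shared by several documents is stored once, while a nearly unique field such as a file path ends up with close to one string per document.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration of a StringIndex-like structure; not Lucene's implementation.
final class StringIndexSketch {
    final String[] lookup; // sorted unique terms of the field
    final int[] order;     // order[doc] = position of that doc's term in lookup

    StringIndexSketch(String[] lookup, int[] order) {
        this.lookup = lookup;
        this.order = order;
    }

    // Builds the structure from one (non-null) field value per document.
    static StringIndexSketch build(String[] valuePerDoc) {
        // TreeSet removes duplicates and keeps natural String order.
        String[] lookup = new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
        int[] order = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // Every value is present in lookup, so the search always succeeds.
            order[doc] = Arrays.binarySearch(lookup, valuePerDoc[doc]);
        }
        return new StringIndexSketch(lookup, order);
    }

    // Sorting two documents needs only an int comparison, not a string comparison.
    int compare(int docA, int docB) {
        return Integer.compare(order[docA], order[docB]);
    }
}
```

With values `{"/b", "/a", "/b"}` the sketch keeps two strings and three ints. With all-distinct values the string array is as long as the document count, which is the expensive case discussed in this thread.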

Re[2]: OutOfMemory with search(Query, Sort)

2006-04-01 Thread Artem Vasiliev
Hello Yonik, Thanks, it explains my issue and that's definitely a hit - I tried to sort by filePath field which can be 100 bytes on average meaning 400M RAM for the cache + extra I/O to load them from a 3G index. I wish this caching were configurable as lazy or could be switched off, do you know if that's

another lucene-based application

2006-03-17 Thread Artem Vasiliev
Hi guys! I'd like to thank the developers and contributors of the Lucene project for the fantastic library. And thanks to Otis and Erik for a great book! I'm writing an open source file searcher application 'sharehound' (http://sharehound.sourceforge.net/) based on Lucene. It can now search SMB file

Re[2]: another lucene-based application

2006-03-17 Thread Artem Vasiliev
Hello Xia, XD what's the difference from dotLucene? Why dotLucene? dotLucene is the .Net port of Lucene, so your question is pretty much the same as 'what's the difference from Lucene?' dotLucene, like Lucene itself, is not a search application, it's a library, so that's the difference :). Some of