The problem isn't sorting per se... the problem is quickly retrieving the sort value for a given document. For that, we currently have the FieldCache... that's what takes up the memory. There are more memory-efficient ways, but they just haven't been implemented yet.
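Roughly, the first sorted search on a field populates the cache like this (a sketch against the 2.x-era FieldCache API; the index path and field name are placeholders, and sorting on a string field actually goes through getStringIndex, but the per-document footprint is the point):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

public class FieldCacheFootprint {
  public static void main(String[] args) throws Exception {
    IndexReader reader = IndexReader.open("/path/to/index"); // placeholder path

    // One entry per document in the index, built on first use and kept
    // in memory for the life of the reader -- this is the allocation
    // that blows up when sorting on a large index.
    String[] values = FieldCache.DEFAULT.getStrings(reader, "title");

    System.out.println(values.length + " sort values now resident in memory");
    reader.close();
  }
}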
-Yonik

On Tue, Jul 29, 2008 at 3:05 PM, Marcus Herou <[EMAIL PROTECTED]> wrote:
> Guys,
>
> I've noticed many people having trouble with sorting and OOM. Eventually
> they solve it by throwing more memory at the problem.
>
> Shouldn't a solution that can sort on disk when necessary be implemented
> in core Lucene? Something like this:
> http://www.codeodor.com/index.cfm/2007/5/10/Sorting-really-BIG-files/1194
>
> Since you obviously know the result size, you can calculate how much
> memory is needed for the sort, and if the calculated value is higher than
> a configurable threshold, an external on-disk sort is performed, perhaps
> with a log message at WARN level.
>
> Just a thought, since I'm about to implement something which can sort any
> Comparable object, but on disk.
>
> I guess the Hadoop project has the perfect tools for this, since all the
> mapred input files are sorted, on disk, and huge.
>
> Kindly
>
> //Marcus
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
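For reference, a minimal sketch of the kind of external merge sort Marcus describes: spill memory-bounded sorted runs to temp files, then k-way merge them with a priority queue. It assumes the records are lines of text rather than arbitrary Comparable objects, and all class and method names are illustrative:

import java.io.*;
import java.util.*;

public class ExternalSort {

  public static void sort(File input, File output, int maxLinesInMemory)
      throws IOException {
    List<File> runs = new ArrayList<File>();
    BufferedReader in = new BufferedReader(new FileReader(input));
    try {
      // Phase 1: read memory-bounded chunks, sort each in memory,
      // and spill it to disk as a sorted "run".
      List<String> chunk = new ArrayList<String>(maxLinesInMemory);
      String line;
      while ((line = in.readLine()) != null) {
        chunk.add(line);
        if (chunk.size() >= maxLinesInMemory) {
          runs.add(spill(chunk));
          chunk.clear();
        }
      }
      if (!chunk.isEmpty()) runs.add(spill(chunk));
    } finally {
      in.close();
    }
    // Phase 2: merge the sorted runs into the final output.
    merge(runs, output);
  }

  private static File spill(List<String> chunk) throws IOException {
    Collections.sort(chunk);
    File run = File.createTempFile("sort-run", ".tmp");
    PrintWriter w = new PrintWriter(new BufferedWriter(new FileWriter(run)));
    for (String s : chunk) w.println(s);
    w.close();
    return run;
  }

  // K-way merge: keep one reader per run and repeatedly pull the
  // smallest head line via a priority queue.
  private static void merge(List<File> runs, File output) throws IOException {
    PriorityQueue<Head> pq = new PriorityQueue<Head>();
    for (File run : runs) {
      BufferedReader r = new BufferedReader(new FileReader(run));
      String first = r.readLine();
      if (first != null) pq.add(new Head(first, r));
      else r.close();
    }
    PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(output)));
    while (!pq.isEmpty()) {
      Head h = pq.poll();
      out.println(h.line);
      String next = h.reader.readLine();
      if (next != null) pq.add(new Head(next, h.reader));
      else h.reader.close();
    }
    out.close();
  }

  private static class Head implements Comparable<Head> {
    final String line;
    final BufferedReader reader;
    Head(String line, BufferedReader reader) {
      this.line = line;
      this.reader = reader;
    }
    public int compareTo(Head o) { return line.compareTo(o.line); }
  }
}

The threshold idea from the mail maps onto this directly: sort in memory when the estimated size fits, and fall back to ExternalSort.sort() (logging a WARN) when it doesn't.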