The problem isn't sorting per se... the problem is quickly retrieving
the sort value for a document.  For that, we currently have the
FieldCache... that's what takes up the memory.  There are more
memory-efficient ways, but they just haven't been implemented yet.
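[For illustration, a minimal sketch of the FieldCache cost from the
caller's side. The index path and field names here are made-up
examples, but FieldCache.DEFAULT with getStrings/getInts is the real
Lucene entry point; one array entry per document stays on the heap
for the lifetime of the reader:]

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

public class FieldCacheCost {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("/path/to/index");
        // Loaded on first use and cached per reader: roughly
        // maxDoc() entries of RAM per sort field, kept until
        // the reader is closed.
        String[] titles = FieldCache.DEFAULT.getStrings(reader, "title");
        int[] prices = FieldCache.DEFAULT.getInts(reader, "price");
        System.out.println("cached " + titles.length + " titles and "
                + prices.length + " prices for " + reader.maxDoc() + " docs");
        reader.close();
    }
}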

-Yonik

On Tue, Jul 29, 2008 at 3:05 PM, Marcus Herou
<[EMAIL PROTECTED]> wrote:
> Guys.
>
> I've noticed that many people have trouble with sorting and OOM errors.
> Eventually they solve it by throwing more memory at the problem.
>
> Shouldn't a solution which can sort on disk when necessary be implemented
> in core Lucene?
> Something like this:
> http://www.codeodor.com/index.cfm/2007/5/10/Sorting-really-BIG-files/1194
>
> Since you obviously know the result size, you can calculate how much memory
> is needed for the sort; if the calculated value is higher than a
> configurable threshold, an external on-disk sort is performed, perhaps with
> a log message at WARN level.
>
> Just a thought, since I'm about to implement something that can sort any
> Comparable object, but on disk.
>
> I guess the Hadoop project has the perfect tools for this, since the
> mapred input files are all sorted, on disk, and huge.
>
> Kindly
>
> //Marcus
>
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> [EMAIL PROTECTED]
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/
>
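[For what it's worth, a minimal, self-contained sketch of the threshold
idea quoted above, in plain Java. The class name, the bytes-per-entry
estimate, and the threshold value are illustrative assumptions, not
existing Lucene APIs, and a real implementation would stream values
rather than hold the full List in memory:]

import java.io.*;
import java.util.*;
import java.util.logging.Logger;

public class ThresholdSort {
    private static final Logger LOG = Logger.getLogger("ThresholdSort");
    private static final long BYTES_PER_ENTRY = 64;      // rough guess per sort key
    private static final long MAX_IN_MEMORY = 64L << 20; // configurable threshold

    public static List<String> sort(List<String> values) throws IOException {
        long estimated = (long) values.size() * BYTES_PER_ENTRY;
        if (estimated <= MAX_IN_MEMORY) {
            Collections.sort(values);   // cheap case: sort on the heap
            return values;
        }
        LOG.warning("estimated " + estimated + " bytes for sort; spilling to disk");
        return externalSort(values);
    }

    // Classic external merge sort: sort fixed-size chunks, spill each
    // to a temp file as a sorted "run", then k-way merge the runs with
    // a priority queue.
    private static List<String> externalSort(List<String> values) throws IOException {
        int chunk = (int) (MAX_IN_MEMORY / BYTES_PER_ENTRY);
        List<File> runs = new ArrayList<File>();
        for (int i = 0; i < values.size(); i += chunk) {
            List<String> slice = new ArrayList<String>(
                    values.subList(i, Math.min(i + chunk, values.size())));
            Collections.sort(slice);
            File run = File.createTempFile("sort-run", ".tmp");
            run.deleteOnExit();
            PrintWriter out = new PrintWriter(new FileWriter(run));
            for (String s : slice) out.println(s);
            out.close();
            runs.add(run);
        }
        // The queue always holds the smallest unread line of each run.
        PriorityQueue<RunReader> pq = new PriorityQueue<RunReader>();
        for (File run : runs) {
            RunReader r = new RunReader(run);
            if (r.advance()) pq.add(r);
        }
        List<String> merged = new ArrayList<String>(values.size());
        while (!pq.isEmpty()) {
            RunReader r = pq.poll();
            merged.add(r.current);
            if (r.advance()) pq.add(r);
        }
        return merged;
    }

    private static class RunReader implements Comparable<RunReader> {
        final BufferedReader in;
        String current;
        RunReader(File f) throws IOException {
            in = new BufferedReader(new FileReader(f));
        }
        boolean advance() throws IOException {
            current = in.readLine();
            if (current == null) in.close();
            return current != null;
        }
        public int compareTo(RunReader o) {
            return current.compareTo(o.current);
        }
    }
}

[The merge phase is essentially the same sorted-run / k-way merge
pattern Hadoop uses for its map outputs, which is why the mapred
tooling Marcus mentions fits this problem so naturally.]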
