On Thu, Dec 20, 2007 at 03:47:42PM +0100, Marc Weeber wrote:
> hi all,
>
> I think you're right. The field to sort on is a date field in the
> string format of YYYY-MM-DD. I indeed started looking into the Java
> sorting code, and I am no longer very surprised by the memory
> load. The good thing is that after the first search+sort, it is *really*
> fast: a cooccurrence search (two terms per doc in a boolean query)
> together with a sort on date in the 50M collection is between 50ms and
> 200ms (timed in Python, before and after the search), with no real
> difference between the jcc and gcc scripts.
>
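For anyone who wants to reproduce that kind of measurement, here is a
minimal sketch of timing a call by reading the clock before and after it.
The names timed() and fake_search() are made up for illustration;
fake_search() is a pure-Python stand-in for the real search+sort call, not
anything from PyLucene. It also shows that YYYY-MM-DD strings sort
chronologically when compared as plain strings.

```python
import time


def timed(fn, *args):
    """Call fn(*args) once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_search(docs):
    # Hypothetical stand-in for the real search: returns the
    # (doc_id, date) pairs ordered by the YYYY-MM-DD date string.
    return sorted(docs, key=lambda pair: pair[1])


docs = [(1, "2007-12-20"), (2, "2006-01-05"), (3, "2007-02-11")]
hits, ms = timed(fake_search, docs)
print([doc_id for doc_id, _ in hits], "%.3f ms" % ms)
```

Running the timed call more than once is worthwhile, since (as you saw)
the first sorted search is much slower than the ones that follow.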
> >
> >
> >If you have a lot of dead space (reader.maxDoc() >> reader.numDocs()),
> >optimizing should decrease memory usage.
> Do you mean an .optimize() on the index? I have already done that. Or
> do you mean something different?
Just optimize() on the index. If you've done that, then maxDoc() should
be equal to numDocs().
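
A quick way to check this is to compare the two counts directly. The
helper below is only a sketch with a made-up name (dead_space_fraction);
it takes plain integers, and with PyLucene you would pass it the values
of reader.maxDoc() and reader.numDocs(). The document counts used in the
example calls are illustrative, not measurements.

```python
def dead_space_fraction(max_doc, num_docs):
    """Fraction of document slots that are taken up by deleted documents."""
    if max_doc == 0:
        return 0.0
    return (max_doc - num_docs) / float(max_doc)


# Illustrative numbers only: an optimized index has no dead space,
# so both counts match and the fraction is zero.
print(dead_space_fraction(50000000, 50000000))

# An index with deletions that has not been optimized yet.
print(dead_space_fraction(60000000, 50000000))
```

If the fraction is zero, optimizing again will not reduce memory usage
any further.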
Aaron
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev