On 10/10/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 10/10/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Maybe I missed it, but I was surprised that nobody here wondered about the
> > algorithm and data structure changes that Dave Balmain made in Ferret, to make it
> > go faster (than Java Lucene).
>
> Not using single doc segments for buffered docs has come up
> http://www.nabble.com/-jira--Created%3A-%28LUCENE-565%29-Supporting-deleteDocuments-in-IndexWriter-%28Code-and-Performance-Results-Provided%29-tf1580652.html#a6177808

After reading the interview article, I thought not using single doc
segments contributed most of the indexing performance improvement. But
in the mailing list discussion on "Global field semantics", Dave
Balmain mentioned most of the indexing performance benefits come from
having constant field numbers, which greatly optimizes the merging of
term vectors and stored fields.
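
To make the second point concrete, here is a toy sketch in Python of why
constant field numbers could make merging stored fields cheaper. It is only
an illustration under my own assumptions, not Ferret's or Lucene's actual
code: the in-memory layout (a document as a list of (field number, value)
pairs) and the function names merge_with_remap and merge_with_global_numbers
are hypothetical.

```python
# Toy model: a stored document is a list of (field_number, value) pairs.
# All names and the layout here are hypothetical, for illustration only.

def merge_with_remap(segments):
    """Per-segment field numbers: every record is decoded and renumbered.

    `segments` is a list of (number_to_name, docs) pairs, where
    number_to_name maps that segment's field numbers to field names.
    """
    merged_names = {}  # field name -> field number in the merged segment
    merged_docs = []
    for number_to_name, docs in segments:
        for doc in docs:
            new_doc = []
            for num, value in doc:
                name = number_to_name[num]
                # Assign the next free number the first time a name is seen.
                new_num = merged_names.setdefault(name, len(merged_names))
                new_doc.append((new_num, value))
            merged_docs.append(new_doc)
    return merged_names, merged_docs


def merge_with_global_numbers(segments):
    """Constant (global) field numbers: documents are copied unchanged."""
    merged_docs = []
    for docs in segments:
        merged_docs.extend(docs)  # no per-field decoding or renumbering
    return merged_docs


# Two segments that number the same fields differently.
seg_a = ({0: "title", 1: "body"}, [[(0, "t1"), (1, "b1")]])
seg_b = ({0: "body", 1: "title"}, [[(0, "b2"), (1, "t2")]])
names, remapped = merge_with_remap([seg_a, seg_b])

# The same documents written with one global numbering (title=0, body=1).
global_docs = merge_with_global_numbers(
    [[[(0, "t1"), (1, "b1")]], [[(1, "b2"), (0, "t2")]]]
)
```

In the first function the work grows with the number of stored field
instances, because each one is looked up and rewritten. In the second the
merge is a plain concatenation, which in an on-disk format could presumably
become a bulk copy.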

Exactly how much performance improvement each of these two
optimizations provides will depend on the workload. But in general,
does one play a more significant role than the other? What about for the
benchmark workload Yonik pointed out at
http://rubyforge.org/forum/forum.php?forum_id=9058 ?

Cheers,
Ning
