Re: Question about FieldInfos

Robert Kirchgessner Sun, 15 Jan 2006 14:31:57 -0800

Hello Marvin,

> Those are the easy ones.  There's more, but it would require a major
> rewrite.  If you're interested, perform a websearch for '"KinoSearch
> Merge Model"' to find the previous post I sent to this list on the
> subject.


I did, thank you. As you write in your explanation of the KinoSearch merge model:

> I believe that it is considerably better for Perl; my  
> guess is that it's incrementally better for Java.

As I understand it, the main reasons for better performance of your
model in Perl are:

1. avoiding expensive operations (accessor methods, object creation)

2. using "(a set of) inverted documents" instead of "(a set of)
1-document segments"

While point 1 is less important in Java (Lucene) and C (which is what
I'm interested in), the second point might significantly improve
indexing performance (no RAM directory, no encoding/decoding
steps, no input/output streams, no unnecessary memory allocations, ...).

There was even a patch for that problem:

http://issues.apache.org/jira/browse/LUCENE-211

Is it still of interest? I'm all for this optimization as soon as
the next version of my C library is running.
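
To make point 2 above concrete, here is a toy sketch of the "set of
inverted documents" idea: each document is inverted in memory and its
postings are appended directly to one shared structure, instead of being
written out as a 1-document segment and merged afterwards. It is written
in Python only for brevity; the function names and the whitespace
tokenizer are hypothetical, and this is not code from KinoSearch, Lucene,
or the patch linked above.

```python
from collections import defaultdict


def invert_document(text):
    """Invert one document: map each term to the positions where it occurs."""
    postings = defaultdict(list)
    # Assumption: a naive lowercase/whitespace tokenizer stands in for a real analyzer.
    for position, term in enumerate(text.lower().split()):
        postings[term].append(position)
    return dict(postings)


def build_segment(docs):
    """Accumulate inverted documents in memory and return one term-sorted segment.

    No per-document segment is serialized or re-read; postings are appended
    in doc-id order, so every posting list is already sorted by document.
    """
    segment = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term, positions in invert_document(text).items():
            segment[term].append((doc_id, positions))
    # Sort terms once at the end, as a term dictionary would require.
    return dict(sorted(segment.items()))


segment = build_segment(["the quick fox", "the lazy dog the end"])
for term, posting_list in segment.items():
    print(term, posting_list)
```

The point of the sketch is only the data flow: the per-document step
produces plain in-memory postings, and the expensive sort/write happens
once per batch rather than once per document.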

> ... Out of curiosity,
> does PHPLucene write Lucene-compatible indexes?  KinoSearch will only
> when the source data is pure ascii with no null bytes, since it
> defines a String as arbitrary data preceded by a VInt byte-count.

Yes, the binary format is fully compatible with Lucene's, as
is the read/write/search logic. By the way, although the project
started as a Lucene implementation in PHP, I soon switched
to writing a pure C library with a PHP binding. Now it's
mostly a C project.
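
For readers unfamiliar with the string layout discussed in the quote
above, here is a minimal sketch of a VInt and of a string stored as
"arbitrary data preceded by a VInt byte-count". It is in Python for
brevity, not taken from the C library, and the function names are
hypothetical. The comparison in the last lines rests on my understanding
that Lucene's String prefix counts characters rather than bytes, which
is why the two layouts only coincide for plain ASCII.

```python
def write_vint(n):
    """Encode a non-negative integer as a VInt: 7 bits per byte,
    low-order bits first, high bit set on every byte except the last."""
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def read_vint(buf, pos=0):
    """Decode a VInt starting at buf[pos]; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def write_string_bytecount(data):
    """Byte-count layout from the quote: VInt(len(data)) followed by the raw bytes."""
    return write_vint(len(data)) + data


def read_string_bytecount(buf, pos=0):
    """Read one byte-count-prefixed string; return (data, next_position)."""
    length, pos = read_vint(buf, pos)
    return buf[pos:pos + length], pos + length


# For pure ASCII the byte count equals the character count, so the
# prefix is the same whichever of the two is stored.
ascii_text = "lucene"
assert len(ascii_text.encode("utf-8")) == len(ascii_text)

# For non-ASCII text the two counts differ, so the prefixes would differ too.
accented = "caf\u00e9"
print(len(accented), len(accented.encode("utf-8")))
```

A reader that expects a character count would misparse a record as soon
as the byte count and the character count diverge, which matches the
"pure ascii" restriction in the quote.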

Robert
