On Fri, Dec 19, 2008 at 05:48:51PM -0500, Ryan McKinley wrote:
> The other thing i am curious about is "flexible indexing" -- i have not
> really followed what that means, but somewhere in there i got the idea
> we could implement an rtree directly...

I've been flogging the idea of an RTree index component and building out
the scaffolding to hold such a thing in KinoSearch/Lucy. Here's a sketch of
how it might work (assume Java bindings):

    public class MyArchitecture extends Architecture {
        ArrayList<SegDataWriter> segDataWriters(InvIndex invindex,
                                                Segment segment)
        {
            ArrayList<SegDataWriter> writers
                = super.segDataWriters(invindex, segment);
            writers.add(new RTreeWriter(invindex, segment));
            return writers;
        }

        ArrayList<SegDataReader> segDataReaders(InvIndex invindex,
                                                Segment segment)
        {
            ArrayList<SegDataReader> readers
                = super.segDataReaders(invindex, segment);
            readers.add(new RTreeReader(invindex, segment));
            return readers;
        }
    }

    public class MySchema extends Schema {
        public MyArchitecture architecture() { return new MyArchitecture(); }
        public MySchema() {
            addField("title", "text");
            addField("content", "text");
            addField("location", new RTreeField());
        }
        public PolyAnalyzer analyzer() { return new PolyAnalyzer("en"); }
    }

    IndexWriter writer = new IndexWriter(new MySchema(), path);

The write-time linchpin is the extensible class SegDataWriter. It's used
internally by the public-facing primary indexing class (InvIndexer in KS,
probably will be named IndexWriter in Lucy). Here's some current C code for
SegWriter in KS; SegWriter is itself a SegDataWriter, and it holds the
writers supplied by the Architecture:

    SegWriter*
    SegWriter_init(SegWriter *self, InvIndex *invindex, Segment *segment)
    {
        Schema *schema = InvIndex_Get_Schema(invindex);
        Architecture *arch = Schema_Get_Architecture(schema);
        SegDataWriter_init((SegDataWriter*)self, invindex, segment);
        self->inverter = Inverter_new(InvIndex_Get_Schema(invindex));
        self->writers = Arch_SegDataWriters(arch, invindex, segment);
        return self;
    }

    void
    SegWriter_add_inverted_doc(SegWriter *self, Inverter *inverter,
                               i32_t doc_num)
    {
        u32_t i;
        Seg_Add_Inverted_Doc(self->segment, inverter, doc_num);
        for (i = 0; i < self->writers->size; i++) {
            SegDataWriter *writer = (SegDataWriter*)VA_Fetch(self->writers, i);
            SegDataWriter_Add_Inverted_Doc(writer, inverter, doc_num);
        }
    }

IndexWriter.addDoc() calls SegWriter.addDoc(), which inverts the document and
calls SegWriter.addInvertedDoc(). SegWriter.addInvertedDoc() recurses down
into each of the SegWriter's children -- including, potentially, the
RTreeWriter.
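
To make that call chain concrete, here is a rough, self-contained Java
sketch of the fan-out. Everything in it is a hypothetical stand-in, not the
real KinoSearch/Lucy API: InvertedDoc, PostingsWriter, the demo() helper and
the simplified method signatures are all assumptions made for illustration,
and the "RTreeWriter" just records strings rather than building a tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real KinoSearch/Lucy
// classes, and the signatures are simplified for illustration.
class SegWriterSketch {

    /** Stand-in for an inverted document; carries a single field value. */
    static class InvertedDoc {
        final String location;
        InvertedDoc(String location) { this.location = location; }
    }

    /** One callback per inverted document, as in the C code above. */
    interface SegDataWriter {
        void addInvertedDoc(InvertedDoc doc, int docNum);
    }

    /** Assumed built-in writer; here it only counts documents. */
    static class PostingsWriter implements SegDataWriter {
        int docCount = 0;
        public void addInvertedDoc(InvertedDoc doc, int docNum) {
            docCount++;
        }
    }

    /** Plugin writer: records what a real RTree writer would index. */
    static class RTreeWriter implements SegDataWriter {
        final List<String> entries = new ArrayList<>();
        public void addInvertedDoc(InvertedDoc doc, int docNum) {
            entries.add(docNum + ":" + doc.location);
        }
    }

    /** Composite writer: forwards each inverted doc to every child. */
    static class SegWriter implements SegDataWriter {
        private final List<SegDataWriter> writers;
        SegWriter(List<SegDataWriter> writers) { this.writers = writers; }

        /** "Inverts" the raw doc (trivially here), then fans it out. */
        void addDoc(String location, int docNum) {
            addInvertedDoc(new InvertedDoc(location), docNum);
        }

        public void addInvertedDoc(InvertedDoc doc, int docNum) {
            for (SegDataWriter writer : writers) {
                writer.addInvertedDoc(doc, docNum);
            }
        }
    }

    /** Wires up the children the way an Architecture subclass might. */
    static List<String> demo() {
        RTreeWriter rtree = new RTreeWriter();
        List<SegDataWriter> writers = new ArrayList<>();
        writers.add(new PostingsWriter());
        writers.add(rtree);
        SegWriter segWriter = new SegWriter(writers);
        segWriter.addDoc("40.7,-74.0", 1);
        segWriter.addDoc("51.5,-0.1", 2);
        return rtree.entries;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that SegWriter never needs to know an RTreeWriter
exists; it only sees a list of SegDataWriters, so adding a new index
component means adding one more entry to that list.
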
Marvin Humphrey