Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to
best map the forum document model (topics and their messages) to the Lucene
document model.

Usually, some forum member creates a new topic with its first message text,
then other members add reply messages to that topic. Messages are sometimes
updated, but most of the time topics grow incrementally. There's no limit
for the number of replies, thousands is nothing unusual.

Currently, I see two options for my Lucene data model: A single document
type or two document types (one for the topics and one for the messages).
When using only a single document type, things are fairly clear but there
would obviously be a lot of unneccessary index modifications (their would be
one field with all messages concatenated). To reduce the amount of index
updates, the separation of topics and messages seems to be the right thing
to do.

So I'd like to use two document types for my document model, but I do not
understand how I could bring these two together when searching. I don't want
to list all messages but I want the messages grouped by topic, how can I go
about that?

The topic documents could be boosted, but perhaps that's not even necessary
because of their relativly short length (compared to message documents).

Thanks,
Melange.
-- 
View this message in context: 
http://www.nabble.com/Using-Lucene-to-index-a-web-forum-tf2970740.html#a8312744
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to