Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to best map the forum document model (topics and their messages) to the Lucene document model.
Usually, some forum member creates a new topic with its first message text, then other members add reply messages to that topic. Messages are sometimes updated, but most of the time topics grow incrementally. There's no limit for the number of replies, thousands is nothing unusual. Currently, I see two options for my Lucene data model: A single document type or two document types (one for the topics and one for the messages). When using only a single document type, things are fairly clear but there would obviously be a lot of unneccessary index modifications (their would be one field with all messages concatenated). To reduce the amount of index updates, the separation of topics and messages seems to be the right thing to do. So I'd like to use two document types for my document model, but I do not understand how I could bring these two together when searching. I don't want to list all messages but I want the messages grouped by topic, how can I go about that? The topic documents could be boosted, but perhaps that's not even necessary because of their relativly short length (compared to message documents). Thanks, Melange. -- View this message in context: http://www.nabble.com/Using-Lucene-to-index-a-web-forum-tf2970740.html#a8312744 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]