Nicolas Lalevée-2 wrote:
> 
> On Saturday 13 January 2007 10:49, Melange wrote:
>> Hello, I'd like to index a web forum (phpBB) with Lucene. I wonder how to
>> best map the forum document model (topics and their messages) to the
>> Lucene document model.
>>
>> Usually, a forum member creates a new topic with its first message text,
>> then other members add reply messages to that topic. Messages are
>> sometimes updated, but most of the time topics grow incrementally. There's
>> no limit on the number of replies; thousands are not unusual.
>>
>> Currently, I see two options for my Lucene data model: a single document
>> type, or two document types (one for the topics and one for the messages).
>> When using only a single document type, things are fairly clear, but there
>> would obviously be a lot of unnecessary index modifications (there would
>> be one field with all messages concatenated). To reduce the number of
>> index updates, separating topics and messages seems to be the right thing
>> to do.
>>
>> So I'd like to use two document types for my document model, but I do not
>> understand how I could bring these two together when searching. I don't
>> want to list all messages; I want the messages grouped by topic. How can
>> I go about that?
>>
>> The topic documents could be boosted, but perhaps that's not even
>> necessary because of their relatively short length (compared to message
>> documents).
> 
> Hi Melange,
> 
> The two-document-type design is only useful if you want to search for
> topics and search for messages separately. Here you want to search for
> messages grouped by topic, so you should have only one kind of document:
> message documents. In each message document, you store the topic's id, so
> you will be able to group by topic. To group search results by topic, you
> might be interested in Solr's [1] faceted search [2].
> 
> cheers,
> Nicolas
> 
> [1] http://incubator.apache.org/solr/
> [2] http://wiki.apache.org/solr/SimpleFacetParameters
> 

Thank you Nicolas, using only message documents is a good idea. I'll do that
instead.

Sorry, I couldn't really find anything at the Solr links you provided
regarding the grouping of search results (hits). Will I have to load all the
hits into RAM in order to perform the grouping myself or is there a way to
have Lucene do that for me? Or how is this to be done, roughly?
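
To make the in-RAM variant concrete, here is a minimal sketch of what grouping
the hits myself might look like. It is plain Java and does not use any Lucene
API: the Hit class and the groupByTopic method are hypothetical names, and it
assumes that each hit exposes the topic id, the message id and the score, and
that the hit list is already sorted by descending score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopicGrouping {

    /** Hypothetical value object: one search hit, i.e. one message document. */
    static final class Hit {
        final int topicId;
        final int messageId;
        final float score;

        Hit(int topicId, int messageId, float score) {
            this.topicId = topicId;
            this.messageId = messageId;
            this.score = score;
        }
    }

    /**
     * Groups hits by topic id. The input is assumed to be sorted by
     * descending score. LinkedHashMap keeps the order in which topics are
     * first seen, so topics come out ordered by their best-scoring message.
     */
    static Map<Integer, List<Hit>> groupByTopic(List<Hit> rankedHits) {
        Map<Integer, List<Hit>> byTopic = new LinkedHashMap<>();
        for (Hit hit : rankedHits) {
            byTopic.computeIfAbsent(hit.topicId, id -> new ArrayList<>()).add(hit);
        }
        return byTopic;
    }

    public static void main(String[] args) {
        // Made-up sample data: three message hits belonging to two topics.
        List<Hit> hits = Arrays.asList(
                new Hit(7, 101, 2.5f),
                new Hit(3, 55, 1.9f),
                new Hit(7, 140, 1.2f));

        for (Map.Entry<Integer, List<Hit>> e : groupByTopic(hits).entrySet()) {
            System.out.println("topic " + e.getKey() + ": "
                    + e.getValue().size() + " matching message(s)");
        }
    }
}
```

This only holds three small values per hit rather than whole documents, but it
still walks every hit once, which is the part I am unsure about for topics
with thousands of replies.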

Thanks,
Christian.
-- 
View this message in context: 
http://www.nabble.com/Using-Lucene-to-index-a-web-forum-tf2970740.html#a8315049
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

