Use of Lucene to store data from RSS feeds

appy74 Thu, 14 Oct 2010 07:18:17 -0700

Hello

I would like to store data retrieved hourly from RSS feeds in a database or in 
Lucene so that the text can be easily
indexed for word frequencies.


I need to get the text from the title and description elements of RSS items.

Ideally, for each hourly retrieval from a given feed, I would add a row to a 
table in a dataset made up of the
following columns:

feed_url, title_element_text, description_element_text, polling_date_time

>From this, I can look up any element in a feed and calculate keyword 
>frequencies based upon the length of time required.

This can be done as a database table and hashmaps used to calculate word 
frequencies. But can I do this in Lucene to
this degree of granularity at all? If so, would each feed form a Lucene 
document or would each 'row' from the
database table form one?

Can anyone advise?

Thanks

Martin O'Shea.
-- 



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Use of Lucene to store data from RSS feeds

Reply via email to