Hey Paul,

while I was writing the proposal for the project I considered a CVS/SVN component for the GData server, since such a tool is predestined for this kind of job. I will start with the simplest approach: I will store the entries inside the Lucene index, as Yonik and Chuck have described. Later in development I will think about integrating svn as a versioning component. It might be worth a discussion to split off the svn search component as another contrib project.
Good stuff. Yonik, you were right!!

Hey, are there any Atom/RSS professionals out there who could give me a hand with some feed-specific questions?

simon

On 5/28/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
On Sunday 28 May 2006 01:33, Simon Willnauer wrote:
> For those who haven't heard about the GData project please check
> today's mailing list.
> The Lucene Indexer is supposed to be used as the search component of
> this implementation. GData is an extension to the Atom/RSS format
> including search and a kind of versioning. This project is a server
> side implementation of the protocol. So what's the problem? The
> incoming feed entries and their updates have to be stored somewhere in
> a persistent storage. The easiest approach would be a flat file
> storage which is not sufficient in my eyes. I thought about using a

Quoting from here:
http://wiki.apache.org/general/SimonWillnauer/SummerOfCode2006

"These methods correspond to the UPDATE, RETRIEVE, CREATE and DELETE actions of the service observed by the version-component."

If one adds the "copy" operation, these primitives are precisely what a program versioning system, e.g. svn, supports. So it might be worthwhile to consider svn as the storage base, especially when many duplicates or (almost) duplicates are expected from any particular news feed. This will also depend on the way you intend to support versioning.

As an added bonus, you might end up with a system that supports Lucene searching on any svn repository, which would be worthwhile by itself. Any interest in text search on the svn repository(ies) at apache.org?

Regards,
Paul Elschot