Re: Lucene Gdata -- the best way to store the feeds / entries

Paul Elschot Sun, 28 May 2006 02:01:20 -0700

On Sunday 28 May 2006 01:33, Simon Willnauer wrote:
> For those who haven't heard about the GData project please check
> today's mailing list.
> The Lucene Indexer is supposed to be used as the search component of
> this implementation. GData is an extension to the Atom/RSS format
> that adds search and a kind of versioning. This project is a server-side
> implementation of the protocol. So what's the problem? The
> incoming feed entries and their updates have to be stored somewhere in
> a persistent storage. The easiest approach would be a flat file
> storage which is not sufficient in my eyes. I thought about using a


Quoting from here:
http://wiki.apache.org/general/SimonWillnauer/SummerOfCode2006

"These methods correspond to the UPDATE, RETRIEVE, CREATE
and DELETE actions of the service observed by the version-component."

If one adds a "copy" operation, these primitives are precisely what
a version control system such as svn supports.

So it might be worthwhile to consider svn as the storage backend, especially
when many duplicates or near-duplicates are expected from any
particular news feed.
Whether this fits will also depend on how you intend to support versioning.
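
To make the correspondence concrete, here is a minimal sketch of a versioned
store exposing the four actions plus "copy". It is a toy in-memory model, not
svn and not the GData storage API: the class name VersionedEntryStore, its
method names, and the content-hash deduplication are all assumptions made for
illustration. Every change creates a new global revision, much like a commit,
and identical content is stored only once.

```python
import hashlib


class VersionedEntryStore:
    """Hypothetical toy store: every change creates a new global revision."""

    def __init__(self):
        self.revision = 0   # global revision counter, like a repository
        self.blobs = {}     # content hash -> content, shared by duplicates
        self.history = {}   # entry id -> list of (revision, hash or None)

    def _commit(self, entry_id, content):
        # Record a new revision; None marks the entry as deleted.
        digest = None
        if content is not None:
            digest = hashlib.sha1(content.encode("utf-8")).hexdigest()
            self.blobs.setdefault(digest, content)
        self.revision += 1
        self.history.setdefault(entry_id, []).append((self.revision, digest))
        return self.revision

    def retrieve(self, entry_id, revision=None):
        # Content as of `revision` (latest if None); None if absent or deleted.
        digest = None
        for rev, stored in self.history.get(entry_id, []):
            if revision is not None and rev > revision:
                break
            digest = stored
        return self.blobs[digest] if digest is not None else None

    def create(self, entry_id, content):
        if self.retrieve(entry_id) is not None:
            raise KeyError(f"entry already exists: {entry_id}")
        return self._commit(entry_id, content)

    def update(self, entry_id, content):
        if self.retrieve(entry_id) is None:
            raise KeyError(f"no such entry: {entry_id}")
        return self._commit(entry_id, content)

    def delete(self, entry_id):
        if self.retrieve(entry_id) is None:
            raise KeyError(f"no such entry: {entry_id}")
        return self._commit(entry_id, None)

    def copy(self, source_id, target_id):
        content = self.retrieve(source_id)
        if content is None:
            raise KeyError(f"no such entry: {source_id}")
        # The copy reuses the existing blob, so no content is duplicated.
        return self.create(target_id, content)
```

In this model a deleted entry is still retrievable at an older revision, and
a copy or a repeated feed entry costs one history record rather than a second
copy of the text, which is the property that makes a versioned backend
attractive when a feed produces many duplicates.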

As an added bonus, you might end up with a system that supports
Lucene searching on any svn repository, which would be worthwhile
by itself.
Is there any interest in text search on the svn repositories at apache.org?

Regards,
Paul Elschot
