Hey Paul, while I was writing the proposal for the project I
considered a CVS/SVN component for the GData server, since such a
system is well suited to this job. I will start with the simplest
approach: storing the entries inside the Lucene index, as Yonik
and Chuck have described. Later in development I will think about
integrating svn as a versioning component. It might be worth a
discussion to split the svn search component off as another contrib
project.

Good stuff. Yonik, you were right!! Are there any Atom / RSS
professionals out there who could give me a hand with some
feed-specific questions?

simon

On 5/28/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
On Sunday 28 May 2006 01:33, Simon Willnauer wrote:
> For those who haven't heard about the GData project, please check
> today's mailing list.
> The Lucene indexer is supposed to be used as the search component of
> this implementation. GData is an extension to the Atom/RSS format
> that adds search and a kind of versioning, and this project is a
> server-side implementation of the protocol. The problem is that the
> incoming feed entries and their updates have to be stored somewhere
> in persistent storage. The easiest approach would be flat-file
> storage, which is not sufficient in my eyes. I thought about using a

Quoting from here:
http://wiki.apache.org/general/SimonWillnauer/SummerOfCode2006

"These methods correspond to the UPDATE, RETRIEVE, CREATE
and DELETE actions of the service observed by the version-component."

In case one adds the "copy" operation, these primitives are precisely what
a program versioning system, e.g. svn, supports.
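
To make that correspondence concrete, here is a rough sketch in Python of a
store offering those four actions plus "copy", where every change is kept as
a new revision. It is only an in-memory illustration: the class name
VersionedStore, its method names, and the sample entry ids are made up for
this example and are not taken from the proposal, from svn, or from Lucene.

```python
class VersionedStore:
    """Toy in-memory store: every change to an entry appends a new revision.

    Hypothetical illustration only; not the GData server, svn, or Lucene API.
    """

    def __init__(self):
        # entry id -> list of revisions; None marks a deleted revision
        self._history = {}

    def exists(self, entry_id):
        revisions = self._history.get(entry_id)
        return bool(revisions) and revisions[-1] is not None

    def create(self, entry_id, content):
        if self.exists(entry_id):
            raise KeyError(f"entry {entry_id!r} already exists")
        self._history.setdefault(entry_id, []).append(content)

    def retrieve(self, entry_id, revision=-1):
        # revision is a list index: 0 is the oldest, -1 the latest
        content = self._history[entry_id][revision]
        if content is None:
            raise KeyError(f"entry {entry_id!r} is deleted at this revision")
        return content

    def update(self, entry_id, content):
        if not self.exists(entry_id):
            raise KeyError(f"entry {entry_id!r} does not exist")
        self._history[entry_id].append(content)

    def delete(self, entry_id):
        if not self.exists(entry_id):
            raise KeyError(f"entry {entry_id!r} does not exist")
        # the old revisions stay available; only the latest state is "gone"
        self._history[entry_id].append(None)

    def copy(self, source_id, target_id):
        # the extra primitive a versioning system adds to the four actions
        self.create(target_id, self.retrieve(source_id))


store = VersionedStore()
store.create("entry-1", "<entry>first draft</entry>")
store.update("entry-1", "<entry>second draft</entry>")
store.copy("entry-1", "entry-2")
store.delete("entry-1")
```

After these calls the copy still holds the second draft, and the first draft
of the deleted entry can still be read back by asking for revision 0. A real
versioning system would store deltas rather than full copies, which is where
the saving on near-duplicate entries would come from.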

So it might be worthwhile to consider svn as the storage base, especially
when many exact or near duplicates are expected from any particular
news feed.
This will also depend on the way you intend to support versioning.

As an added bonus, you might end up with a system that supports
Lucene searching on any svn repository, which would be worthwhile
by itself.
Any interest in text search on the svn repositories at apache.org?

Regards,
Paul Elschot
