Possibly relevant:
http://www.ietf.org/rfc/rfc5005.txt
Feed paging and archiving for Atom feeds. Paging is a nice solution to
the "small window" problem with syndication feeds. The concept might
be translatable to RSS 1.0.
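
To make the paging idea concrete, here is a minimal sketch of how a crawler might walk back through an archived feed. It relies on the "prev-archive" link relation defined in RFC 5005; the function names, the injected `fetch` callable, and the assumption that entries are listed newest first are mine, for illustration only.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def link_href(feed_xml, rel):
    """Return the href of the first feed-level atom:link with this rel, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def entry_ids(feed_xml):
    """Return the atom:id of every entry, in document order."""
    root = ET.fromstring(feed_xml)
    return [entry.findtext(ATOM + "id") for entry in root.findall(ATOM + "entry")]


def collect_until(fetch, start_url, last_seen_id):
    """Follow prev-archive links until last_seen_id turns up.

    fetch is any callable mapping a URL to feed XML (hypothetical; a real
    crawler would do an HTTP GET here). Assumes entries are newest first.
    Returns the ids of entries not yet seen, newest first.
    """
    unseen = []
    url = start_url
    while url is not None:
        feed_xml = fetch(url)
        for entry_id in entry_ids(feed_xml):
            if entry_id == last_seen_id:
                return unseen
            unseen.append(entry_id)
        # No prev-archive link means we have reached the oldest document.
        url = link_href(feed_xml, "prev-archive")
    return unseen
```

A crawler that remembers the last entry id it processed can then catch up after a burst, however small the current feed document is.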
Although I have to say that I find the idea of pushing RDF updates via
Atom quite appealing.
Richard
On 28 Apr 2009, at 17:01, Yves Raimond wrote:
Hello!
I think the two main options are either to publish a feed containing
pointers to changes, or using a messaging system to push out
notifications.
Despite the recent discussion around benefits of, say, Jabber or other
mechanisms for pushing out notifications, I think that a more RESTful
approach using RSS or Atom feeds might be nicer. Then we can focus on
the resource design, i.e. what kinds of changes do we need to publish.
So for example, for /programmes it may be sufficient to publish a set
of feeds for new items, e.g. brands, episodes, versions, etc. These
could be RSS 1.0 and then include additional RDF data as appropriate.
My only concern about this is that you need to limit the number of
items in the feed. If you have a sudden burst of activity and the
crawlers just ping the feed at regular intervals, they may miss some
updates. With a feed capped at 100 items, our current 50,000 or so
changes a day would on average need the crawlers to ping the feed
about every three minutes, and 1M updates a day would mean about every
nine seconds. So the cap and the polling rate need to be chosen
together.
(Just noticed that Soren's proposal includes pagination of feeds,
which might solve that problem).
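
A rough back-of-envelope check of how often a crawler has to poll a capped feed. This is only a sketch: it assumes updates are spread evenly over the day (bursts make things worse), and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_poll_interval_seconds(updates_per_day, feed_cap):
    """Longest polling interval, in seconds, that never lets an item
    scroll off a feed of feed_cap items before it has been seen.

    Assumes a perfectly even update rate, which real traffic will not have.
    """
    if updates_per_day <= 0 or feed_cap <= 0:
        raise ValueError("updates_per_day and feed_cap must be positive")
    return SECONDS_PER_DAY * feed_cap / updates_per_day


for updates in (50_000, 1_000_000):
    interval = max_poll_interval_seconds(updates, 100)
    print(f"{updates} updates/day, cap 100: poll every {interval:.2f} s")
```

With a 100-item cap this gives 172.8 seconds at 50,000 updates a day and 8.64 seconds at 1M updates a day.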
So yes, I guess it could be done, using RDF feeds e.g.
http://www.bbc.co.uk/programmes/updates/2009/04/28/brands.rdf etc.
We'd need to carefully think about the feeds we offer though.
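
As a sketch of what such a URI scheme could look like, the helper below builds per-day, per-type feed URLs following the pattern of the example above. Only the brands URL appears in this thread; the other types and the helper itself are hypothetical, and no such feeds are known to exist.

```python
from datetime import date

# Base path taken from the example URL in this thread.
BASE = "http://www.bbc.co.uk/programmes/updates"

# Resource types mentioned in the thread; an illustrative, not official, list.
KINDS = ("brands", "episodes", "versions")


def daily_feed_url(day, kind):
    """Build the (hypothetical) URL of the update feed for one day and type."""
    if kind not in KINDS:
        raise ValueError(f"unknown feed type: {kind!r}")
    return f"{BASE}/{day:%Y/%m/%d}/{kind}.rdf"


print(daily_feed_url(date(2009, 4, 28), "brands"))
```

A crawler that only cares about brands would then fetch one URL per day and ignore the rest.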
Cheers!
y
This has the added advantage that a crawler that only wanted to
collect certain information, e.g. about brands, could monitor just the
resource(s) it was interested in. Similarly with careful resource
design, the timing of updates could also be under the control of the
crawler, e.g. new versions in last 12 hours, 24 hours, 7 days
(avoiding a massive firehose of updates). This could be easily done
with URIs and avoids having to build that into the messaging system.
Interested to know what you think.
Cheers,
L.
2009/4/28 Yves Raimond <[email protected]>
Hello!
I know this issue has been raised during the LOD BOF at WWW 2009, but
I don't know if any possible solutions emerged from there.
The problem we are facing is that data on BBC Programmes changes
approximately 50,000 times a day (new/updated
broadcasts/versions/programmes/segments etc.). As we'd like to keep a
set of RDF crawlers up-to-date with our information, we were wondering
how best to ping these. pingthesemanticweb seems like a nice option,
but it needs the crawlers to ping it often enough to make sure they
don't miss a change. Another solution we were thinking of would be to
stick either Talis changesets [1] or SPARQL/Update statements in a
message queue, which would then be consumed by the crawlers.
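
To illustrate the message-queue option, here is a very simplified sketch of a crawler draining a queue of changes and applying them to a local copy of the data. The dict with "removals" and "additions" only mimics the general shape of a changeset; it is not the Talis Changeset vocabulary or its serialisation, and the in-process queue stands in for whatever messaging system would really be used.

```python
from queue import Empty, Queue


def apply_changeset(store, changeset):
    """Apply one change to a set of (subject, predicate, object) triples.

    Removals are applied before additions, so a changed value can be
    expressed as "remove old triple, add new triple".
    """
    store.difference_update(changeset["removals"])
    store.update(changeset["additions"])


def drain(queue, store):
    """Apply every changeset currently waiting in the queue, in order.

    Returns the number of changesets applied.
    """
    applied = 0
    while True:
        try:
            changeset = queue.get_nowait()
        except Empty:
            return applied
        apply_changeset(store, changeset)
        applied += 1
```

The crawler never has to poll for what changed; it only needs to keep up with the queue.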
Has anyone tried to tackle this problem already?
Cheers!
y
[1] http://n2.talis.com/wiki/Changeset
--
Leigh Dodds
Programme Manager, Talis Platform
Talis
[email protected]
http://www.talis.com