Yves Raimond wrote:
Hello!

I know this issue has been raised during the LOD BOF at WWW 2009, but
I don't know if any possible solutions emerged from there.

The problem we are facing is that data on BBC Programmes changes
approximately 50 000 times a day (new/updated
broadcasts/versions/programmes/segments etc.). As we'd like to keep a
set of RDF crawlers up-to-date with our information we were wondering
how best to ping these. pingthesemanticweb seems like a nice option,
but it needs the crawlers to ping it often enough to make sure they
didn't miss a change.

What's wrong with that? :-)

If PTSW works, then consumers should just ping it at a frequency set by their own change-sensitivity thresholds.

Another solution we were thinking of would be to
stick either Talis changesets [1] or SPARQL/Update statements in a
message queue, which would then be consumed by the crawlers.
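
As a rough illustration of the queue idea, here is a minimal Python sketch in which a publisher puts changesets on a queue and a crawler drains it into its local copy of the data. Everything in it is an assumption made for illustration: the dict layout (a subject of change plus triples to remove and to add) is only loosely modelled on the changeset idea in [1], the in-process queue.Queue stands in for a real message broker, and the "ex:" identifiers are made up.

```python
import queue


def make_changeset(subject, removals=(), additions=()):
    """Build a hypothetical changeset: one subject, triples to remove and to add."""
    return {
        "subject": subject,
        "removals": list(removals),
        "additions": list(additions),
    }


def apply_changeset(store, changeset):
    """Apply removals first, then additions, to a set of (s, p, o) triples."""
    for triple in changeset["removals"]:
        store.discard(triple)
    for triple in changeset["additions"]:
        store.add(triple)


def drain(changes, store):
    """Apply every queued changeset in arrival order; return how many were applied."""
    applied = 0
    while True:
        try:
            changeset = changes.get_nowait()
        except queue.Empty:
            return applied
        apply_changeset(store, changeset)
        applied += 1


# Publisher side: a title update expressed as one removal and one addition.
changes = queue.Queue()
changes.put(make_changeset(
    "ex:episode1",
    removals=[("ex:episode1", "dc:title", "Old title")],
    additions=[("ex:episode1", "dc:title", "New title")],
))

# Crawler side: a local copy of the data, brought up to date from the queue.
store = {("ex:episode1", "dc:title", "Old title")}
applied_count = drain(changes, store)
```

Because each message carries the delta itself, a crawler that consumes the queue in order does not have to re-fetch whole documents to find out what changed.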

An additional option is for the HTML information resources to be crawled as usual, with RDF-aware crawlers using RDF discovery patterns to locate RDF information resource representations via <link/>.
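
To make the <link/> discovery pattern concrete, here is a minimal Python sketch of what a crawler could do with a fetched HTML page. It assumes the page advertises its RDF representation with rel="alternate" and type="application/rdf+xml", which is one common convention rather than the only one; the class name, function name and example.org URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RDFLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise an RDF/XML alternate."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <link ... /> via handle_startendtag.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        media_type = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rel_tokens and media_type == "application/rdf+xml" and href:
            # Resolve relative hrefs against the page URL.
            self.rdf_urls.append(urljoin(self.base_url, href))


def find_rdf_links(html, base_url):
    """Return absolute URLs of the RDF representations advertised in an HTML page."""
    finder = RDFLinkFinder(base_url)
    finder.feed(html)
    return finder.rdf_urls


page = """<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="application/rdf+xml" href="/programmes/p1.rdf" />
</head><body></body></html>"""

rdf_links = find_rdf_links(page, "http://example.org/programmes/p1")
```

A crawler would then fetch each discovered URL and refresh its RDF data from it, on whatever schedule it already uses for the HTML pages.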


Kingsley

Has anyone tried to tackle this problem already?

Cheers!
y


[1] http://n2.talis.com/wiki/Changeset




--


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO OpenLink Software Web: http://www.openlinksw.com




