Frederick Giasson wrote:
Hi all,
The Web of Linked Data shouldn't be about mass crawling (search engine
style), etc.
It has to be. How would you answer a query like "all offers for a book
written by a German author" without crawling the relevant data sets?
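To make the example query concrete, here is a minimal sketch in plain
Python of why it needs data from more than one data set. The triples,
identifiers, and predicate names (writtenBy, nationality, offers) are
all invented for illustration; they do not come from Amazon, Virtuoso,
or any real vocabulary.

```python
# Hypothetical triples, as if pulled from two separate data sets.
# All identifiers and predicate names below are invented for illustration.
BOOK_DATA = [
    ("book:1", "writtenBy", "author:a"),
    ("book:2", "writtenBy", "author:b"),
    ("author:a", "nationality", "German"),
    ("author:b", "nationality", "French"),
]
OFFER_DATA = [
    ("offer:10", "offers", "book:1"),
    ("offer:11", "offers", "book:2"),
    ("offer:12", "offers", "book:1"),
]


def offers_for_books_by(nationality, triples):
    """Join three patterns: author nationality -> books -> offers."""
    authors = {s for s, p, o in triples
               if p == "nationality" and o == nationality}
    books = {s for s, p, o in triples
             if p == "writtenBy" and o in authors}
    return sorted(s for s, p, o in triples
                  if p == "offers" and o in books)


# The query only has an answer once both data sets are available together.
merged = BOOK_DATA + OFFER_DATA
german_offers = offers_for_books_by("German", merged)
print(german_offers)
```

Run against OFFER_DATA alone, the same function returns an empty list,
because the author information lives in the other data set.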
First question would be: which dataset has this information? Does
Amazon have it, or does it need to be linked to other people's datasets
where you can find such information? (Which brings up the whole question
of disambiguation of entities, etc.)
Fred,
Disambiguation is handled in our case via the faceted search engine
component of Virtuoso.
In any case, there are multiple ways to end up with more or less the
same result. Tell me if I am right, but I think that the current set
of related cartridges only gets data from a book URL? So, it is just
converting data about a particular book, for a given URL, using some
API (Amazon in this case).
Even if you start with an Amazon URL you will not only have pathways to
an Amazon data space hosted graph. You will have interesting pathways to
O'Reilly, eBay, and many other places. Naturally, we also have the LOD
Cloud Cache, Sindice, and other data spaces that play various parts in
the processing pipeline.
What about search URLs, using search APIs from the same services?
Yahoo!, Bing, Google (even), and others are all part of the cocktail of
services for which we've developed Meta Cartridges driven by lookup and
inference rules.
I can certainly think about a cartridge that does just this: searching
for items, and returning the resultsets in RDF using some ontologies.
And then you use the current cartridge to get all the information
about the items you care about in the resultset.
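The two-step idea above (a search cartridge that returns item URLs, then
the per-item cartridge that describes each one) can be sketched roughly
as follows. This is only an illustration under stated assumptions: the
function names, the example.org URLs, and the in-memory CATALOG standing
in for a remote service API are all hypothetical, and none of this is
the actual Sponger cartridge interface.

```python
# Stand-in for a remote catalog; a real cartridge would call a service API.
# Every URL, field, and predicate name here is hypothetical.
CATALOG = {
    "http://example.org/item/1": {"title": "Faust", "price": "9.99"},
    "http://example.org/item/2": {"title": "Faust II", "price": "12.50"},
    "http://example.org/item/3": {"title": "Candide", "price": "7.00"},
}


def search_cartridge(keyword):
    """Stage 1: return URLs of items whose title contains the keyword."""
    return sorted(url for url, item in CATALOG.items()
                  if keyword.lower() in item["title"].lower())


def item_cartridge(url):
    """Stage 2: describe one item URL as (subject, predicate, object) triples."""
    item = CATALOG[url]
    return [(url, "title", item["title"]), (url, "price", item["price"])]


def search_then_describe(keyword):
    """Chain the two stages: search first, then convert each hit."""
    triples = []
    for url in search_cartridge(keyword):
        triples.extend(item_cartridge(url))
    return triples


result = search_then_describe("faust")
for triple in result:
    print(triple)
```

Note that stage 1 can only filter on what the search interface exposes
(here, a keyword match on the title); anything richer has to be done
over the triples that stage 2 returns.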
Yes, of course, and doing lookups against the LOD Cloud Cache and other
sources.
One thing is sure: the expressiveness of your queries is bound by the
expressiveness of the search API you query. So this is not the
answer to all problems.
Yes, query expressiveness is vital, hence my references to SPARQL and
OWL in my prior response. Just add inference rules to that when
thinking about Meta Cartridges.
Basically, we are packing the smart technology behind the proxy/wrapper
URIs.
But one question: is it realistic to think that anyone could query all
Amazon and eBay sites (US, CAN, and all the other countries) to
convert everything? And if it ends up being the case, how could syncing
and maintenance take place?
Exactly! And why should anyone really want to do this? It is possible to
walk the Web in a very smart way, like a Stingray in a sense, and the
Sponger combined with Virtuoso engine innards allows us to see the Web
as a Federation of Data Spaces.
And even if some glutton of a service pulled this off, what about the
Context Halo which encompasses all data access and integration
endeavors, including "change sensitivity"? Example (based on Locale
variety): how do you deal with the following within the context of a
query, when the person seeks: all Books by a German Author, who is
actually German, and has a preference for books associated with specific
Subject Matter, with price preference in local currency?
It really depends on the use cases, but there is much that can be done
by leveraging all APIs in systems such as the Virtuoso Sponger. I
think that what you are talking about here will only happen when these
services want it to happen.
In our case, what I describe is something we do want to happen re.
Sponger based Linked Data graphs :-)
Kingsley
Thanks,
Take care,
Fred
--
Regards,
Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com