Hi Jakob,

Yes, sorry, I meant "subject" (not object).

I guess we are not talking about the web of data in general, but about Marmotta's LDClient? Because in my understanding of the web of data, anyone can say anything about anything, isn't that correct?
For instance, a specific product could be sold by different vendors, and each vendor publishing a catalog in RDF will provide a price for that product. So each vendor will publish data with the product's URI as subject. That seems a very simple and realistic case, doesn't it?

In the example you give:
"What I interpret from your message below is that you would like to include triples like "dbpedia:Europe a dbpedia:Continent" retrieved from a URL like http://example.com/foo - is this correct?"
Yes about "data", no about "data publishing".
What I mean is that people dealing with tourism could have a class "TouristProvenance", and it seems totally normal to me if they publish data saying "dbpedia:Europe a touristOnto:TouristProvenance". Isn't that possible in their data?
The difference is then in how that data is published: of course they would not serve that triple as "linked data" when dereferencing "http://example.com/foo", but they could provide an ontology in an RDF file, or a SPARQL endpoint containing such triples.

If I am wrong, thank you for the pointers; maybe I missed something and should correct my way of thinking.

Then can you help me better understand what Marmotta's LDClient is, or should be?
On that page [1], it is said that "LDClient is a flexible and modular Linked Data Client (RDFizer ( http://www.w3.org/wiki/ConverterToRdf) )"
There is already something unclear to me in that sentence: RDFizing is, to me, the process of transforming non-RDF data to RDF. But if I understand it well, LDClient is already able to import native RDF, for instance RDFa and Linked Data, and also to query a SPARQL endpoint.
Is LDClient designed to deal only with data published at its own URL, where all triples have that URL as subject? If so, what happens when LDClient is used as an RDFizer on non-RDF data? Maybe I should have a look at the RDFa client and see how data is processed there.

But here is what interests us in LDClient:
- import RDF and non-RDF data into the triple store (even if it is an RDF file whose subjects don't correspond to the file's URL)
- import first into a temporary location, in order to import only part of the data and to validate the data. It seems that LDClient handles this natively, and this feature is very interesting for us.

About dealing with data updates, I understand that in your use of LDCache/LDClient, ensuring that triples with a specific subject come from one data source is a way to know which triples to update when refreshing the data. In our case, we use 'contexts' (named graphs) for that.
Talking about this, I do have another question: is it a problem for Marmotta/Kiwi to deal with a large number of contexts? I know it is not a problem with other triple stores such as OWLIM.

Thank you
Fabian

[1] http://marmotta.apache.org/ldclient/
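
To make the 'contexts' approach mentioned above concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: `ContextStore`, `refresh` and `about` are hypothetical names, not Marmotta or KiWi APIs, and triples are simple tuples. Each data source gets its own context, so refreshing a source replaces only that source's triples, even when several sources describe the same subject.

```python
from collections import defaultdict


class ContextStore:
    """Toy quad store (hypothetical): triples grouped by context (named graph)."""

    def __init__(self):
        # Maps a context URI to the set of (subject, predicate, object) tuples
        # that were imported from that source.
        self._graphs = defaultdict(set)

    def refresh(self, context, triples):
        """Replace everything previously imported into one context."""
        self._graphs[context] = set(triples)

    def about(self, subject):
        """Return all triples with the given subject, across every context."""
        return {t for graph in self._graphs.values() for t in graph if t[0] == subject}


europe = "http://dbpedia.org/resource/Europe"
store = ContextStore()

# Two different sources both make statements about dbpedia:Europe.
store.refresh("http://dbpedia.org/", [(europe, "rdf:type", "dbpedia:Continent")])
store.refresh(
    "http://example.com/tourism.rdf",
    [(europe, "rdf:type", "touristOnto:TouristProvenance")],
)

# Re-importing the tourism file replaces only the triples from that context;
# the triple that came from the other source is left untouched.
store.refresh("http://example.com/tourism.rdf", [(europe, "rdfs:label", "Europe")])
```

The design choice is that provenance is tracked by where a triple came from rather than by what its subject is, which is why the subject check is not needed to know what to update.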
>>> On 04.11.2014 at 09:45, Jakob Frank <[email protected]> wrote in message <[email protected]>:
Hi Fabian,

are you sure you're not mixing up subject and object in your message?

Because LDClient will de-reference, e.g., http://dbpedia.org/resource/Europe and add all triples with dbpedia:Europe as *subject* to the repository. Any other URI, e.g. http://example.com/foo, will be dereferenced and a triple like "<http://example.com/foo> dct:about dbpedia:Europe" will be added to the repository.

What I interpret from your message below is that you would like to include triples like "dbpedia:Europe a dbpedia:Continent" retrieved from a URL like http://example.com/foo - is this correct?

This introduces a big problem: provenance. How do you guarantee that the data from http://example.com/foo about dbpedia:Europe is correct? That's why triples with a different subject are ignored in LDClient.

Best,
Jakob

On 2014-11-04 09:05, Fabian Cretton wrote:
> Hello Sergio,
>
> In this current discussion, shouldn't we distinguish between
> the linked data principles [1] (and thus the RDF graph), and how data
> are published (RDF file, linked data with content negotiation, SPARQL
> endpoint, RDFa, etc.)?
>
> About linked data principles, tell me if I am wrong, but here is what I
> understand: the goal of the first point "Use URIs as names for things"
> is to have international keys to identify things, and thus avoid data
> silos as in relational databases. The second point "Use HTTP URIs so
> that people can look up those names." says that the URIs should be
> accessible through HTTP (e.g. URL), and so they can be dereferenced in
> order to get SOME data about that thing (point 3 - "When someone looks
> up a URI, provide useful information, using the standards (RDF*,
> SPARQL)"). Then, this data can link to other data as stated in point 4
> "Include links to other URIs. so that they can discover more things."
>
> But do the linked data principles say that triples with a specific
> object should only be served (data publishing) on that specific URI? It
> is not my understanding so far, and that's why I wrote "SOME"
> information above.
> For instance, anyone could write triples about
> <http://dbpedia.org/resource/Europe>, in any given domain (art,
> politics, etc.), using any available ontology, no?
> So triples with <http://dbpedia.org/resource/Europe> as object could
> come from any source other than dereferencing the
> "http://dbpedia.org/resource/Europe" URL.
> And as an example, this file
> "http://www.w3.org/People/Berners-Lee/card.rdf" does contain triples
> with different resources as objects.
>
> Putting this back in the overLOD context: its goal is to provide tools
> to build an application based on distributed data, here using the Web
> of Data technologies. Different data providers provide data in
> different forms (data publishing). It could be RDF files, SPARQL
> endpoints, or even data that needs to be RDFized (microdata for
> instance).
> Then overLOD allows us to reference those data, import them (entirely
> or partly, for instance we usually don't need all languages of the
> labels provided by a geoname feature), and control them (as data could
> be wrong, and inferencing is not an easy way to control data). Then
> data is at the disposal of apps built on that instance of overLOD (i.e.
> with the decisions we took, it is an instance of Marmotta).
>
> And thus, overLOD does bring something different from LDCache: a way to
> better "control" which data is in the store and how it is updated,
> which seems to me mandatory when building a real app.
>
> We won't have time in the overLOD project to build a fully functional
> tool, but the basics will be there.
>
> I am not sure this discussion is of any interest for you, but thanks
> for your thoughts
> Fabian
>
>
> Hi,
>
> On 01/11/14 13:14, Fabian Cretton wrote:
>>>> Then, I did implement LDClients that can import RDF files (instead of
>>>> using the import service). They are just like the "linked data" code,
>>>> except I don't check if the subject of the triple corresponds to the
>>>> URI.
>>
>> Of course we don't expect that the code we write for OverLOD will be
>> appreciated by the Marmotta Team,
>> but we will just let people know it is there if needed :-)
>>
>> But actually I don't understand your point here about RDF files moving
>> away from Linked Data paradigm.
>> Do you mean that Youtube, Vimeo, RDFa and SPARQL endpoints, which all
>> have LDClients, follow linked data paradigm more than
>> http://sws.geonames.org/2921044/about.rdf
>
> No no, I'm not saying that. Let me try to explain it:
>
> If we take the Linked Data principles [1], we could say that LDClient
> extends the 3rd point ("when someone looks up a URI, provide useful
> information") beyond just "using the standards (RDF*, SPARQL)" by
> providing new methods to get RDF data out of other formats.
>
> But LDClient does not modify the 1st principle ("use URIs as names for
> things"). And that's what I referred to because of the sentence "They
> are just like the "linked data" code, except I don't check if the
> subject of the triple corresponds to the URI".
>
> Maybe I got it wrong, and what you actually do is extend the 4th
> principle ("Include links to other URIs. so that they can discover more
> things"), which is of course interesting. It just needed to be
> explained.
>
> BTW, I hope you have in mind that if OverLOD produces new LDClient data
> providers that can be useful for a broader community, please propose
> them to be included in the main project.
>
> Cheers,
>
> [1] http://www.w3.org/DesignIssues/LinkedData.html
>
> P.S.: please, configure your client to use the "Re:" prefix when
> replying to public English mailing lists
>
> --
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: [email protected]
> w: http://redlink.co
>
> [1] http://www.w3.org/DesignIssues/LinkedData.html
>

--
DI Jakob Frank
Knowledge and Media Technologies
Salzburg Research Forschungsgesellschaft mbH
Jakob Haringer-Strasse 5/3 | 5020 Salzburg, Austria
T: +43.662.2288-419 | F: +43.662.2288-222
[email protected]
http://www.salzburgresearch.at
http://at.linkedin.com/in/jakobfrank
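
The behaviour Jakob describes in this thread (keep only the triples whose subject is the dereferenced URI, ignore the rest) can be sketched in a few lines of plain Python. This is not Marmotta's code: `keep_own_subject` is a hypothetical helper, triples are simple tuples, and the fetched data is made up for illustration.

```python
def keep_own_subject(resource_uri, triples):
    """Keep only triples whose subject is the URI that was dereferenced.

    Hypothetical helper illustrating the rule described in the thread;
    it is not part of LDClient.
    """
    return [t for t in triples if t[0] == resource_uri]


foo = "http://example.com/foo"
europe = "http://dbpedia.org/resource/Europe"

# Assumed content returned when dereferencing http://example.com/foo.
fetched = [
    (foo, "dct:about", europe),                 # subject is the fetched URI
    (europe, "rdf:type", "dbpedia:Continent"),  # statement about another resource
]

# Only the first triple survives; the statement about dbpedia:Europe is
# dropped because it was not served from dbpedia:Europe's own URI.
kept = keep_own_subject(foo, fetched)
```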
