Re: Re: Re : Re: LDClient

Fabian Cretton Tue, 04 Nov 2014 02:51:13 -0800

Hi Jakob,
 
Yes, sorry, I was talking about "subject" (not object).
 
I guess we are not discussing the web of data in general, but
Marmotta's LDClient specifically?
 
In my understanding of the web of data, anyone can say anything about
anything. Isn't that correct?
For instance, a specific product could be sold by different vendors, and
each vendor, publishing a catalog in RDF, will provide a price for that
product. So, referring to that product's URI, each vendor will publish
data with the product's URI as subject. That seems a very simple and
realistic case, doesn't it?
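
The vendor case can be sketched as follows. This is only an illustration:
the URIs, the price values and the function name are made up, and triples
are modelled as plain Python tuples rather than with any RDF library.

```python
# Hypothetical URIs and prices, for illustration only.
PRODUCT = "http://example.org/product/42"
PRICE = "http://example.org/vocab/price"

# Each vendor's catalog is a set of (subject, predicate, object) triples.
# Both catalogs use the same product URI as subject.
catalogs = {
    "http://vendor-a.example/catalog.rdf": {(PRODUCT, PRICE, "19.90")},
    "http://vendor-b.example/catalog.rdf": {(PRODUCT, PRICE, "21.50")},
}


def statements_about(subject, catalogs):
    """Collect (source, predicate, object) for every triple about `subject`."""
    return sorted(
        (source, p, o)
        for source, triples in catalogs.items()
        for (s, p, o) in triples
        if s == subject
    )


offers = statements_about(PRODUCT, catalogs)
for source, predicate, value in offers:
    print(source, predicate, value)
```

Neither catalog lives at the product's own URL, yet both legitimately
describe the product.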
 
You wrote: "What I interpret from your message below is that you would
like to include triples like "dbpedia:Europe a dbpedia:Continent"
retrieved from a URL like http://example.com/foo - is this correct?"
Yes as far as the data is concerned, no as far as data publishing is
concerned. What I mean is that people working in tourism could define a
class "TouristProvenance", and it seems perfectly normal to me for them
to publish data saying "dbpedia:Europe a touristOnto:TouristProvenance".
Isn't that possible in their data?
The difference is then in how that data is published: of course they
would not publish that triple as "linked data" when dereferencing
"http://example.com/foo", but they could provide an ontology in an RDF
file, or a SPARQL end-point containing such triples.
 
If I am wrong, thank you for the pointers, maybe I missed something and
should correct my way of thinking.
 
Then can you help me better understand what Marmotta's LDClient is, or
should be?
That page [1] says that "LDClient is a flexible and modular Linked Data
Client (RDFizer (http://www.w3.org/wiki/ConverterToRdf))".
Something is already unclear to me in that sentence: RDFizing is, to me,
the process of transforming non-RDF data to RDF.
But if I understand correctly, LDClient can also import native RDF, for
instance RDFa and Linked Data, and can query a SPARQL end-point.
 
Is LDClient designed to deal only with data published from its own URL,
where all triples have that URL as subject?
If so, what happens when LDClient is used as an RDFizer on non-RDF data?
 
Maybe I should have a look at the RDFa client and also see how data is
processed there.
 
But here is what interests us in LDClient:
- importing RDF and non-RDF data into the triple store (even from an RDF
file whose subjects don't correspond to the file's URL)
- importing first into a temporary location, in order to import only
part of the data and to validate it. It seems that LDClient handles this
natively, and this feature is very interesting for us.
 
About handling data updates, I understand that in your use of
LDCache/LDClient, ensuring that triples with a specific subject come
from one data source is a way to know which triples to update when
refreshing the data. In our case, we use 'contexts' (named graphs) for
that.
On that topic, I have another question: is it a problem for
Marmotta/Kiwi to handle a large number of contexts? I know it is not a
problem with other triple stores, such as OWLIM for instance.
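
To make the context-based update concrete, here is a minimal sketch. It
assumes one context per data source and models the store as a set of
(subject, predicate, object, context) tuples; the function name and the
sample data are hypothetical, and this is not Marmotta's API.

```python
# A quad is (subject, predicate, object, context).
# One context (named graph) per data source; all values are made up.

def refresh_context(store, context, new_triples):
    """Return a new store in which `context` holds exactly `new_triples`."""
    # Keep every quad that belongs to another context.
    kept = {quad for quad in store if quad[3] != context}
    # Re-add the freshly imported triples under the refreshed context.
    fresh = {(s, p, o, context) for (s, p, o) in new_triples}
    return kept | fresh


EUROPE = "http://dbpedia.org/resource/Europe"
TOURISM = "http://example.com/tourism.rdf"

store = {
    (EUROPE, "rdfs:label", "Europe", "http://dbpedia.org/"),
    (EUROPE, "rdf:type", "touristOnto:TouristProvenance", TOURISM),
}

# Refreshing the tourism source replaces only the quads in its context.
store = refresh_context(store, TOURISM, [(EUROPE, "rdfs:label", "Europa")])
```

With this approach, the subject of a triple does not need to match the
source it came from; the context alone tells which triples to replace.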
 
Thank you
Fabian
 
[1] http://marmotta.apache.org/ldclient/


>>> On 04.11.2014 at 09:45, Jakob Frank
<[email protected]> wrote in message
<[email protected]>:

Hi Fabian,

are you sure you're not mixing up subject and object in your message?

Because LDClient will de-reference, e.g.
http://dbpedia.org/resource/Europe and add all triples with
dbpedia:Europe as *subject* to the repository.

Any other URI, e.g. http://example.com/foo will be dereferenced and a
triple like "<http://example.com/foo> dct:about dbpedia:Europe" will be
added to the repository.


What I interpret from your message below is that you would like to
include triples like "dbpedia:Europe a dbpedia:Continent" retrieved from
a URL like http://example.com/foo - is this correct?

This introduces a big problem: provenance. How do you guarantee that the
data from http://example.com/foo about dbpedia:Europe is correct? That's
why triples with a different subject are ignored in LDClient.
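
The rule described above can be sketched as follows. This is only an
illustration with made-up data and a hypothetical function name, not
LDClient's actual code: after dereferencing a URI, only the triples whose
subject is that URI are kept.

```python
def keep_triples_about(resource_uri, triples):
    """Keep only the triples whose subject is the dereferenced URI."""
    return [(s, p, o) for (s, p, o) in triples if s == resource_uri]


FOO = "http://example.com/foo"
EUROPE = "http://dbpedia.org/resource/Europe"

# Triples assumed to be returned when dereferencing FOO.
fetched = [
    (FOO, "dct:about", EUROPE),
    (EUROPE, "rdf:type", "dbpedia:Continent"),
]

# The statement about dbpedia:Europe is dropped: its subject is not FOO.
kept = keep_triples_about(FOO, fetched)
```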


Best,
Jakob

On 2014-11-04 09:05, Fabian Cretton wrote:
> Hello Sergio,
>  
> In this current discussion, shouldn't we make a distinction between
> the linked data principles [1] (and thus the RDF graph), and how data
> are published (RDF file, linked data with content negotiation, SPARQL
> end-point, RDFa, etc.)?
>  
> About linked data principles, tell me if I am wrong, but here is what I
> understand: the goal of the first point "Use URIs as names for things"
> is to have international keys to identify things, and thus avoid data
> silos as in relational databases. The second point "Use HTTP URIs so
> that people can look up those names." says that the URIs should be
> accessible through HTTP (e.g. a URL), and so they can be dereferenced in
> order to get SOME data about that thing (point 3 - "When someone looks
> up a URI, provide useful information, using the standards (RDF*,
> SPARQL)"). Then, this data can link to other data as stated in point 4
> "Include links to other URIs. so that they can discover more things."
>  
> But do the linked data principles say that triples with a specific
> object should only be served (data publishing) on that specific URI? It
> is not my understanding so far, and that's why I wrote "SOME"
> information above.
> For instance, anyone could write triples about
> <http://dbpedia.org/resource/Europe>, in any given domain (art,
> politics, etc.), using any available ontology, no?
> So triples with <http://dbpedia.org/resource/Europe> as object could
> come from any source other than dereferencing the
> "http://dbpedia.org/resource/Europe" URL.
> And as an example, this file
> "http://www.w3.org/People/Berners-Lee/card.rdf" does contain triples
> with different resources as objects.
>  
> Placing this in the overLOD context: its goal is to provide tools to
> build an application based on distributed data, here using the Web of
> Data technologies. Different data providers provide data in different
> forms (data publishing). It could be RDF files, SPARQL end-points, or
> even data that needs to be RDFized (microdata for instance).
> Then overLOD allows us to reference those data, import them (entirely
> or partly, for instance we usually don't need all languages of the
> labels provided by a GeoNames feature), and control them (as data could
> be wrong, and inferencing is not an easy way to control data). The data
> is then available to apps built on that instance of overLOD (i.e. with
> the decisions we took, it is an instance of Marmotta).
>  
> And thus, overLOD does bring something different from LDCache: a way to
> better "control" which data is in the store and how it is updated, which
> seems mandatory to me when building a real app.
>  
> We won't have time in the overLOD project to build a fully functional
> tool, but the basics will be there.
>  
> I am not sure this discussion is of any interest for you, but thanks for
> your thoughts
> Fabian
>  
>  
>  
>  
>  
> Hi,
> 
> On 01/11/14 13:14, Fabian Cretton wrote:
>>>> Then, I did implement LDClients that can import RDF files (instead of
>>>> using the import service). They are just like the "linked data" code,
>>>> except I don't check if the subject of the triple correspond to the
>>>> URI.
>>
>> Of course we don't expect that the code we write for OverLOD will be
>> appreciated by the Marmotta Team,
>> but we will just let people know it is there if needed :-)
>>
>> But actually I don't understand your point here about RDF files moving
>> away from Linked Data paradigm.
>> Do you mean that Youtube, Vimeo, RDFa and SPARQL endpoints, which all
>> have LDClients, follow linked data paradigm more than
>> http://sws.geonames.org/2921044/about.rdf
> 
> No no, I'm not saying that. Let me try to explain it:
> 
> If we take the Linked Data principles [1], we could say that LDClient
> extends the 3rd point ("when someone looks up a URI, provide useful
> information") beyond just "using the standards (RDF*, SPARQL)" by
> providing new methods to get RDF data out of other formats.
> 
> But LDClient does not modify the 1st principle ("use URIs as names for
> things"). And that's what I referred to because of the sentence "They
> are just like the "linked data" code, except I don't check if the
> subject of the triple correspond to the URI".
> 
> Maybe I got it wrong, and what you actually do is extend the 4th
> principle ("Include links to other URIs. so that they can discover more
> things"), which is of course interesting. Just needed to be explained.
> 
> BTW, hope you have in mind that if OverLOD produces new LDClient data
> providers that can be useful for a broader community, please propose
> them to be included in the main project.
> 
> Cheers,
> 
> [1] http://www.w3.org/DesignIssues/LinkedData.html
> 
> P.S.: please, configure your client to use the "Re:" prefix when
> replying to public English mailing lists
> 
> -- 
> Sergio Fernández
> Partner Technology Manager
> Redlink GmbH
> m: +43 660 2747 925
> e: [email protected]
> w: http://redlink.co
> 

-- 
DI Jakob Frank
Knowledge and Media Technologies

Salzburg Research Forschungsgesellschaft mbH
Jakob Haringer-Strasse 5/3 | 5020 Salzburg, Austria
T: +43.662.2288-419 | F: +43.662.2288-222
[email protected]
http://www.salzburgresearch.at
http://at.linkedin.com/in/jakobfrank
