On 10/2/14 6:19 PM, Jürgen Jakobitsch wrote:
ok - i guess i should come up with an example: what i want to achieve is, for example, that people can rewrite part of a dataset and be able to get their version of the complete dataset.
Okay.
i.e. (java code) i clone a whole repository, change one single line in one java file and still be able to compile the whole project.

i.e. (rdf code) master data (in graph http://graphs.net/master) (a flat list)

<http://s.org/a> <http://p.net/label> "europe" .
<http://s.org/b> <http://p.net/label> "central europe" .
<http://s.org/c> <http://p.net/label> "austria" .
<http://s.org/d> <http://p.net/label> "carinthia" .
<http://s.org/e> <http://p.net/label> "klagenfurt" .
<http://s.org/f> <http://p.net/label> "st.martin" .
Nanotation [1] markers for generating sample data from this post, if required further on in the discussion.
## Nanotation Start ##
</document1>
<#europe> <#label> "europe" .
<#centralEurope> <#label> "central europe" .
<#austria> <#label> "austria" .
<#carinthia> <#label> "carinthia" .
<#klagenfurt> <#label> "klagenfurt" .
<#stMarting> <#label> "st.martin" .
## Nanotation End ##
person A (in graph http://graphs.net/persons/a) (= a branch with a hierarchy) (note: person A is at time T1 not an expert and doesn't know about "carinthia" being an austrian state)

<http://s.org/a> skos:narrower <http://s.org/b> .
<http://s.org/b> skos:narrower <http://s.org/c> .
<http://s.org/c> skos:narrower <http://s.org/e> .
## Nanotation Start ##
</document2>
<#europe> skos:narrower <#uk> .
<#centralEurope> skos:narrower <#bulgaria> .
<#austria> skos:narrower <#vienna> .
## Nanotation End ##
person B (in graph http://graphs.net/persons/b) (= a branch with a [better] hierarchy) (note: person B is an expert on austrian geography and knows about "carinthia" being an austrian state)

<http://s.org/a> skos:narrower <http://s.org/b> .
<http://s.org/b> skos:narrower <http://s.org/c> .
<http://s.org/c> skos:narrower <http://s.org/d> .
<http://s.org/d> skos:narrower <http://s.org/e> .
## Nanotation Start ##
## Vienna and Carinthia conflict
</document3>
<#austria> skos:narrower <#carinthia> .
## Nanotation End ##
what happened becomes clear when you take one step back and realize that all the relations (skos:narrower) have been duplicated.

now say person C is a senior expert on the municipalities and boroughs in the city of "klagenfurt". person C agrees with the graph from person B but wants to extend it. in this simple example person C => could <= simply add triples in http://graphs.net/persons/c beginning with

<http://s.org/e> skos:narrower <http://s.org/f> .

and i could do a

SELECT FROM <http://graphs.net/master> FROM <http://graphs.net/persons/b> FROM <http://graphs.net/persons/c>

to get a complete and happy result.
SELECT *
FROM </document1>
FROM </document2>
FROM </document3>
WHERE { ?s ?p ?o .
FILTER (NOT EXISTS { <#austria> skos:narrower <#vienna> } )
}
OR
## Using NOT FROM extension we've implemented
SELECT *
NOT FROM </document2>
WHERE { ?s ?p ?o . }
There are other options.
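
To make the two query styles above concrete, here is a minimal Python sketch that treats each named graph as a plain set of triples, using the graph IRIs and triples from Jürgen's example. It is only an in-memory illustration under that assumption, not how Virtuoso or any other SPARQL engine evaluates FROM or NOT FROM; the helper names select_from and select_not_from are made up for this sketch.

```python
# Illustration only: named graphs as Python sets of (subject, predicate, object).
# Graph IRIs and triples come from the example in this thread; the helpers
# select_from / select_not_from are hypothetical names, not a real API.
S = "http://s.org/"
LABEL = "http://p.net/label"
NARROWER = "skos:narrower"

MASTER = "http://graphs.net/master"
PERSON_A = "http://graphs.net/persons/a"
PERSON_B = "http://graphs.net/persons/b"
PERSON_C = "http://graphs.net/persons/c"

dataset = {
    MASTER: {
        (S + "a", LABEL, "europe"),
        (S + "b", LABEL, "central europe"),
        (S + "c", LABEL, "austria"),
        (S + "d", LABEL, "carinthia"),
        (S + "e", LABEL, "klagenfurt"),
        (S + "f", LABEL, "st.martin"),
    },
    PERSON_A: {
        (S + "a", NARROWER, S + "b"),
        (S + "b", NARROWER, S + "c"),
        (S + "c", NARROWER, S + "e"),
    },
    PERSON_B: {
        (S + "a", NARROWER, S + "b"),
        (S + "b", NARROWER, S + "c"),
        (S + "c", NARROWER, S + "d"),
        (S + "d", NARROWER, S + "e"),
    },
    PERSON_C: {
        (S + "e", NARROWER, S + "f"),
    },
}


def select_from(graphs, *names):
    """Merge the listed graphs, like several FROM clauses in one query."""
    merged = set()
    for name in names:
        merged |= graphs[name]
    return merged


def select_not_from(graphs, *excluded):
    """Merge every graph except the excluded ones (the NOT FROM idea)."""
    return select_from(graphs, *(n for n in graphs if n not in excluded))


# Master labels plus person B's hierarchy, extended by person C.
view = select_from(dataset, MASTER, PERSON_B, PERSON_C)
hierarchy = sorted(t for t in view if t[1] == NARROWER)
for subject, _, obj in hierarchy:
    print(subject, "->", obj)
```

In this toy dataset, excluding person A's graph with select_not_from gives the same view as listing the other three graphs explicitly, which mirrors the choice between the two queries above.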
now, besides copying triples like

<http://s.org/a> skos:narrower <http://s.org/b> .
<http://s.org/b> skos:narrower <http://s.org/c> .

this example works when appending to the end of the hierarchy.

what you cannot simply do is for example replace a triple in a branch (graph)
But you can filter out a named graph.

Of course there's more: I could even generate live data from the Nanotations embedded in this post, but that's a last resort. I have a similar example of triples created via nanotation-laced tweets that might demonstrate this shuffling in and out of named graphs used in a SPARQL processing pipeline [2][3][4][5][6][7].
say person D agrees with person B mostly, only "Central Europe" is no political entity and therefore doesn't have to do anything in the hierarchy.

person D could actually only copy the graph and adjust the triples accordingly (but that is again copying)

now this copying i don't like. let's come back to the initial example of a biological classification.

i just triplified the catalogoflife.org downloadable dataset and currently have 1775844 entities, and with a couple of different opinions from a couple of different scientists this soon goes into billions of triples.

;-) i still should think about how to express the problem that i see but i need to start somewhere and writing such things down really helps sometimes..

wkr j
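
One way to picture the "branch without copying" wish for person D is a branch that records only its differences from a parent graph. The Python sketch below is purely an assumption made for illustration: the Branch class, its added/removed sets and the resolve method are hypothetical, and nothing in this thread says any triple store works this way. The triples are person B's hierarchy from the example, with "Central Europe" (http://s.org/b) taken out for person D.

```python
from dataclasses import dataclass, field
from typing import Optional

S = "http://s.org/"
NARROWER = "skos:narrower"


@dataclass
class Branch:
    """Hypothetical branch: a parent plus the triples it adds and removes."""
    parent: Optional["Branch"] = None
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)

    def resolve(self) -> set:
        """Return the full set of triples visible in this branch."""
        base = self.parent.resolve() if self.parent else set()
        return (base - self.removed) | self.added


# Person B's hierarchy, stored once.
person_b = Branch(added={
    (S + "a", NARROWER, S + "b"),
    (S + "b", NARROWER, S + "c"),
    (S + "c", NARROWER, S + "d"),
    (S + "d", NARROWER, S + "e"),
})

# Person D reuses person B's branch and only records the disagreement:
# drop the two links through "central europe" and link europe to austria.
person_d = Branch(
    parent=person_b,
    removed={
        (S + "a", NARROWER, S + "b"),
        (S + "b", NARROWER, S + "c"),
    },
    added={
        (S + "a", NARROWER, S + "c"),
    },
)

for subject, _, obj in sorted(person_d.resolve()):
    print(subject, "->", obj)
```

Person B's branch is left untouched, and the triples person D agrees with are never duplicated; whether such deltas can be expressed and queried efficiently at the quad level is exactly the open question raised here.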
Hopefully, this illustrates your fundamental quest?

Links:

[1] http://bit.ly/blog-post-about-nanotation
[2] http://linkeddata.uriburner.com/c/9GDYGU3 -- Everything
[3] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- all the named graphs contributing to the SPARQL solution behind this page
[4] http://linkeddata.uriburner.com/c/9CJLOKIL -- same page with a specific named graph (internal document DB id/name) designated as the data source
[5] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- shows the designated named graph data source (hatched in the UI)
[6] http://linkeddata.uriburner.com/fct/rdfdesc/usage.vsp?g=https%3A%2F%2Ftwitter.com%2Fhashtag%2FNoSilo%23this -- two named graphs specifically designated as data sources
[7] http://linkeddata.uriburner.com/c/9CT5GRUZ -- effect of the two named graphs specifically designated as data sources
Kingsley
2014-10-02 23:42 GMT+02:00 Kingsley Idehen:

On 10/2/14 4:02 PM, Jürgen Jakobitsch wrote:

hi,

when trying to classify the animals on pictures from a recent trip to eastern indonesia meticulously, i realized that it is very hard if not impossible to branch datasets with ease. while this might sound ignorable at first sight it might as well be the reason for the giant global graph to develop a culture of duplicating and linking with the end effect of being very close to where we came from (many sql databases).

what i mean will hopefully become clear with a simple example: the "manta birostris" (giant oceanic manta ray) is classified

here wikipedia.org as

Kingdom: Animalia
Phylum: Chordata
Class: Chondrichthyes
Subclass: Elasmobranchii
Order: Myliobatiformes
Suborder: Myliobatidae
Family: Mobulidae
Genus: Manta
Species: Manta birostris

here http://www.catalogueoflife.org/col/browse/tree/id/18879368 as

Kingdom: Animalia
Phylum: Chordata
Class: Elasmobranchii
Order: Myliobatiformes
Family: Myliobatidae
Genus: Manta
Species: Manta birostris

here http://www.marinespecies.org/aphia.php?p=browser&id=105755#ct as

Kingdom: Animalia
Phylum: Chordata
Subphylum: Vertebrata
Superclass: Gnathostomata
Superclass: Pisces (Unreviewed)
Class: Elasmobranchii (Unreviewed)
Subclass: Neoselachii (Unreviewed)
Infraclass: Batoidea (Unreviewed)
Order: Rajiformes
Family: Myliobatidae (Unreviewed)
Subfamily: Mobulinae
Genus: Manta
Species: Manta birostris

here http://data.gbif.org/species/2419163/ as

Kingdom: Animalia
Phylum: Chordata
Class: Elasmobranchii
Order: Myliobatiformes
Family: Myliobatidae
Genus: Manta
Species: Manta birostris

if only in theory we would triplify all these datasets and link them it still would be very hard to find out what different people think about the actually same being.

now: my thinking was to create a flat list of uris for => all <= these classifications and create branches (graphs) with the hierarchies.
but it is not as simple as it sounds because i cannot make the sparql engine follow a branch at certain uris and then rejoin the master graph again by whatever means.

You mean that you can't de-reference a SPARQL query pattern variable as part of a SPARQL query processing pipeline?

neither can i do such things on data level.

If the data is in 5-star Linked Open Data form you have the data network in place. Then it's about a SPARQL query that crawls the data-network. Ultimately, each entity description document SHOULD end up being an internal triples/quad store document identifier (a/k/a named graph IRI). Naturally, what I describe above is how Virtuoso will behave if you include input:grab pragmas in your SPARQL.

i was thinking about like so [1] on a triple (quad) level.

questions:

1. is the problem described so that it is at least semi-understandable (or should i come up with some triples as example)

I think so, but not 100% certain :)

2. has this problem already been dealt with and i was only missing that day (please provide a link)

Sorta, in some other conversations about LOD cloud crawling and SPARQL.

3. has this problem already been solved and i was only missing that day (please provide a link)
4. do you think it is worth dealing with (i personally think so [think: scaling cooperation])
5. would it be of enough interest to create a wg

any pointers and thoughts highly appreciated

wkr turnguard

-- 
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
