Re: How to avoid that collections "break" relationships

Gregg Kellogg Tue, 25 Mar 2014 10:46:55 -0700

Hi Peter,

On Mar 25, 2014, at 9:49 AM, Peter F. Patel-Schneider <pfpschnei...@gmail.com> 
wrote:

> Let's see if I have this right.
> 
> You are encountering a situation where the number of people Markus knows is 
> too big (somehow).  The proposed solution is to move this information to a 
> separate location. I don't see how this helps in reducing the size of the 
> information, which was the initial problem.

From my perspective, this is really a clash between the notions of the use of 
URIs in RDF to denote entities, and relative URIs in many REST applications to 
denote relationships. In my experience, a RESTful web application may use a URI 
relative to an entity's location as a way to access related entities; this is a 
common pattern in Ruby on Rails. For example:

http://example/users/1

In many systems, this would be served by a controller where _1_ is taken to be 
a primary key for a related SQL table, in this case a Users table. If users are 
joined together using a many-to-many relationship, a convention I can use in my 
application is to construct a "route", such as the following:

http://example/users/1/knows/

Which might be semantically equivalent (within the application logic) to 
http://example/knows?user_id=1. The controller may then query the join table 
where one column (say src_id) is _1_, so that results find related entities 
based on another column in the join table (say dest_id). The application may 
then return all records in a single request, or a subset of those records 
through pagination.
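
As a rough illustration of the join-table lookup described above, here is a minimal Python sketch using the standard sqlite3 module. The table names (users, knows), the page size, and the sample rows are assumptions made for the example; only the src_id and dest_id column names come from the text above.

```python
import sqlite3


def known_users(conn, user_id, page=1, per_page=2):
    """Return one page of the users that `user_id` knows, via the join table."""
    offset = (page - 1) * per_page
    return conn.execute(
        "SELECT u.id, u.name FROM knows k "
        "JOIN users u ON u.id = k.dest_id "
        "WHERE k.src_id = ? ORDER BY u.id LIMIT ? OFFSET ?",
        (user_id, per_page, offset),
    ).fetchall()


# Hypothetical schema and data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src_id INTEGER, dest_id INTEGER);
    INSERT INTO users VALUES (1, 'markus'), (2, 'alice'), (3, 'bob'), (4, 'zorro');
    INSERT INTO knows VALUES (1, 2), (1, 3), (1, 4), (2, 1);
""")

# Roughly what /users/1/knows/ and its second page might serve.
first_page = known_users(conn, 1, page=1)
second_page = known_users(conn, 1, page=2)
```

The route only names the relationship; which rows come back, and how many per request, is entirely up to the controller.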

Many developers will want to be able to publish information about their 
datasets using a vocabulary such as schema.org. Given that an entity may 
contain many relationships, it is not feasible to create a single entity 
description with all of the members of these relationships enumerated. For 
example, a User entity may have parents, children, friends (knows), likes, 
comments, photos, ... Moreover, these relationships are bi-directional (a user 
asserts a knows relationship with another user, and is known by other users). 
In a prototypical Rails application, this works because a page rendered for a 
user contains controls to access these relationships. How does the developer of 
such an application capture these semantics using something like schema.org? As 
it stands in Hydra now, these relationships might be described as follows:

<.../markus/> a schema:Person;
  schema:knows <.../markus/knows>;
  ...

However, as Markus points out, the <.../markus/knows> resource likely returns a 
collection rather than a person. This isn't a show-stopper for schema.org, 
because schema:knows is declared with schema:rangeIncludes rather than 
rdfs:range, so nothing infers that <.../markus/knows> is a schema:Person. The 
same pattern should also work for a vocabulary such as FOAF, though, and there 
rdfs:range would produce exactly that unwanted inference.
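
To make the difference concrete, here is a small Python sketch of the rdfs:range entailment rule applied to plain string triples. It is an illustration only, not a real RDF reasoner: the function name and the string encoding of triples are assumptions, and it makes a single pass rather than computing a full closure.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def apply_range_entailment(triples):
    """Add (o, rdf:type, C) for every (s, p, o) where (p, rdfs:range, C) holds."""
    # Collect the declared rdfs:range classes per property.
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)
    # Type the object of each statement whose property has a declared range.
    inferred = set(triples)
    for s, p, o in triples:
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred


data = {
    ("</markus>", "foaf:knows", "</markus/knows>"),
    ("</markus>", "schema:knows", "</markus/knows>"),
    ("foaf:knows", RDFS_RANGE, "foaf:Person"),
    # rangeIncludes is just another property here; no rule fires on it.
    ("schema:knows", "schema:rangeIncludes", "schema:Person"),
}
closure = apply_range_entailment(data)

collection_is_foaf_person = ("</markus/knows>", RDF_TYPE, "foaf:Person") in closure
collection_is_schema_person = ("</markus/knows>", RDF_TYPE, "schema:Person") in closure
```

Under these assumptions only the FOAF statement types the collection as a person; the schema.org statement leaves it untyped.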

The challenge for a developer is to come up with entity markup that has a good 
chance of being understood for SEO purposes, and does not create so high a 
conceptual barrier for the developer that they just don't attempt it. I think 
it is our responsibility to provide best practices for marking up entities used 
in such applications in a simple way that does not clash with RDF expectations, 
where any URI used in the range of schema:knows is expected to be a person and 
not a collection.

> Splitting this information into pieces might help. schema.org, along with 
> just about every other RDF syntax, does not require that all the information 
> about a particular entity is in the same spot. The problem then is to ensure 
> that all the information is accessed together.
> 
> schema.org, somewhat separate from other RDF syntaxes, does have facilities 
> for this.  All you need to do is to set up multiple pages, for example
> .../markus1 through .../markusn
> and on each of these pages include schema.org markup with content like
> <.../markusi> schema:url <.../markus>
> <.../markus> schema:knows <.../friendi1>
> ...
> <.../markus> schema:knows <.../friendimi>
> Then on .../markus you have
> <.../markus> schema:url <.../markus1>
> ...
> <.../markus> schema:url <.../markusn>
> (Maybe schema:sameAs is a better relationship to use here, but they both 
> should work.)
> 
> Voila! (With the big proviso that I have no idea whether the schema.org 
> processors actually do the right thing here, as there is no indication of what 
> they do do.)

The problem is that, if this is to drive application logic, as is the intent of 
Hydra, how does a client know which URI to dereference when it is interested in 
schema:knows, or schema:children, or schema:parent, or schema:comment, or 
whatever the interesting relationship is?

I think there are two ways out of this:

1) schema.org can break the relationship expectation model by specifically 
allowing, say, an ItemList to be the value of any property with the intent that 
it provide such an indirection, and damn the RDF consequences.
2) use something like an operation that describes these relationships, but has 
less of a chance of being useful for SEO. For example:

<../markus/> a foaf:Person;
 hydra:supportedOperation [
   a GetRelatedCollectionOperation;
   hydra:title "Get known relations";
   hydra:description "Retrieves a collection of foaf:Person related to the subject through foaf:knows";
   hydra:property foaf:knows;
   hydra:uri <../markus/knows>;
   hydra:method "GET";
   hydra:returns foaf:Person
 ] .
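
For what it's worth, a client could use such a description to decide which URI to dereference for a given property. The following Python sketch assumes the operation has already been parsed into a plain dict; the key names mirror the properties in the example above, and the dict layout and the uri_for_property helper are hypothetical, not anything defined by Hydra.

```python
def uri_for_property(resource, prop):
    """Return (method, uri) of the first operation advertising `prop`, else None."""
    for op in resource.get("supportedOperation", []):
        if op.get("property") == prop:
            return op.get("method", "GET"), op["uri"]
    return None


# Hypothetical in-memory form of the description above.
markus = {
    "@id": "../markus/",
    "@type": "foaf:Person",
    "supportedOperation": [
        {
            "@type": "GetRelatedCollectionOperation",
            "property": "foaf:knows",
            "uri": "../markus/knows",
            "method": "GET",
            "returns": "foaf:Person",
        }
    ],
}

knows_target = uri_for_property(markus, "foaf:knows")
children_target = uri_for_property(markus, "schema:children")
```

Because the collection URI hangs off the operation rather than off foaf:knows itself, no statement is made that has the collection as the object of foaf:knows.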

Gregg

> peter
> 
> PS:  LDP??
> 
> On 03/24/2014 08:24 AM, Markus Lanthaler wrote:
>> Hi all,
>> 
>> We have an interesting discussion in the Hydra W3C Community Group [1]
>> regarding collections and would like to hear more opinions and ideas. I'm
>> sure this is an issue a lot of Linked Data applications face in practice.
>> 
>> Let's assume we want to build a Web API that exposes information about
>> persons and their friends. Using schema.org, your data would look somewhat
>> like this:
>> 
>>   </markus> a schema:Person ;
>>             schema:knows </alice> ;
>>             ...
>>             schema:knows </zorro> .
>> 
>> All this information would be available in the document at /markus (please
>> let's not talk about hash URLs etc. here, ok?). Depending on the number of
>> friends, the document however may grow too large. Web APIs typically solve
>> that by introducing an intermediary (paged) resource such as
>> /markus/friends/. In Schema.org we have ItemList to do so:
>> 
>>   </markus> a schema:Person ;
>>             schema:knows </markus/friends/> .
>> 
>>   </markus/friends/> a schema:ItemList ;
>>             schema:itemListElement </alice> ;
>>             ...
>>             schema:itemListElement </zorro> .
>> 
>> This works, but has two problems:
>>   1) it breaks the /markus --[knows]--> /alice relationship
>>   2) it says that /markus --[knows]--> /markus/friends
>> 
>> While 1) can easily be fixed, 2) is much trickier--especially if we consider
>> cases that don't use schema.org with its "weak semantics" but a vocabulary
>> that uses rdfs:range, such as FOAF. In that case, the statement
>> 
>>   </markus> foaf:knows </markus/friends/> .
>> 
>> and the fact that
>> 
>>   foaf:knows rdfs:range foaf:Person .
>> 
>> would lead to the "wrong" inference that /markus/friends is a foaf:Person.
>> 
>> How do you deal with such cases?
>> 
>> How is schema.org intended to be used in cases like these? Is the above use
>> of ItemList sensible or is this something that should better be avoided?
>> 
>> 
>> Thanks,
>> Markus
>> 
>> 
>> P.S.: I'm aware of how LDP handles this issue, but, while I generally like
>> the approach it takes, I don't like the fact that it imposes a specific
>> interaction model.
>> 
>> 
>> [1] http://bit.ly/HydraCG
>> 
>> 
>> 
>> --
>> Markus Lanthaler
>> @markuslanthaler
>> 
>> 
>> 
>> 
>> 
> 
>
