Re: [oslc-core] Updated ChangeLog Proposal

Dave Steinberg Fri, 18 Mar 2011 18:25:23 -0400

Hi,

Forgive the interjection. Marcelo Paternostro and I have been lurking on
and discussing this thread. We've implemented an RDF-based system together
and are now tasked with moving it to an OSLC solution. We're looking at the
OSLC specs pragmatically from the consumer's point of view, and hopefully,
more often than not, we'll be able to ask some thought-provoking questions
here. And so, we thought it might be helpful to share our reaction to this
discussion.


To be honest, we've never really understood the general applicability of
the paging technique defined by the Core spec. We do see that providing
data in manageable quantities is an important concern, but it seems to us
that it's not necessarily obvious how to best carve up data in such a way
that it still remains meaningful in any given case. We'd think it should be
incumbent upon anyone who wishes to use this mechanism to carefully
consider how it would be applied to particular RDF models.

So, it's nice to see this discussion of a concrete application of paging.
This case, where information is conveyed by a long, ordered list, certainly
seems to be one that lends itself to being broken up like this. But another
thing we've noticed about this case is that the list itself could be
considered sufficient to convey the paging. Once we've added addressable
identities to the entries in the list, the client could recognize the case
where the object of an rdf:rest is not rdf:nil and is not the subject of
any other statements, and simply do a GET on its URI to obtain the next
"page."

If the client can be counted on to do this, the OSLC paging statements just
disappear from Frank's example. The simplified version looks like this
(note that I've also switched ChangeLog prefix to use a slash):

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  PAGE 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog/>.

<https://.../ChangeLog> oslc:changes :b1.
:b1 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int
    ] ;
    rdf:rest :b2 .
:b2 rdf:first
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int
    ] ;
    rdf:rest :b3 .
:b3 rdf:first
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int
    ] ;
    rdf:rest :b4 .

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  PAGE 2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog/>.

:b4 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/20> ;
      oslc:at "100"^^xsd:int
    ];
    rdf:rest rdf:nil .
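
The client behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain (subject, predicate, object) tuples of strings, `fetch` is a hypothetical callback standing in for "GET the URI and parse the result", `example.org` stands in for the elided host, and the `entryNNN` strings are placeholders for the blank-node change entries.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def walk_list(first_page, head, fetch):
    """Yield the rdf:first object of each list node, fetching pages on demand.

    fetch(uri) is a hypothetical callback that performs the GET and returns
    the parsed triples as a set of (subject, predicate, object) tuples.
    """
    known = set(first_page)
    node = head
    while node != RDF_NIL:
        if not any(s == node for s, _, _ in known):
            # The rdf:rest object is not rdf:nil and is not the subject of
            # any statement we hold, so treat its URI as the next "page".
            page = fetch(node)
            if not any(s == node for s, _, _ in page):
                raise LookupError("no statements about " + node)
            known |= set(page)
        props = {p: o for s, p, o in known if s == node}
        yield props[RDF_FIRST]
        node = props[RDF_REST]


# Hypothetical data mirroring the two pages above.
BASE = "https://example.org/ChangeLog/"
b1, b2, b3, b4 = (BASE + name for name in ("b1", "b2", "b3", "b4"))
PAGE_1 = {
    (b1, RDF_FIRST, "entry103"), (b1, RDF_REST, b2),
    (b2, RDF_FIRST, "entry102"), (b2, RDF_REST, b3),
    (b3, RDF_FIRST, "entry101"), (b3, RDF_REST, b4),
}
PAGE_2 = {(b4, RDF_FIRST, "entry100"), (b4, RDF_REST, RDF_NIL)}

entries = list(walk_list(PAGE_1, b1, {b4: PAGE_2}.__getitem__))
```

The only paging logic is the "not the subject of any statement" test; no oslc:nextPage statements are consulted.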

Does this approach sound reasonable, or does the paging mechanism add
something here?

Cheers,
Dave

--
Dave Steinberg
IBM Rational Software
[email protected]



                                                                       
From:     Martin Nally <[email protected]>
To:       Frank Budinsky/Toronto/IBM@IBMCA
Cc:       [email protected], RELM Development <relm_development%[email protected]>
Date:     03/18/2011 02:01 PM
Subject:  Re: [oslc-core] Updated ChangeLog Proposal
Sent by:  [email protected]
                                                                       





One more tiny tweak on this. In my example I used
@prefix : <https://.../ChangeLog#>. This causes the lists to have URLs like
https://.../ChangeLog#b1, https://.../ChangeLog#b2 and so on. This is OK,
and it would be the right thing to do for a changelog implementer who did
not want to bother with providing access to individual change entries,
because the fragment is never sent to the server: a GET of
https://.../ChangeLog#b1 will actually do a GET of https://.../ChangeLog,
which is the whole changelog. A changelog provider that wanted to allow
clients GET access to individual entries would modify this to be
@prefix : <https://.../ChangeLog/>. This causes the lists to have URLs like
https://.../ChangeLog/b1, https://.../ChangeLog/b2 and so on. This provides
convenient client access to individual changes (or rather the list entries
that reference them).
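
A small sketch of the difference between the two prefixes, using Python's standard `urllib.parse.urldefrag`. The host `example.org` is a stand-in for the elided part of the URLs, and `request_target` is a name made up for this illustration.

```python
from urllib.parse import urldefrag


def request_target(uri):
    """Return the URL a client actually requests: the URI minus its fragment."""
    return urldefrag(uri).url


# Hypothetical list-node URIs under the two prefix styles.
hash_nodes = ["https://example.org/ChangeLog#b%d" % i for i in (1, 2)]
slash_nodes = ["https://example.org/ChangeLog/b%d" % i for i in (1, 2)]

hash_targets = {request_target(u) for u in hash_nodes}
slash_targets = [request_target(u) for u in slash_nodes]
```

With the hash prefix every list node resolves to the same request (the whole changelog); with the slash prefix each node is a separate request that the provider has to serve.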

Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690



From:     Frank Budinsky/Toronto/IBM@IBMCA
To:       Martin Nally/Raleigh/IBM@IBMUS
Cc:       [email protected], RELM Development
Date:     03/18/2011 01:16 PM
Subject:  Re: Updated ChangeLog Proposal





Hi Martin,

Very interesting idea. Instead of pages of "partial lists of change
entries", we have pages of "list entries". I guess it's a little harder for
clients to know when to check for a next page (i.e., a broken reference
ends the page's entries, instead of a nil reference) but otherwise it's much
cleaner.

   > Pagination of this resource according to our standard algorithm is a
   > trivial exercise left to the reader.

Here it is, just to make sure I'm capable of doing the trivial exercise :-)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  PAGE 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

<https://.../ChangeLog?oslc.paging=true>
  oslc:nextPage <https://.../ChangeLog?pageno=2> .

<https://.../ChangeLog> oslc:changes :b1.
:b1 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int
    ] ;
    rdf:rest :b2 .
:b2 rdf:first
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int
    ] ;
    rdf:rest :b3 .
:b3 rdf:first
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int
    ] ;
    rdf:rest :b4 .

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  PAGE 2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

<https://.../ChangeLog?pageno=2>
  oslc:nextPage rdf:nil .    # or omit the nextPage property entirely

:b4 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/20> ;
      oslc:at "100"^^xsd:int
    ];
    rdf:rest rdf:nil .
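
For comparison, a client of the paged form above might reassemble the resource along these lines. This is a sketch, not a reference implementation: `fetch` is a hypothetical callback that GETs a page and returns its triples as (subject, predicate, object) string tuples, and the page URIs use `example.org` in place of the elided host.

```python
OSLC_NEXT_PAGE = "http://open-services.net/ns/core#nextPage"
RDF_NIL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"


def collect_pages(first_page_uri, fetch):
    """Follow oslc:nextPage links and return the union of the data triples."""
    triples, seen, uri = set(), set(), first_page_uri
    while uri is not None and uri != RDF_NIL and uri not in seen:
        seen.add(uri)  # guard against a cycle of page links
        page = fetch(uri)
        # Keep everything except the paging statements themselves.
        triples |= {t for t in page if t[1] != OSLC_NEXT_PAGE}
        # A missing nextPage ends the walk just like rdf:nil does.
        uri = next(
            (o for s, p, o in page if s == uri and p == OSLC_NEXT_PAGE), None
        )
    return triples


# Hypothetical two-page changelog; the data triples are placeholders.
P1 = "https://example.org/ChangeLog?oslc.paging=true"
P2 = "https://example.org/ChangeLog?pageno=2"
PAGES = {
    P1: {(P1, OSLC_NEXT_PAGE, P2), ("b1", "first", "e103"), ("b1", "rest", "b2")},
    P2: {(P2, OSLC_NEXT_PAGE, RDF_NIL), ("b2", "first", "e102"), ("b2", "rest", RDF_NIL)},
}
merged = collect_pages(P1, PAGES.__getitem__)
```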

Thanks,
Frank.



From:     Martin Nally/Raleigh/IBM@IBMUS
To:       Frank Budinsky/Toronto/IBM@IBMCA
Cc:       [email protected], RELM Development
Date:     03/18/2011 11:27 AM
Subject:  Re: Updated ChangeLog Proposal





This may seem like a small, arcane point, but it's been bugging me. It
concerns pagination of the changelog.

OSLC's normal strategy for pagination is very simple and elegant. Every RDF
resource's state is composed of a graph of triples. "Graph" is just RDF's
name for what any middle-school math student would call a set (don't ask,
I've no idea). Sets can be divided arbitrarily into disjoint subsets, so if
you want to paginate an RDF resource, all you do is divide up the triples
into disjoint subsets and return each subset in the representation of a
page. The beauty is that the triples themselves are totally unchanged. The
Page itself is not the subject of any triples except nextPage (and maybe a
description).

The wrinkle in this approach is caused by blank nodes. Blank nodes cannot
be referenced outside the page they are on, so if any of those triples
includes a blank node as its subject or object, it has to go on the same
page (in the same subset) as every other triple that references the same
blank node. Because of the way RDF lists are represented, the lists
themselves are almost always blank nodes (a list has only two references,
to "first", which is a useful node, and to "rest" which is another list
containing the rest of the conceptual list). Because the blank nodes point
to each other in a chain, they cannot be split across pages, and so the
whole list ends up in the same page.
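
To make the blank-node constraint concrete, here is a rough sketch of a paginator that splits a set of triples into disjoint pages but never separates triples that share a blank node. Everything here is an assumption made for illustration: terms are plain strings, blank nodes are recognised by a leading `_:`, the short names (`CL`, `first`, `rest`, `nil`) abbreviate the real URIs, and the greedy packing is just one possible policy.

```python
def is_blank(term):
    # Convention for this sketch only: blank nodes start with "_:".
    return term.startswith("_:")


def paginate(triples, page_size):
    """Split triples into disjoint pages, keeping blank-node clusters together."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Blank nodes that occur in the same triple must share a page.
    for s, _, o in triples:
        blanks = [t for t in (s, o) if is_blank(t)]
        for b in blanks:
            find(b)
        if len(blanks) == 2:
            parent[find(blanks[0])] = find(blanks[1])

    # Group triples by blank-node cluster; other triples stand alone.
    groups = {}
    for t in sorted(triples):
        blanks = [x for x in (t[0], t[2]) if is_blank(x)]
        key = ("blank", find(blanks[0])) if blanks else ("free", t)
        groups.setdefault(key, []).append(t)

    # Greedy packing; an oversized cluster still lands on a single page.
    pages, current = [], []
    for group in groups.values():
        if current and len(current) + len(group) > page_size:
            pages.append(current)
            current = []
        current.extend(group)
    if current:
        pages.append(current)
    return pages


blank_list = [
    ("CL", "changes", "_:b1"),
    ("_:b1", "first", "_:e1"), ("_:b1", "rest", "_:b2"),
    ("_:b2", "first", "_:e2"), ("_:b2", "rest", "nil"),
    ("_:e1", "at", "103"), ("_:e2", "at", "102"),
]
blank_pages = paginate(blank_list, 3)
```

Because the list nodes chain to one another through blank nodes, all seven triples form one cluster and end up on one page regardless of the requested page size.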

The way we resolved this in Frank's ChangeLog design was to make each page
of the changelog have a list of change entries. This clearly works, but it
means the triples for the pages are different from the triples for the
changelog itself, which is a more complex and less pleasing pattern.

It occurred to me this morning that there is a way to use our standard
pagination technique to paginate the changelog without introducing a new
mechanism. It works like this.

Normally the changelog whose URL is <https://.../ChangeLog> looks like
this:

@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

<https://.../ChangeLog>
  oslc:changes (
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int
    ]
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int
    ]
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int
    ]) .

This is Turtle shorthand for the following. Note that I'm not changing
anything here, I'm just expanding the shorthand:

@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<https://.../ChangeLog> oslc:changes _:b1.
_:b1 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int
    ].
_:b1 rdf:rest _:b2 .
_:b2 rdf:first
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int
    ].
_:b2 rdf:rest _:b3 .
_:b3 rdf:first
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int
    ].
_:b3 rdf:rest rdf:nil .


Since I have not changed anything, this still can't be paginated in the
usual manner. However, if I make the following change, it can be:

@prefix oslc:    <http://open-services.net/ns/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

<https://.../ChangeLog> oslc:changes :b1.
:b1 rdf:first
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int
    ].
:b1 rdf:rest :b2 .
:b2 rdf:first
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int
    ].
:b2 rdf:rest :b3 .
:b3 rdf:first
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int
    ].
:b3 rdf:rest rdf:nil .
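
The expansion shown in the last two listings can be sketched mechanically. In this illustration (an assumption-laden sketch, not part of the proposal) the items are placeholder strings standing for the blank-node entries, `node_name` is a hypothetical callback that names the i-th list node, and `example.org` replaces the elided host.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def expand_list(items, node_name):
    """Return the rdf:first/rdf:rest triples for items, one list node per item."""
    triples = []
    for i, item in enumerate(items):
        node = node_name(i)
        # The last node's rdf:rest is rdf:nil; every other node points onward.
        rest = node_name(i + 1) if i + 1 < len(items) else RDF_NIL
        triples.append((node, RDF_FIRST, item))
        triples.append((node, RDF_REST, rest))
    return triples


items = ["entry103", "entry102", "entry101"]

# Blank list nodes, as produced by Turtle's ( ) shorthand.
blank_form = expand_list(items, lambda i: "_:b%d" % (i + 1))

# URI-named list nodes, as in the modified version.
named_form = expand_list(
    items, lambda i: "https://example.org/ChangeLog#b%d" % (i + 1)
)
```

Both calls produce the same pattern of triples; only the names of the list nodes differ.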

Pagination of this resource according to our standard algorithm is a
trivial exercise left to the reader. The only special rule of pagination
for ChangeLog would be that the entries on a particular page are all older
than the entries on the previous page and younger than the ones on the
following page. This notation is slightly harder for a human to read than
Turtle's () notation, but is neither better nor worse for a computer to
read - it produces the same pattern of triples.
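
The one special rule can be stated as a check. This sketch assumes that a higher sequence number means a newer entry (as the 103, 102, 101 ordering in the examples suggests); the function names are made up for the illustration.

```python
def page_entries(sequence_numbers, page_size):
    """Chunk change entries, newest first, into consecutive pages."""
    ordered = sorted(sequence_numbers, reverse=True)
    return [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]


def pages_in_order(pages):
    """True if every entry on a page is older than every entry on the page before it."""
    return all(
        max(later) < min(earlier) for earlier, later in zip(pages, pages[1:])
    )


pages = page_entries([100, 103, 101, 102], page_size=3)
```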

Note that this change causes https://.../ChangeLog#b1 through b3 to become
legitimate URLs. This does not necessarily mean that the OSLC implementer
has to honor a GET on these URLs - it's perfectly OK for the URLs to be
used in the list but not actually be independently GETable - but the
implementation may also choose to honor a GET. This also gives a different
and maybe more satisfactory way for a changelog to make a real resource out
of each change entry. Frank did the obvious thing of allowing the entry
itself to be a resource, but this creates some ambiguities and special
rules, as I pointed out in my last note. If instead it was the list nodes
that are given URLs, and the entries remain always as blank nodes, it might
be easier to understand. In practice, an implementation would be more
likely to use https://.../ChangeLog#b103 through https://.../ChangeLog#b101
as URLs for the lists corresponding to entries of the matching sequence
number.

Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690




From:     Frank Budinsky/Toronto/IBM@IBMCA
To:       [email protected]
Cc:       Martin Nally/Raleigh/IBM@IBMUS, RELM Development
Date:     03/17/2011 11:57 AM
Subject:  Updated ChangeLog Proposal





Hi All,

I've uploaded the latest version of the OSLC change log proposal here:


http://open-services.net/pub/Main/IndexingProposals/OSLC_indexing_0316.doc

It includes changes to reflect discussion and decisions made during the
first round of review, including the following:

   - A section to describe the motivating use case.
   - Description of a formal "Indexing Profile" which defines the
     capabilities that a service provider MUST support in order to be
     indexable.
   - Proposed formal scope of indexing/changeLog.
   - ChangeLog entry timestamps changed to sequence numbers.
   - ChangeLog entries can optionally be referenced URI-addressable
     resources.

Please send comments and issues to the mailing list. We're also tentatively
scheduled to discuss this during the OSLC Core Workgroup call, next week,
so hopefully we can hash out most of the remaining issues by then.

Thanks,
Frank.





_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
