Hi, Forgive the interjection. Marcelo Paternostro and I have been lurking on and discussing this thread. We've implemented an RDF-based system together and are now tasked with moving it to an OSLC solution. We're looking at the OSLC specs pragmatically from the consumer's point of view, and hopefully, more often than not, we'll be able to ask some thought-provoking questions here. And so, we thought it might be helpful to share our reaction to this discussion.
To be honest, we've never really understood the general applicability of the paging technique defined by the Core spec. We do see that providing data in manageable quantities is an important concern, but it seems to us that it's not necessarily obvious how to best carve up data in such a way that it still remains meaningful in any given case. We'd think it should be incumbent upon anyone who wishes to use this mechanism to carefully consider how it would be applied to particular RDF models.

So, it's nice to see this discussion of a concrete application of paging. This case, where information is conveyed by a long, ordered list, certainly seems to be one that lends itself to being broken up like this.

But another thing we've noticed about this case is that the list itself could be considered sufficient to convey the paging. Once we've added addressable identities to the entries in the list, the client could recognize the case where the object of an rdf:rest is not rdf:nil and is not the subject of any other statements, and simply do a GET on its URI to obtain the next "page."

If the client can be counted on to do this, the OSLC paging statements just disappear from Frank's example. The simplified version looks like this (note that I've also switched ChangeLog prefix to use a slash):

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PAGE 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog/>.

<https://.../ChangeLog> oslc:changes :b1.

:b1 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int ] ;
    rdf:rest :b2 .

:b2 rdf:first [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int ] ;
    rdf:rest :b3 .
:b3 rdf:first [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int ] ;
    rdf:rest :b4 .

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PAGE 2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog/>.

:b4 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/20> ;
      oslc:at "100"^^xsd:int ] ;
    rdf:rest rdf:nil .

Does this approach sound reasonable, or does the paging mechanism add something here?

Cheers,
Dave

--
Dave Steinberg
IBM Rational Software
[email protected]

From: Martin Nally <[email protected]>
To: Frank Budinsky/Toronto/IBM@IBMCA
Cc: [email protected], RELM Development <relm_development%[email protected]>
Date: 03/18/2011 02:01 PM
Subject: Re: [oslc-core] Updated ChangeLog Proposal
Sent by: [email protected]

One more tiny tweak on this. In my example I used

@prefix : <https://.../ChangeLog#>.

This causes the lists to have URLs like https://.../ChangeLog#b1, https://.../ChangeLog#b2 and so on. This is OK, and it would be the right thing to do for a changelog implementer who did not want to bother with providing access to individual change entries, because a GET of https://.../ChangeLog#b1 will actually do a GET of https://.../ChangeLog, which is the whole changelog.

A changelog provider that wanted to allow clients GET access to individual entries would modify this to be

@prefix : <https://.../ChangeLog/>.

This causes the lists to have URLs like https://.../ChangeLog/b1, https://.../ChangeLog/b2 and so on. This provides convenient client access to individual changes (or rather the list entries that reference them).
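The difference between the two prefixes comes down to how HTTP clients treat fragments: the part after "#" is never sent to the server. The following is only a small sketch of that behaviour using Python's standard library; the host example.org is a hypothetical stand-in for the elided "https://.../" in the thread, and document_url is a name made up for this illustration.

```python
from urllib.parse import urldefrag


def document_url(node_url: str) -> str:
    """Return the URL an HTTP client would actually request for node_url.

    The fragment (everything after '#') is resolved by the client after
    retrieval and is not part of the request sent to the server.
    """
    return urldefrag(node_url).url


# Hypothetical URLs; the real host and path are elided in the thread.
hash_style = "https://example.org/ChangeLog#b1"
slash_style = "https://example.org/ChangeLog/b1"

print(document_url(hash_style))   # the whole changelog document
print(document_url(slash_style))  # a distinct, separately GETable resource
```

With the "#" prefix every list node shares one retrievable document, while with the "/" prefix each list node gets its own request URL that a provider may choose to serve.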
Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690

From: Frank Budinsky/Toronto/IBM@IBMCA
To: Martin Nally/Raleigh/IBM@IBMUS
Cc: [email protected], RELM Development
Date: 03/18/2011 01:16 PM
Subject: Re: Updated ChangeLog Proposal

Hi Martin,

Very interesting idea.
Instead of pages of "partial lists of change entries", we have pages of "list entries". I guess it's a little harder for clients to know when to check for a next page (i.e., a broken reference ends the page's entries, instead of a nil reference) but otherwise it's much cleaner.

> Pagination of this resource according to our standard algorithm is a trivial exercise left to the reader.

Here it is, just to make sure I'm capable of doing the trivial exercise :-)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PAGE 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

<https://.../ChangeLog?oslc.paging=true> oslc:nextPage <https://.../ChangeLog?pageno=2> .

<https://.../ChangeLog> oslc:changes :b1.

:b1 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int ] ;
    rdf:rest :b2 .

:b2 rdf:first [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int ] ;
    rdf:rest :b3 .

:b3 rdf:first [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int ] ;
    rdf:rest :b4 .

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> PAGE 2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

# or omit the nextPage property entirely
<https://.../ChangeLog?pageno=2> oslc:nextPage rdf:nil .

:b4 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/20> ;
      oslc:at "100"^^xsd:int ] ;
    rdf:rest rdf:nil .

Thanks,
Frank.
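The client-side consequence mentioned above, that a broken rdf:rest reference rather than rdf:nil ends a page's entries, can be sketched roughly as follows. This is a minimal illustration, not part of the proposal: the host example.org, the in-memory PAGES table standing in for HTTP GET plus parsing, the blank-node labels such as "_:e103", and the names fetch and collect_entries are all assumptions made for the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OSLC = "http://open-services.net/ns/core#"

# Hypothetical URLs; the real host is elided in the thread.
LOG = "https://example.org/ChangeLog"
PAGE1 = LOG + "?oslc.paging=true"
PAGE2 = LOG + "?pageno=2"

# Each page is a list of (subject, predicate, object) triples.
# Change entries are reduced to placeholder blank-node labels.
PAGES = {
    PAGE1: [
        (PAGE1, OSLC + "nextPage", PAGE2),
        (LOG, OSLC + "changes", LOG + "#b1"),
        (LOG + "#b1", RDF + "first", "_:e103"),
        (LOG + "#b1", RDF + "rest", LOG + "#b2"),
        (LOG + "#b2", RDF + "first", "_:e102"),
        (LOG + "#b2", RDF + "rest", LOG + "#b3"),
        (LOG + "#b3", RDF + "first", "_:e101"),
        (LOG + "#b3", RDF + "rest", LOG + "#b4"),
    ],
    PAGE2: [
        (LOG + "#b4", RDF + "first", "_:e100"),
        (LOG + "#b4", RDF + "rest", RDF + "nil"),
    ],
}


def fetch(page_url):
    """Stand-in for an HTTP GET followed by RDF parsing."""
    return PAGES[page_url]


def collect_entries(first_page, log_url):
    """Walk the list across pages, following nextPage on a broken reference."""
    nil = RDF + "nil"
    page_url, node, entries = first_page, None, []
    while page_url is not None:
        index = {(s, p): o for s, p, o in fetch(page_url)}
        if node is None:
            node = index[(log_url, OSLC + "changes")]  # head of the list
        # Consume list nodes for as long as they are described on this page.
        while node != nil and (node, RDF + "first") in index:
            entries.append(index[(node, RDF + "first")])
            node = index[(node, RDF + "rest")]
        if node == nil:
            return entries
        # Broken reference: the node is described on a later page, if any.
        next_page = index.get((page_url, OSLC + "nextPage"))
        page_url = None if next_page in (None, nil) else next_page
    raise ValueError("list ended with a broken reference and no next page")


print(collect_entries(PAGE1, LOG))
```

The sketch treats both a missing oslc:nextPage and an oslc:nextPage of rdf:nil as "no further pages", matching the two options shown for page 2.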
From: Martin Nally/Raleigh/IBM@IBMUS
To: Frank Budinsky/Toronto/IBM@IBMCA
Cc: [email protected], RELM Development
Date: 03/18/2011 11:27 AM
Subject: Re: Updated ChangeLog Proposal

This may seem like a small, arcane point, but it's been bugging me. It concerns pagination of the changelog.
OSLC's normal strategy for pagination is very simple and elegant. Every RDF resource's state is composed of a graph of triples. "Graph" is just RDF's name for what any middle-school math student would call a set (don't ask, I've no idea). Sets can be divided arbitrarily into discrete subsets, so if you want to paginate an RDF resource, all you do is divide up the triples into discrete subsets and return each subset in the representation of a page. The beauty is that the triples themselves are totally unchanged. The Page itself is not the subject of any triples except nextPage (and maybe a description).

The wrinkle in this approach is caused by blank nodes. Blank nodes cannot be referenced outside the page they are on, so if any of those triples includes a blank node as its subject or object, it has to go on the same page (in the same subset) as every other triple that references the same blank node. Because of the way RDF lists are represented, the lists themselves are almost always blank nodes (a list has only two references, to "first", which is a useful node, and to "rest" which is another list containing the rest of the conceptual list). Because the blank nodes point to each other in a chain, they cannot be split across pages, and so the whole list ends up in the same page.

The way we resolved this in Frank's ChangeLog design was to make each page of the changelog have a list of change entries. This clearly works, but it means the triples for the pages are different from the triples for the changelog itself, which is a more complex and less pleasing pattern.

It occurred to me this morning that there is a way to use our standard pagination technique to paginate the changelog without introducing a new mechanism. It works like this. Normally the changelog whose URL is <https://.../ChangeLog#> looks like this:

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<https://.../ChangeLog> oslc:changes (
    [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int ]
    [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int ]
    [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int ]
) .

This is Turtle shorthand for the following. Note that I'm not changing anything here, I'm just expanding the shorthand:

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<https://.../ChangeLog> oslc:changes _:b1.

_:b1 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int ].
_:b1 rdf:rest _:b2 .

_:b2 rdf:first [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int ].
_:b2 rdf:rest _:b3 .

_:b3 rdf:first [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int ].
_:b3 rdf:rest rdf:nil .

Since I have not changed anything, this still can't be paginated in the usual manner. However, if I make the following change, it can be:

@prefix oslc: <http://open-services.net/ns/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <https://.../ChangeLog#>.

<https://.../ChangeLog> oslc:changes :b1.

:b1 rdf:first [ a oslc:create ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/23> ;
      oslc:at "103"^^xsd:int ].
:b1 rdf:rest :b2 .

:b2 rdf:first [ a oslc:update ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/22> ;
      oslc:at "102"^^xsd:int ].
:b2 rdf:rest :b3 .

:b3 rdf:first [ a oslc:delete ;
      oslc:changed <https://.../com.ibm.team.workitem.WorkItem/21> ;
      oslc:at "101"^^xsd:int ].
:b3 rdf:rest rdf:nil .
Pagination of this resource according to our standard algorithm is a trivial exercise left to the reader. The only special rule of pagination for ChangeLog would be that the entries on a particular page are all older than the entries on the previous page and younger than the ones on the following page.

This notation is slightly harder for a human to read than Turtle's () notation, but is neither better nor worse for a computer to read - it produces the same pattern of triples. Note that this change causes https://.../ChangeLog#b1 through b3 to become legitimate URLs. This does not necessarily mean that the OSLC implementer has to honor a GET on these URLs - it's perfectly OK for the URLs to be used in the list but not actually be independently GETable - but the implementation may also choose to honor a GET.

This also gives a different and maybe more satisfactory way for a changelog to make a real resource out of each change entry. Frank did the obvious thing of allowing the entry itself to be a resource, but this creates some ambiguities and special rules, as I pointed out in my last note. If instead the list nodes were given URLs, and the entries always remained blank nodes, it might be easier to understand.

In practice, an implementation would be more likely to use https://.../ChangeLog#b103 through https://.../ChangeLog#b101 as URLs for the lists corresponding to entries of the matching sequence number.
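As a rough illustration of the points above, the sketch below names each list node after its entry's sequence number and then divides the triples into disjoint pages, newest entries first. It is a simplification under stated assumptions: example.org is a hypothetical host, build_changelog and paginate are names invented here, each entry is reduced to a placeholder blank-node label, and the entry's own triples (type, changed, at), which would have to travel on the same page as the rdf:first triple that references them, are left out.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OSLC = "http://open-services.net/ns/core#"
LOG = "https://example.org/ChangeLog"  # hypothetical; host elided in the thread


def build_changelog(seqs):
    """Build list triples for a non-empty, newest-first list of sequence numbers."""
    triples = [(LOG, OSLC + "changes", f"{LOG}#b{seqs[0]}")]
    for i, seq in enumerate(seqs):
        node = f"{LOG}#b{seq}"
        is_last = i + 1 == len(seqs)
        rest = RDF + "nil" if is_last else f"{LOG}#b{seqs[i + 1]}"
        triples.append((node, RDF + "first", f"_:entry{seq}"))
        triples.append((node, RDF + "rest", rest))
    return triples


def paginate(triples, nodes_per_page):
    """Split the triples into disjoint subsets without altering any triple."""
    head = [t for t in triples if t[0] == LOG]
    by_node = {}
    for t in triples:
        if t[0] != LOG:
            by_node.setdefault(t[0], []).append(t)
    nodes = list(by_node)  # dict insertion order follows list order
    pages = []
    for i in range(0, len(nodes), nodes_per_page):
        chunk = nodes[i:i + nodes_per_page]
        pages.append([t for n in chunk for t in by_node[n]])
    pages[0] = head + pages[0]  # the oslc:changes triple goes on the first page
    return pages


triples = build_changelog([103, 102, 101, 100])
pages = paginate(triples, nodes_per_page=3)
for number, page in enumerate(pages, start=1):
    print("page", number, "holds", len(page), "triples")
```

Because the list nodes now have URLs instead of blank-node identity, the rdf:rest triple at the end of one page can safely point at a node described on the next.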
Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690

From: Frank Budinsky/Toronto/IBM@IBMCA
To: [email protected]
Cc: Martin Nally/Raleigh/IBM@IBMUS, RELM Development
Date: 03/17/2011 11:57 AM
Subject: Updated ChangeLog Proposal

Hi All,

I've uploaded the latest version of the OSLC
change log proposal here:

http://open-services.net/pub/Main/IndexingProposals/OSLC_indexing_0316.doc

It includes changes to reflect discussion and decisions made during the first round of review, including the following:

- A section to describe the motivating use case.
- Description of a formal "Indexing Profile" which defines the capabilities that a service provider MUST support in order to be indexable.
- Proposed formal scope of indexing/changeLog.
- ChangeLog entry timestamps changed to sequence numbers.
- ChangeLog entries can optionally be referenced URI-addressable resources.

Please send comments and issues to the mailing list. We're also tentatively scheduled to discuss this during the OSLC Core Workgroup call, next week, so hopefully we can hash out most of the remaining issues by then.

Thanks,
Frank.

_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
