Hi all,

One final part of the Indexing/ChangeLog proposal (http://open-services.net/bin/view/Main/IndexingProposals) that still requires more discussion is the scope and enumeration of the resources being indexed. Section 1.2.1 of the current proposal document (http://open-services.net/pub/Main/IndexingProposals/OSLC_indexing_0316.doc) says:
A fundamental requirement of the Indexing Profile is that a service provider expose its set of indexable resources. This set of resources can, but may not be, all of the resources owned by the service provider. This set of resources defines the scope for the other capabilities of the indexing profile (e.g., the resources reported by the change log).

This raises two questions: what exactly are "indexable resources", and how are they "exposed"?

I think our answer to the first question is that indexable resources are simply "public resources", i.e., the resources that the service provider does not consider to be internal implementation artifacts. Anything more specific than that starts to put the service provider implementer in the business of guessing what clients may or may not want to index or, more generally, track. (Note that we've been discussing, at last week's WG meeting and on another thread, that the capabilities we're defining for ChangeLog are really of more general use than just indexing.) Since we're already planning to rename "Indexing Profile" to something more general, I suggest we also use the term "public resources" instead of "indexable resources" to describe what we're exposing/enumerating here.

The second question, "how are they exposed", is more interesting. Two goals for this design have been identified during design discussions so far:

1. Decouple this as much as possible from the OSLC service discovery model
2. Make it possible to leverage/reuse OSLC Query Capabilities, if they are available

The current version of the proposal does a better job of meeting goal #2 than #1. It says this:

In the simplest case, a service provider can provide exactly one Query Capability (i.e., queryBase URI) which includes all of its contained resources. On the other hand, it may instead provide several Query Capabilities, each exposing only a subset of the indexable resources.
This is convenient if different types are most easily returned using custom member properties.

Fundamentally, what we require in the simplest case is a single URI on which GET can be called to retrieve the complete set of public resources. A queryBase URI is the existing OSLC mechanism that can be used to do basically that (and it also provides the necessary paging mechanism). I was suggesting that we simply reuse the existing way of publishing queryBase URIs, that is, using the QueryCapabilities of the ServiceProvider. With this approach, we only need to define one additional mechanism: a way to identify the QueryCapability, among possibly many of a ServiceProvider, that represents the complete set of public resources.

However, this simple single-queryBase design won't work if a service provider wants to return its public resources using several (e.g., type-specific) lists. To support this we allow the service provider to identify several QueryCapabilities, instead of just one. The current proposal uses the obvious mechanism for this, the oslc:usage property of QueryCapability. The current proposal says it like this:

The one or more Query Capabilities, of a service provider, that return the indexable resources MUST be designated with an oslc:usage property with a value of http://open-services/ns/core#index.

(Note that the exact value of the oslc:usage property will change, probably to something more like http://open-services/ns/core/#publicResources.)

This design meets goal #2 above, but does a poor job of decoupling the design from the OSLC model, especially Query Capability. It would seem conceptually simpler to just list one or more URIs as the ones that enumerate the public resources of the ServiceProvider. These could be the same URIs that are also used as the queryBase URIs in some QueryCapabilities, but if no query support is implemented, no QueryCapabilities would be needed.
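
For reference, the oslc:usage-based selection in the current proposal amounts to something like the following rough Python sketch. It assumes the ServiceProvider description has already been parsed into plain dicts; the dict layout, the function name, and the example.com URIs are all hypothetical, and the usage URI is the placeholder value from the current proposal, which is expected to change.

```python
# Placeholder usage value taken from the current proposal; expected to change.
INDEX_USAGE = "http://open-services/ns/core#index"


def public_resource_query_bases(service_provider, usage=INDEX_USAGE):
    """Return the queryBase URIs of the QueryCapabilities marked with `usage`.

    `service_provider` is a hypothetical in-memory form of a parsed
    ServiceProvider; the dict layout is an assumption made for this sketch
    and is not defined by OSLC.
    """
    return [
        qc["oslc:queryBase"]
        for qc in service_provider.get("queryCapabilities", [])
        if usage in qc.get("oslc:usage", [])
    ]


# Illustrative data only: two type-specific lists marked as enumerating
# public resources, plus one ordinary query capability that is not.
example_provider = {
    "queryCapabilities": [
        {
            "dcterms:title": "All defects",
            "oslc:queryBase": "http://example.com/defects",
            "oslc:usage": [INDEX_USAGE],
        },
        {
            "dcterms:title": "All tasks",
            "oslc:queryBase": "http://example.com/tasks",
            "oslc:usage": [INDEX_USAGE],
        },
        {
            "dcterms:title": "My defects",
            "oslc:queryBase": "http://example.com/defects/mine",
            "oslc:usage": [],
        },
    ]
}

print(public_resource_query_bases(example_provider))
```

A client would then call GET on each returned URI (following paging links) to enumerate the public resources; nothing in this selection step requires the provider to accept any query parameters.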
This simpler model (i.e., a list of queryBase URIs - or even better, a single queryBase URI - as opposed to a marked subset of a ServiceProvider's QueryCapabilities), however, is not quite enough. An important property of a QueryCapability, oslc:resourceShape, is used to specify a type-specific member property (instead of rdfs:member) that is used to aggregate the resources. A queryBase URI on its own does not include a ResourceShape, so if we do not use QueryCapability, we would need to provide it some other way.

A QueryCapability contains the following properties:

    dcterms:title        exactly-one
    oslc:label           zero-or-one
    oslc:queryBase       exactly-one
    oslc:resourceShape   zero-or-one
    oslc:resourceType    zero-or-many
    oslc:usage           zero-or-many

Notice that in addition to oslc:queryBase and oslc:resourceShape, both of which we require, it only includes one additional required property: dcterms:title. The others are optional (and oslc:resourceType might in fact be useful in our use case). This makes me think it may not be such a bad fit anyway. Otherwise we'd need to provide exactly what we need some other way, even in the (likely) case that a QueryCapability is also providing the same information.

Another concern with tying this to QueryCapability is that it can be construed to imply that full query support will be required from every ServiceProvider that simply wants to expose its public resources. However, although the OSLC spec is not totally clear on this, it does say that a Query Capability MAY support the default OSLC query syntax or it MAY support some other query syntax. Therefore, I'm assuming it MAY support no query syntax at all. If this is true, then a ServiceProvider is not required to support query at all, even if it does expose queryBase URIs using QueryCapabilities. We'd need to document clearly that we require only the queryBase; no actual query syntax needs to be supported.
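
To illustrate how little a ServiceProvider would have to publish under this reading, here is a small Python sketch that checks a QueryCapability against the cardinalities listed above. The representation (a dict mapping property names to lists of values), the function name, and the example.com URI are my own assumptions for illustration, not anything defined by OSLC.

```python
# Cardinalities as listed above: (minimum, maximum); None means unbounded.
CARDINALITY = {
    "dcterms:title": (1, 1),
    "oslc:label": (0, 1),
    "oslc:queryBase": (1, 1),
    "oslc:resourceShape": (0, 1),
    "oslc:resourceType": (0, None),
    "oslc:usage": (0, None),
}


def cardinality_problems(capability):
    """Return a list of human-readable problems; an empty list means none found.

    `capability` maps property names to lists of values. This is a
    hypothetical representation chosen for the sketch.
    """
    problems = []
    for prop, (low, high) in CARDINALITY.items():
        count = len(capability.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems


# The smallest capability that satisfies the listed cardinalities:
# a title and a queryBase, with no query syntax implied anywhere.
minimal = {
    "dcterms:title": ["Public resources"],
    "oslc:queryBase": ["http://example.com/public"],
}

print(cardinality_problems(minimal))
print(cardinality_problems({"oslc:queryBase": ["http://example.com/public"]}))
```

For our use case we would additionally want oslc:resourceShape to be present, which this sketch does not enforce since the general cardinality is zero-or-one.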

So, to summarize, there seems to be a fairly clean way to reuse existing mechanisms to enumerate public resources, but it's not totally clear whether we should use it or would be better off coming up with something else. Please send me your thoughts.

Thanks,
Frank.
