Remarks about XML:DB API
Hi,

Inspired by the SiXDML proposal, I've been looking some more into the XML:DB API specification (since SiXDML is partially based on the XML:DB core API spec) and have a number of remarks about it. I have not yet had time to read the specification thoroughly, so expect some more. Unfortunately, I also haven't had enough time to think of alternatives for the things I have a problem with.

Some general remarks:

* Resource and Services are perfectly abstract names, but it's hard for a user to imagine what they mean. I'm in favor of more specific names, to make it easier for users to imagine what they stand for (I only have to figure out what the right names would be).
* As Dare Obasanjo already mentioned, tying services to collections is not very practical. I think this is definitely something that should be changed.

Interface specific remarks:

Collection interface

* I think the behavior and interface of the getServices method should be changed, because:
  - Each instance of a service could take up resources, in which case you would want to instantiate those services lazily, whenever getService is called.
  - It's not likely you need them all at once.
  - If it's meant for checking the types of services supported by the collection (though personally I do not think that services should be coupled to collections at all), then it could return only the names of the services it supports.
* I'm not quite sure about the use of getResourceCount/getChildCollectionCount, since in the case of X-Hive it involves counting the resources, which of course performs badly.

CollectionManagementService interface

* I think this interface is overkill; why not add the createCollection and removeCollection methods to the Collection interface? If not, should it then check whether the collection it operates on is still open?

ResourceSet interface

* getResource(long item) will only perform well if there's a random access list behind the resource set.
* getSize will only perform well if there's a list behind the resource set.

When evaluating queries lazily (not always completely possible: for instance, if the end result or temporary results need to be sorted), you typically do not want to gather results in a list, but return them one by one using an iterator. What you typically want to prevent is users writing code like this:

    ResourceSet rs = ...;
    for (long i = 0; i < rs.getSize(); i++) {
        Resource r = rs.getResource(i);
    }

to iterate over the query results when the query is lazily evaluated. The result set would first have to gather all the query results, which essentially means that the results are iterated twice (and you may not have enough working memory to get all the results from the database). Of course, these methods could be useful when there's a list behind the resource set (for instance, when the end result needed to be sorted); in those cases you can request the size without a performance penalty. So maybe some method should be added to see if the resource set is lazy or not?

* getIterator returns a ResourceIterator. I'm more in favor of returning a java.util.Iterator (I don't see the cast that becomes necessary as a problem), and renaming the method to iterator() because that's more like other Java interfaces. I understand that this is just a matter of taste, and that having a separate interface for it could make porting the API to platforms other than Java easier.

ResourceIterator interface

* If not replaced by java.util.Iterator, I would prefer this interface to have methods named next() and hasNext() instead of nextResource() and hasMoreResources().

And finally I have a question: is there a test suite that tests conformance to the API?
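A rough Java sketch of the lazy-evaluation point above, under stated assumptions: `LazyResourceSetSketch`, `Results`, `LazyResults`, and `isLazy()` are hypothetical names that do not appear in the XML:DB API, and `String` stands in for `Resource`. It only illustrates how a result set could hand out results one at a time through a `java.util.Iterator` and let callers ask whether it is lazy before requesting a size.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch; none of these names are part of the XML:DB API.
class LazyResourceSetSketch {

    /** Stand-in for a result set; String stands in for Resource. */
    interface Results {
        /** Hypothetical: true if asking for the size would force the whole query to run. */
        boolean isLazy();
        Iterator<String> iterator();
    }

    /** Hands out results one by one from a cursor instead of listing them first. */
    static final class LazyResults implements Results {
        private final Iterator<String> cursor; // simulates the database cursor
        long fetched = 0;                      // results actually pulled so far

        LazyResults(Iterator<String> cursor) { this.cursor = cursor; }

        public boolean isLazy() { return true; }

        public Iterator<String> iterator() {
            return new Iterator<String>() {
                public boolean hasNext() { return cursor.hasNext(); }
                public String next() {
                    String r = cursor.next(); // throws NoSuchElementException when exhausted
                    fetched++;
                    return r;
                }
            };
        }
    }

    public static void main(String[] args) {
        LazyResults rs = new LazyResults(Arrays.asList("doc1", "doc2", "doc3").iterator());
        // Single pass: nothing is gathered up front and no size is requested.
        for (Iterator<String> it = rs.iterator(); it.hasNext(); ) {
            System.out.println(it.next());
        }
        System.out.println("lazy=" + rs.isLazy() + ", fetched=" + rs.fetched);
    }
}
```

A caller that really needs the size could check `isLazy()` first and accept the cost knowingly; whether such a method belongs on ResourceSet is exactly the open question.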
Kind regards,

Arno de Quaasteniet
X-Hive Corporation
+31 (0)10 710 86 24
http://www.x-hive.com
[EMAIL PROTECTED]

--
Post a message: mailto:[EMAIL PROTECTED]
Unsubscribe: mailto:[EMAIL PROTECTED]
Contact administrator: mailto:[EMAIL PROTECTED]
Read archived messages: http://archive.xmldb.org/
--
Re: Remarks about XML:DB API
On Thursday, February 7, 2002, at 02:01 PM, Tom Bradford wrote:

> Yes... and it shouldn't cause confusion because Services as they're implemented at the moment can't be repointed to other Collections. To a Service, the Collection provides context. It may be a starting context for recursive processing, or it may be a singular context... Depends on the nature of the service and how it is implemented. There's nothing stopping someone from implementing a Service that is tied to the root Collection of the database and operates on the database as a whole, but not allowing the possibility of context would be too restrictive contextually, where naming and implementation flexibility are concerned.

The problem comes if there is no root collection. For instance, I have an Oracle 9i impl where the collection hierarchy is flat. I had to synthesize a root collection in order to have a starting point to create collections. This isn't intuitive when the database doesn't support a hierarchy of collections. I actually agree with Dare on this: tying Services to collections is too limiting. We need a cleaner distinction of database level services. I don't think all services should be database level, but the concept needs to exist.

--
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/
Re: Remarks about XML:DB API
--- Tom Bradford [EMAIL PROTECTED] wrote:

> On Thursday, February 7, 2002, at 02:09 PM, Kimbro Staken wrote:
>
>> The problem comes if there is no root collection. For instance, I have an Oracle 9i impl where the collection hierarchy is flat. I had to synthesize a root collection in order to have a starting point to create collections. This isn't intuitive when the database doesn't support a hierarchy of collections. I actually agree with Dare on this: tying Services to collections is too limiting. We need a cleaner distinction of database level services. I don't think all services should be database level, but the concept needs to exist.
>
> My only argument is that Collection-level services are needed, and shouldn't be eliminated. I have no problem with adding Database level services. :)

This can easily be supported by doing what I did with SiXDML. Just add getService(String, String) to the Database class.

=
LAWS OF COMPUTER PROGRAMMING, VIII
Any non-trivial program contains at least one bug.
http://www.25hoursaday.com
Carnage4Life (slashdot/advogato/kuro5hin)
Re: Remarks about XML:DB API
On Thursday, February 7, 2002, at 02:30 PM, Dare Obasanjo wrote:

> This can easily be supported by doing what I did with SiXDML. Just add getService(String, String) to the Database class.

Here's the problem with that, though. Imagine you have a program that performs service requests in a generic fashion against Collections that are passed to it. Now, furthermore, say you have two collections: one is relationally mapped, the other is native. Because of this, the Service may have to be implemented completely differently. When you request a Service of the same name, you'll be getting back the same interface, but with a different underlying implementation. It's awkward enough that you'd have to query the Collection for its absolute path, and then pass that absolute path to the Database to resolve the Service. Add to that the fact that when you offload Service resolution responsibilities to the Database, you're asking it not only to get a Service, but to get a specific implementation based on the Collection name you're passing to it. That is more responsibility than the Database needs to handle, especially in a system where the collection structure is based on many heterogeneous data sources and implementations.

--
Tom Bradford - http://www.tbradford.org
Apache Xindice (Native XML Database) - http://xml.apache.org
Project Labrador (Web Services Framework) - http://notdotnet.org
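The scenario in the message above can be sketched roughly in Java. Everything here is an assumption made for illustration: `Service`, `Collection`, and `Database` are simplified stand-ins rather than the real XML:DB interfaces, the paths are invented, and the three-argument `Database.getService(path, name, version)` is only one guess at how a database-level lookup keyed on a collection path might look.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are not the real XML:DB interfaces.
class ServiceResolutionSketch {

    interface Service { String describe(); }

    interface Collection {
        String getPath();
        Service getService(String name, String version);
    }

    /** A relationally mapped collection supplies its own implementation. */
    static final class RelationalCollection implements Collection {
        public String getPath() { return "/db/orders"; }
        public Service getService(String name, String version) {
            return () -> name + " " + version + " via relational mapping";
        }
    }

    /** A native collection supplies a different implementation of the same interface. */
    static final class NativeCollection implements Collection {
        public String getPath() { return "/db/docs"; }
        public Service getService(String name, String version) {
            return () -> name + " " + version + " via native XML store";
        }
    }

    /** Database-level resolution: the Database has to map a path back to an implementation. */
    static final class Database {
        private final Map<String, Collection> byPath = new HashMap<>();

        void register(Collection c) { byPath.put(c.getPath(), c); }

        Service getService(String collectionPath, String name, String version) {
            Collection c = byPath.get(collectionPath);
            if (c == null) {
                throw new IllegalArgumentException("unknown collection: " + collectionPath);
            }
            return c.getService(name, version);
        }
    }

    /** Generic client code: the same call yields a different implementation per collection. */
    static String run(Collection c) {
        return c.getService("XPathQueryService", "1.0").describe();
    }

    public static void main(String[] args) {
        System.out.println(run(new RelationalCollection()));
        System.out.println(run(new NativeCollection()));
    }
}
```

In this sketch the collection-level call needs no lookup table, while the database-level variant must know about every collection implementation, which is the extra responsibility being objected to.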
Re: Remarks about XML:DB API
On Thursday, February 7, 2002, at 02:40 PM, Tom Bradford wrote:

> On Thursday, February 7, 2002, at 02:30 PM, Dare Obasanjo wrote:
>
>> This can easily be supported by doing what I did with SiXDML. Just add getService(String, String) to the Database class.
>
> Here's the problem with that, though. Imagine you have a program that performs service requests in a generic fashion against Collections that are passed to it. Now, furthermore, say you have two collections: one is relationally mapped, the other is native. Because of this, the Service may have to be implemented completely differently. When you request a Service of the same name, you'll be getting back the same interface, but with a different underlying implementation. It's awkward enough that you'd have to query the Collection for its absolute path, and then pass that absolute path to the Database to resolve the Service. Add to that the fact that when you offload Service resolution responsibilities to the Database, you're asking it not only to get a Service, but to get a specific implementation based on the Collection name you're passing to it. That is more responsibility than the Database needs to handle, especially in a system where the collection structure is based on many heterogeneous data sources and implementations.

I don't think he was suggesting that this should be the only way to access collections, just an addendum. The one problem I do see with it is that it changes the concept of the Database. In the current API you shouldn't be using the Database instance for anything beyond the initial setup. If we move logic like getService into it, then you'll actually be using the Database instance in other places as well. Not a major problem, but not as simple as just adding one method. We'd probably need a method on Collection to return the Database instance.

Another option would be to change the getService method to enable specification of what scope the service applies to. I almost like that better:

    Collection.getService(name, version, scope)

where scope is one of three values: database, collection, or hierarchy. These could be defined as constants in the Service interface. Hierarchy would apply to the collection and all children of the collection. Either way would work, though.

--
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/
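The scoped lookup proposed in the message above, Collection.getService(name, version, scope), might look something like the following Java sketch. The constant names `SCOPE_DATABASE`, `SCOPE_COLLECTION`, and `SCOPE_HIERARCHY`, the `target()` method, and the simplified `Service` and `Collection` types are all assumptions made for illustration; the message only says there would be three scope values defined as constants in the Service interface.

```java
// Hypothetical sketch of the scoped lookup; not part of the XML:DB API.
class ScopedServiceSketch {

    interface Service {
        // Scope constants on the Service interface; the names are assumed.
        int SCOPE_DATABASE = 0;
        int SCOPE_COLLECTION = 1;
        int SCOPE_HIERARCHY = 2;

        /** Describes what the service instance operates on. */
        String target();
    }

    static final class Collection {
        private final String path;

        Collection(String path) { this.path = path; }

        /** Resolves a service and binds it to the requested scope. */
        Service getService(String name, String version, int scope) {
            switch (scope) {
                case Service.SCOPE_DATABASE:
                    return () -> name + " " + version + " on the whole database";
                case Service.SCOPE_COLLECTION:
                    return () -> name + " " + version + " on " + path + " only";
                case Service.SCOPE_HIERARCHY:
                    return () -> name + " " + version + " on " + path + " and all child collections";
                default:
                    throw new IllegalArgumentException("unknown scope: " + scope);
            }
        }
    }

    public static void main(String[] args) {
        Collection docs = new Collection("/db/docs");
        System.out.println(docs.getService("XPathQueryService", "1.0", Service.SCOPE_COLLECTION).target());
        System.out.println(docs.getService("XPathQueryService", "1.0", Service.SCOPE_HIERARCHY).target());
        System.out.println(docs.getService("XPathQueryService", "1.0", Service.SCOPE_DATABASE).target());
    }
}
```

With this shape the Database instance stays out of everyday code, at the cost of one extra argument on every getService call.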