On Thursday, February 7, 2002, at 05:38 AM, Arno de Quaasteniet wrote:
Hi,
Inspired by the SixDML proposal, I've been looking some more into the XML:DB API specification (since it's partially based on the XML:DB core API spec) and have a number of remarks about it, though I did not yet have time to read the specification thoroughly, so expect some more. Unfortunately, I also didn't have enough time to think of alternatives to the things I have a problem with.
Some general remarks:
* Resource and Services are perfectly abstract names, but it's hard for a user to imagine what they mean. I'm in favor of more specific names, to make it easier for users to imagine what they stand for (I only have to figure out what the right names would be).
I'd like to hear some suggestions as this is something we toiled over a fair bit in the beginning. However, I'll also say it hasn't really been a problem. We've had hundreds of people use the API through Xindice and the naming hasn't seemed to cause any confusion. In fact I'm kind of surprised at how easily people picked up on it.
* As Dare Obasanjo already mentioned, the tying of services to collections is not very practical. I think this is definitely something that should be changed.
Yes, we need some changes here.
Interface specific remarks:
Collection interface
* I think the behavior and interface of the getServices method should be changed, because:
  - Each instance of a service could possibly take up resources, in which case you would want to instantiate those services lazily, whenever getService is called.
  - It's not likely you need them all at once.
  - If it's meant for checking the types of services supported by the collection (though personally I do not think that services should be coupled to collections at all), then it could return only the names of the services it supports.
We originally had a separate method to check for the existence of a service and it was decided later that it was not really necessary. Your point about the potential for heavy services is a valid one though so you may be right that the mechanism needs to be refined.
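To make the lazy-instantiation idea concrete, here is a rough Java sketch. It is not part of the XML:DB API: `LazyServiceRegistry`, `getServiceNames` and `instantiatedCount` are hypothetical names, and `Service` is a minimal stand-in for the real interface. The point is only that listing the supported names need not create any service, while `getService` creates one on first use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the API's Service interface.
interface Service {
    String getName();
}

// Hypothetical registry that stores factories rather than live service instances.
class LazyServiceRegistry {
    private final Map<String, Supplier<Service>> factories = new LinkedHashMap<>();
    private final Map<String, Service> instances = new LinkedHashMap<>();

    void register(String name, Supplier<Service> factory) {
        factories.put(name, factory);
    }

    // Cheap: reports which services are supported without creating any of them.
    List<String> getServiceNames() {
        return new ArrayList<>(factories.keySet());
    }

    // Creates the service on first request and caches it; null if unsupported.
    Service getService(String name) {
        Service cached = instances.get(name);
        if (cached != null) {
            return cached;
        }
        Supplier<Service> factory = factories.get(name);
        if (factory == null) {
            return null;
        }
        Service created = factory.get();
        instances.put(name, created);
        return created;
    }

    // Number of services that have actually been instantiated so far.
    int instantiatedCount() {
        return instances.size();
    }
}
```

A tool that only wants to know what a collection supports would call `getServiceNames()` and never pay for a heavy service it does not use.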
* I'm not quite sure about the use of getResourceCount/getChildCollectionCount, since in the case of X-Hive it involves counting the resources which of course has a bad performance characteristic.
Unfortunately the functionality is needed to build usable tools.
CollectionManagementService interface
* I think this interface is overkill; why not add the createCollection and removeCollection methods to the Collection interface? If not, should it then check if the collection it operates on is still open?
Not all databases can use that interface, it's too simplistic for something like Tamino where schemas are required. I added it just to have something that was usable for simple cases, so it's optional.
ResourceSet interface
* getResource(long item) will only have a good performance if there's a random access list behind the resource set.
* getSize will only have a good performance if there's a list behind the resource set.
Optimize this and that's where you get competitive advantage. :-)
When evaluating queries lazily (not always completely possible, for instance if the end result or temporary results need to be sorted), you typically do not want to gather results in a list, but return them one by one using an iterator.
What you typically want to prevent is that users use code like this:
    ResourceSet rs = ...;
    for (long i = 0; i < rs.getSize(); i++) {
        Resource r = rs.getResource(i);
    }

to iterate over the query results when the query is lazily evaluated, because this would mean that the result set must first gather all the query results, which would essentially mean that the results are iterated twice (and you may not have enough working memory to get all the results from the database).
Again this is an implementation detail. There is no reason that the getSize operation has to be calculated from the contents of the result set.
It could easily be provided by the database. Doing that would allow lazy retrieval of results.
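As a rough illustration of that point, the Java sketch below separates the size from the results. Everything in it is an assumption made for the example: `LazyResultSet`, `hasMore`, `next` and `fetchedSoFar` are hypothetical names, the cursor is an ordinary `java.util.Iterator`, and the count is a `LongSupplier` standing in for whatever the database can answer cheaply.

```java
import java.util.Iterator;
import java.util.function.LongSupplier;

// Hypothetical result set: the size comes from the database, the results from a cursor.
class LazyResultSet<T> {
    private final Iterator<T> cursor;      // yields results one at a time
    private final LongSupplier countQuery; // stand-in for a count answered by the database
    private long fetched = 0;

    LazyResultSet(Iterator<T> cursor, LongSupplier countQuery) {
        this.cursor = cursor;
        this.countQuery = countQuery;
    }

    // Answered by the database; does not pull anything from the cursor.
    long getSize() {
        return countQuery.getAsLong();
    }

    boolean hasMore() {
        return cursor.hasNext();
    }

    T next() {
        T result = cursor.next();
        fetched++;
        return result;
    }

    // How many results have actually been pulled from the cursor so far.
    long fetchedSoFar() {
        return fetched;
    }
}
```

Whether the count is cheap depends entirely on the database; if the database has to run the whole query to count, this arrangement saves memory but not time.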
Though of course these methods could be useful when there's a list behind the resource set (for instance when the end result needed to be sorted); in those cases you can request the size without a performance penalty.
So maybe some method should be added to see if the resource set is lazy or not?
What would be the use case for this?
* getIterator returns a ResourceIterator. I'm more in favor of returning a java.util.Iterator (I don't see the cast that becomes necessary as a problem), and renaming the method to iterator() because that's more like other Java interfaces, though I understand that this is just a matter of taste, and having a separate interface for it could make porting the API to platforms other than Java easier.
As Tom already pointed out the API is intended to be as language independent as possible. This is a big source of compromises, i.e. things like error codes instead of a collection hierarchy but necessary because we're specifying in IDL. We are a little loose with it though because we use things like DOM and SAX which aren't always precisely defined for other languages. They do exist in other languages though.
* The ResourceIterator interface: if it is not replaced by java.util.Iterator, I would prefer that this interface have methods named next() and hasNext() instead of nextResource() and hasMoreResources().
And finally I have a question: is there a test suite that tests conformance to the API?
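For Java users who prefer the java.util style, a thin adapter can bridge the two without changing the specification. The sketch below is only an illustration: `Resource` and `ResourceIterator` are minimal stand-ins that keep the method names discussed here (the real interfaces may declare checked exceptions, which are omitted), and `ResourceIteratorAdapter` is a hypothetical class name.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Minimal stand-in for the API's Resource; only what the example needs.
interface Resource {
    String getId();
}

// Minimal stand-in that keeps the method names discussed in this thread.
interface ResourceIterator {
    boolean hasMoreResources();
    Resource nextResource();
}

// Hypothetical adapter exposing a ResourceIterator as a java.util.Iterator.
class ResourceIteratorAdapter implements Iterator<Resource> {
    private final ResourceIterator delegate;

    ResourceIteratorAdapter(ResourceIterator delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasMoreResources();
    }

    @Override
    public Resource next() {
        // Follow the java.util.Iterator contract when the delegate is exhausted.
        if (!delegate.hasMoreResources()) {
            throw new NoSuchElementException();
        }
        return delegate.nextResource();
    }
}
```

This keeps the language-neutral interface in the specification while letting Java code use the familiar hasNext()/next() loop.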
Yes, though neither the API nor the test suite is complete. If you download the reference impl there is a test suite as well as a set of base classes that can be used to make driver development easier.
Kind regards,
Arno de Quaasteniet X-Hive Corporation +31 (0)10 710 86 24 http://www.x-hive.com [EMAIL PROTECTED]
---------------------------------------------------------------------- Post a message: mailto:[EMAIL PROTECTED] Unsubscribe: mailto:[EMAIL PROTECTED] Contact administrator: mailto:[EMAIL PROTECTED] Read archived messages: http://archive.xmldb.org/ ----------------------------------------------------------------------
Kimbro Staken XML Database Software, Consulting and Writing http://www.xmldatabases.org/