Remarks about XML:DB API

2002-02-07 Thread Arno de Quaasteniet
Hi,

Inspired by the SiXDML proposal I've been looking some more into the
XML:DB API specification (since SiXDML is partially based on the XML:DB
core API spec) and have a number of remarks about it, though I did not yet
have time to read the specification thoroughly, so expect some more.
Unfortunately I also didn't have enough time to think of alternatives to
the things I have a problem with.

Some general remarks:
* Resource and Services are perfectly abstract names but it's hard to
imagine for a user what they mean. I'm in favor of more specific names,
to make it easier for users to imagine what they stand for (I only have
to figure out what the right names would be).
* As Dare Obasanjo already mentioned, the tying of services to
collections is not very practical. I think this is definitely something
that should be changed.

Interface specific remarks:

Collection interface

* I think the behavior and interface of the getServices method should be
changed, because:
- Each instance of a service could possibly take up resources, in which
case you would want to instantiate those services lazily, whenever
getService is called.
- It's not likely you need them all at once.
- If it's meant for checking the types of services supported by the
collection (though personally I do not think that services should be
coupled to collections at all) then it could return only the names of
the services it supports.
* I'm not quite sure about the use of
getResourceCount/getChildCollectionCount, since in the case of X-Hive it
involves counting the resources which of course has a bad performance
characteristic.
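
A minimal sketch of the lazy approach described in the first remark, under stated assumptions: `Service`, `NamedService`, `LazyServiceCollection`, `getServiceNames` and `instantiatedCount` are hypothetical names made up for illustration and are not part of the XML:DB API. Services are registered as factories, the capability check returns names only, and an instance is created the first time getService asks for it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a service; not the XML:DB Service interface.
interface Service {
    String getName();
}

// Trivial service implementation used only for this illustration.
class NamedService implements Service {
    private final String name;

    NamedService(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

// Hypothetical collection that creates a service only when it is requested.
class LazyServiceCollection {
    private final Map<String, Supplier<Service>> factories = new LinkedHashMap<>();
    private final Map<String, Service> instances = new LinkedHashMap<>();

    void register(String name, Supplier<Service> factory) {
        factories.put(name, factory);
    }

    // Capability check: returns the supported names without creating anything.
    Set<String> getServiceNames() {
        return factories.keySet();
    }

    // Creates the service on first request and reuses it afterwards.
    Service getService(String name) {
        Supplier<Service> factory = factories.get(name);
        if (factory == null) {
            return null; // not supported by this collection
        }
        return instances.computeIfAbsent(name, key -> factory.get());
    }

    // Number of services that have actually been instantiated so far.
    int instantiatedCount() {
        return instances.size();
    }
}

class LazyServicesDemo {
    public static void main(String[] args) {
        LazyServiceCollection col = new LazyServiceCollection();
        col.register("XPathQueryService", () -> new NamedService("XPathQueryService"));
        col.register("XUpdateQueryService", () -> new NamedService("XUpdateQueryService"));

        System.out.println(col.getServiceNames());   // names only, nothing created
        System.out.println(col.instantiatedCount()); // still no instances
        col.getService("XPathQueryService");
        System.out.println(col.instantiatedCount()); // only the requested service exists
    }
}
```

Whether the names should come from the collection or from somewhere else is exactly the open question in the remarks above; the sketch only shows that listing and instantiating can be separated.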

CollectionManagementService interface

* I think this interface is overkill; why not add the createCollection
and removeCollection methods to the Collection interface? If not, should
it then check if the collection it operates on is still open?

ResourceSet interface

* getResource(long item) will only have a good performance if there's a
random access list behind the resource set.
* getSize will only have a good performance if there's a list behind the
resource set.

When evaluating queries lazily (not always completely possible: for
instance if the end result, or temporary results, need to be sorted), you
typically do not want to gather results in a list, but return them one
by one using an iterator.

What you typically want to prevent is that users use code like this:

ResourceSet rs = ...;
for (long i = 0; i < rs.getSize(); i++) {
    Resource r = rs.getResource(i);
}

to iterate over the query results when the query is lazily evaluated,
because this would mean that the result set must first gather all the
query results, which essentially means that the results are iterated
twice (and you may not have enough working memory to get all the results
from the database).

Of course these methods could still be useful when there's a list
behind the resource set (for instance when the end result needed to be
sorted); in those cases you can request the size without a performance
penalty.

So maybe some method should be added to see if the resource set is lazy
or not?
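
To make the difference concrete, here is a rough sketch under stated assumptions: `LazyResultSet`, `isLazy` and `bufferedCount` are hypothetical names, plain strings stand in for resources, and a java.util.Iterator stands in for the running query. It is not the XML:DB ResourceSet, only an illustration of why getSize and getResource(i) force buffering while an iterator does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical result set over a lazily evaluated query.
class LazyResultSet implements Iterable<String> {
    private final Iterator<String> source;             // stands in for the running query
    private final List<String> buffer = new ArrayList<>();

    LazyResultSet(Iterator<String> source) {
        this.source = source;
    }

    // True while some results have not been pulled from the query yet.
    boolean isLazy() {
        return source.hasNext();
    }

    // Number of results currently held in memory.
    int bufferedCount() {
        return buffer.size();
    }

    // Has to run the whole query and keep every result in order to count them.
    long getSize() {
        while (source.hasNext()) {
            buffer.add(source.next());
        }
        return buffer.size();
    }

    // Has to evaluate (and keep) everything up to the requested index.
    String getResource(long index) {
        while (buffer.size() <= index && source.hasNext()) {
            buffer.add(source.next());
        }
        if (index < 0 || index >= buffer.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return buffer.get((int) index);
    }

    // Hands out already buffered results first, then streams the rest unbuffered.
    @Override
    public Iterator<String> iterator() {
        final Iterator<String> buffered = buffer.iterator();
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                return buffered.hasNext() || source.hasNext();
            }

            @Override
            public String next() {
                return buffered.hasNext() ? buffered.next() : source.next();
            }
        };
    }
}

class LazyResultSetDemo {
    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");

        LazyResultSet streamed = new LazyResultSet(data.iterator());
        for (String r : streamed) {
            System.out.println(r);
        }
        System.out.println(streamed.bufferedCount()); // nothing had to be buffered

        LazyResultSet indexed = new LazyResultSet(data.iterator());
        for (long i = 0; i < indexed.getSize(); i++) {
            indexed.getResource(i);
        }
        System.out.println(indexed.bufferedCount()); // every result was buffered
    }
}
```

A caller could check isLazy() before deciding between the index loop and the iterator; whether such a method belongs in the API is the question raised above.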

* getIterator returns a ResourceIterator. I'm more in favor of returning
a java.util.Iterator (I don't see the cast that becomes necessary as a
problem), and renaming the method to iterator() because that's more like
other Java interfaces, though I understand that this is just a matter of
taste, and having its own interface could make porting the API to
platforms other than Java easier.

* The ResourceIterator interface
If it is not replaced by java.util.Iterator, I would prefer this interface
to have methods named next() and hasNext() instead of nextResource()
and hasMoreResources().
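
For what it's worth, the two styles can be bridged with a small adapter. The sketch below is only an illustration: `ResourceIterator` is redeclared here as a simplified generic stand-in that uses the two method names mentioned above, and `ResourceIteratorAdapter` is a hypothetical helper. The real XML:DB interface differs (for example in its exception declarations).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Simplified stand-in using the method names discussed above.
interface ResourceIterator<R> {
    boolean hasMoreResources();

    R nextResource();
}

// Hypothetical adapter exposing a ResourceIterator through java.util.Iterator.
class ResourceIteratorAdapter<R> implements Iterator<R> {
    private final ResourceIterator<R> delegate;

    ResourceIteratorAdapter(ResourceIterator<R> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasMoreResources();
    }

    @Override
    public R next() {
        // java.util.Iterator requires NoSuchElementException when exhausted.
        if (!delegate.hasMoreResources()) {
            throw new NoSuchElementException();
        }
        return delegate.nextResource();
    }
}

class IteratorAdapterDemo {
    // Builds a ResourceIterator over a list, purely for demonstration.
    static <R> ResourceIterator<R> over(List<R> items) {
        final Iterator<R> it = items.iterator();
        return new ResourceIterator<R>() {
            @Override
            public boolean hasMoreResources() {
                return it.hasNext();
            }

            @Override
            public R nextResource() {
                return it.next();
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> it = new ResourceIteratorAdapter<>(over(Arrays.asList("doc1", "doc2")));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```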

And finally I have a question: is there a test suite that tests
conformance to the API?

Kind regards,

Arno de Quaasteniet
X-Hive Corporation
+31 (0)10 710 86 24
http://www.x-hive.com
[EMAIL PROTECTED]
 
--
Post a message: mailto:[EMAIL PROTECTED]
Unsubscribe:mailto:[EMAIL PROTECTED]
Contact administrator:  mailto:[EMAIL PROTECTED]
Read archived messages: http://archive.xmldb.org/
--


Re: Remarks about XML:DB API

2002-02-07 Thread Kimbro Staken
On Thursday, February 7, 2002, at 02:01 PM, Tom Bradford wrote:
> Yes... and it shouldn't cause confusion because Services as they're
> implemented at the moment can't be repointed to other Collections.  To a
> Service, the Collection provides context.  It may be a starting context
> for recursive processing, or it may be a singular context... Depends on
> the nature of, and how the service is implemented.  There's nothing
> stopping someone from implementing a Service that is tied to the root
> Collection of the database and operates on the database as a whole, but
> not allowing the possibility of context would be too restrictive
> contextually, where naming and implementation flexibility are concerned.

The problem comes if there is no root collection. For instance I have an 
Oracle 9i impl where the collection hierarchy is flat. I had to synthesize 
a root collection in order to have a starting point to create collections.
This isn't intuitive when the database doesn't support a hierarchy of
collections. I actually agree with Dare on this, Services tied to 
collections is too limiting. We need a cleaner distinction of database 
level services. I don't think all services should be database level, but 
the concept needs to exist.

--
Tom Bradford - http://www.tbradford.org
Apache Xindice (Native XML Database) - http://xml.apache.org
Project Labrador (Web Services Framework) - http://notdotnet.org

Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/


Re: Remarks about XML:DB API

2002-02-07 Thread Dare Obasanjo

--- Tom Bradford [EMAIL PROTECTED] wrote:
> On Thursday, February 7, 2002, at 02:09 PM, Kimbro Staken wrote:
> > The problem comes if there is no root collection. For instance I have
> > an Oracle 9i impl where the collection hierarchy is flat. I had to
> > synthesize a root collection in order to have a starting point to
> > create collections.
> > This isn't intuitive when the database doesn't support a hierarchy of
> > collections. I actually agree with Dare on this, Services tied to
> > collections is too limiting. We need a cleaner distinction of database
> > level services. I don't think all services should be database level,
> > but the concept needs to exist.
>
> My only argument is that Collection-level services are needed, and
> shouldn't be eliminated.  I have no problem with adding Database level
> services.

:) 

This can easily be supported by doing what I did with
SiXDML. Just add getService(String, String) to the
Database class. 
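
A bare-bones sketch of what such a database-level lookup could look like. Everything here is hypothetical and not taken from SiXDML or the XML:DB API: the class names are made up, and the two String parameters are assumed to be a service name and a version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service type carrying only a name and a version.
class DbService {
    private final String name;
    private final String version;

    DbService(String name, String version) {
        this.name = name;
        this.version = version;
    }

    String getName() {
        return name;
    }

    String getVersion() {
        return version;
    }
}

// Hypothetical Database holding services that are not tied to any collection.
class SketchDatabase {
    private final Map<String, DbService> services = new HashMap<>();

    void registerService(DbService service) {
        services.put(key(service.getName(), service.getVersion()), service);
    }

    // Assumed parameter meaning: (name, version). Returns null if unknown.
    DbService getService(String name, String version) {
        return services.get(key(name, version));
    }

    private static String key(String name, String version) {
        return name + "/" + version;
    }
}

class DatabaseServiceDemo {
    public static void main(String[] args) {
        SketchDatabase db = new SketchDatabase();
        db.registerService(new DbService("CollectionManagementService", "1.0"));

        DbService found = db.getService("CollectionManagementService", "1.0");
        System.out.println(found != null ? found.getName() : "not supported");
    }
}
```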

=
LAWS OF COMPUTER PROGRAMMING, VIII  
Any non-trivial program contains at least one bug. 
http://www.25hoursaday.com   
Carnage4Life (slashdot/advogato/kuro5hin)



Re: Remarks about XML:DB API

2002-02-07 Thread Tom Bradford
On Thursday, February 7, 2002, at 02:30 PM, Dare Obasanjo wrote:
> This can easily be supported by doing what I did with
> SiXDML. Just add getService(String, String) to the
> Database class.

Here's the problem with that though.  Imagine you have a program that 
performs service requests in a generic fashion against Collections that 
are passed to it.  Now furthermore, say you have two collections, one is 
a collection that is relationally mapped, the other that is native.  
Because of this, the Service may have to be implemented completely 
differently.  When you request a Service of the same name, you'll be 
getting back the same interface, but with a different underlying 
implementation.

It's awkward enough that you'd have to query the Collection for its 
absolute path, and then pass that absolute path to the Database to 
resolve the Service, but add to that the fact that when you offload 
Service resolution responsibilities to the Database, you're asking it 
not only to get a Service, but to get a specific implementation based on 
the Collection name you're passing to it, which is more responsibility 
than the Database needs to handle, especially in a system where the 
collection structure is based on many heterogeneous data sources and 
implementations.
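
A small sketch of the situation described above, with made-up names throughout (`QueryService`, `DbCollection`, `NativeCollection`, `RelationalCollection`, `GenericClient` are all hypothetical, not XML:DB types): generic code asks whatever Collection it was handed for a service by name and gets back the same interface with a different implementation underneath.

```java
// Hypothetical service interface shared by all collection types.
interface QueryService {
    String describe();
}

// Hypothetical collection abstraction: each collection resolves its own services.
interface DbCollection {
    String getName();

    QueryService getService(String serviceName);
}

// Collection backed by a native XML store.
class NativeCollection implements DbCollection {
    private final String name;

    NativeCollection(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public QueryService getService(String serviceName) {
        if (!"QueryService".equals(serviceName)) {
            return null;
        }
        return () -> "native query on " + name;
    }
}

// Collection mapped onto relational tables.
class RelationalCollection implements DbCollection {
    private final String name;

    RelationalCollection(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public QueryService getService(String serviceName) {
        if (!"QueryService".equals(serviceName)) {
            return null;
        }
        return () -> "relational query on " + name;
    }
}

class GenericClient {
    // Works against any collection without knowing how the service is implemented.
    static String run(DbCollection col) {
        QueryService qs = col.getService("QueryService");
        return qs == null ? "unsupported" : qs.describe();
    }

    public static void main(String[] args) {
        System.out.println(run(new NativeCollection("docs")));
        System.out.println(run(new RelationalCollection("orders")));
    }
}
```

With resolution on the collection, the client never needs the absolute path or the Database instance; moving the lookup to the Database would mean it has to make this per-collection choice itself.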

--
Tom Bradford - http://www.tbradford.org
Apache Xindice (Native XML Database) - http://xml.apache.org
Project Labrador (Web Services Framework) - http://notdotnet.org


Re: Remarks about XML:DB API

2002-02-07 Thread Kimbro Staken
On Thursday, February 7, 2002, at 02:40 PM, Tom Bradford wrote:

> On Thursday, February 7, 2002, at 02:30 PM, Dare Obasanjo wrote:
> > This can easily be supported by doing what I did with
> > SiXDML. Just add getService(String, String) to the
> > Database class.
>
> Here's the problem with that though.  Imagine you have a program that
> performs service requests in a generic fashion against Collections that
> are passed to it.  Now furthermore, say you have two collections, one is
> a collection that is relationally mapped, the other that is native.
> Because of this, the Service may have to be implemented completely
> differently.  When you request a Service of the same name, you'll be
> getting back the same interface, but with a different underlying
> implementation.
>
> It's awkward enough that you'd have to query the Collection for its
> absolute path, and then pass that absolute path to the Database to
> resolve the Service, but add to that the fact that when you offload
> Service resolution responsibilities to the Database, you're asking it not
> only to get a Service, but to get a specific implementation based on the
> Collection name you're passing to it, which is more responsibility than
> the Database needs to handle, especially in a system where the collection
> structure is based on many heterogeneous data sources and implementations.

I don't think he was suggesting that this should be the only way to access
collections, just an addendum.

The one problem I do see with it is that it changes the concept of the
Database. In the current API you shouldn't be using the database instance
for anything beyond the initial setup. If we move logic like getService
into it then you'll actually be using the Database instance in other
places as well. Not a major problem, but not as simple as just adding one
method. We'd probably need a method on Collection to return the Database
instance. Or another option would be to change the getService method to
enable specification of what scope the service applies to. I almost like
that better.

Collection.getService(name, version, scope) where scope is one of three
values: database, collection, or hierarchy. These could be defined as
constants in the Service interface. Hierarchy would apply to the
collection and all children of the collection.

Either way would work though.
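
A rough sketch of the scoped variant, just to make the three values concrete. All names (`ServiceScope`, `ScopedService`, `ScopedCollection`) are hypothetical; an enum is used instead of constants on the Service interface for brevity, and "database" scope is modeled here as everything under the topmost ancestor, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scope values for a service request.
enum ServiceScope { DATABASE, COLLECTION, HIERARCHY }

// Hypothetical service that simply records which collections it applies to.
class ScopedService {
    final String name;
    final String version;
    final List<String> targets;

    ScopedService(String name, String version, List<String> targets) {
        this.name = name;
        this.version = version;
        this.targets = targets;
    }
}

// Hypothetical collection with a parent/children hierarchy.
class ScopedCollection {
    private final String name;
    private final ScopedCollection parent;
    private final List<ScopedCollection> children = new ArrayList<>();

    ScopedCollection(String name, ScopedCollection parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    ScopedService getService(String serviceName, String version, ServiceScope scope) {
        List<String> targets = new ArrayList<>();
        switch (scope) {
            case COLLECTION:
                targets.add(name);          // this collection only
                break;
            case HIERARCHY:
                collect(this, targets);     // this collection and all its children
                break;
            case DATABASE:
                collect(root(), targets);   // assumed: everything under the topmost ancestor
                break;
        }
        return new ScopedService(serviceName, version, targets);
    }

    private ScopedCollection root() {
        ScopedCollection c = this;
        while (c.parent != null) {
            c = c.parent;
        }
        return c;
    }

    // Pre-order walk: the collection itself first, then its descendants.
    private static void collect(ScopedCollection c, List<String> out) {
        out.add(c.name);
        for (ScopedCollection child : c.children) {
            collect(child, out);
        }
    }
}

class ScopeDemo {
    public static void main(String[] args) {
        ScopedCollection root = new ScopedCollection("root", null);
        ScopedCollection a = new ScopedCollection("a", root);
        new ScopedCollection("b", a);

        System.out.println(a.getService("QueryService", "1.0", ServiceScope.COLLECTION).targets);
        System.out.println(a.getService("QueryService", "1.0", ServiceScope.HIERARCHY).targets);
        System.out.println(a.getService("QueryService", "1.0", ServiceScope.DATABASE).targets);
    }
}
```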


Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/