Re: second impl of metadata storage

Murray Altheim 11 Dec 2002 17:15:27 -0000

Gianugo Rabellino wrote:

Murray Altheim wrote:
 >>     1. XNode API
 >>     2. (Dave Viner's) Metadata implementation
 >>     3. JMI-in-Xindice
 >
 > I think that these are all ways of *accessing* metadata, where I agree
 > that we have a pluggability spot. My concern, instead, is about low end
 > metadata storage which is totally missing and badly needed.
No, I think all three are potentially a mix of both how metadata is
accessed and how it is stored. There seem to be two methods for
storage:
  1. within the database "record" itself, ie., as part of the stored
     content of an XML node,
  2. as a separate XML document
This looks to me more like an implementation detail. It doesn't really
matter where you store your data/metadata (in fact, even the document
itself might be physically scattered over different places). Even with
XNode, you could easily have one document for metadata and another for
data: it would be just a matter of aggregating the two upon any request
(or disassembling them if an XNode comes in). Actually, I think it would
be possible, given Dave's implementation, to write an XNode wrapper
around it. This is why I'd be happy to see minimal metadata support
making its way into the codebase.

Again, I don't consider myself an expert on db design, but it seems
to me that there is a big difference. With metadata separate from
the stored document, you introduce performance and metadata-data
alignment issues. With metadata-in-wrapper, you introduce namespace
issues. On the positive side, with separate metadata there are no
namespace issues, and you could potentially store multiple metadata
instances.
For metadata-in-wrapper, metadata alignment is not an issue (if you
open a record, you have already opened its metadata). I don't think
of these as implementation details.
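
To make the trade-off concrete, here is a minimal sketch of the two
storage methods behind one interface. Every name in it (MetadataStore,
InRecordStore, SeparateDocumentStore) is hypothetical and in-memory
only; none of it is Xindice, XNode, or Dave's code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical plug-in point; not an existing Xindice interface.
interface MetadataStore {
    void put(String key, String content, Map<String, String> metadata);
    Optional<String> content(String key);
    Map<String, String> metadata(String key);
}

// Method 1: metadata travels inside the record (the "wrapper").
final class InRecordStore implements MetadataStore {
    private static final class Record {
        final String content;
        final Map<String, String> metadata;
        Record(String content, Map<String, String> metadata) {
            this.content = content;
            this.metadata = new HashMap<>(metadata);
        }
    }

    private final Map<String, Record> records = new HashMap<>();

    public void put(String key, String content, Map<String, String> metadata) {
        // A single write: data and metadata cannot drift apart.
        records.put(key, new Record(content, metadata));
    }

    public Optional<String> content(String key) {
        return Optional.ofNullable(records.get(key)).map(r -> r.content);
    }

    public Map<String, String> metadata(String key) {
        Record r = records.get(key);
        return r == null ? Collections.<String, String>emptyMap()
                         : Collections.unmodifiableMap(r.metadata);
    }
}

// Method 2: metadata kept as a separate document under the same key.
final class SeparateDocumentStore implements MetadataStore {
    private final Map<String, String> documents = new HashMap<>();
    private final Map<String, Map<String, String>> metadataDocs = new HashMap<>();

    public void put(String key, String content, Map<String, String> metadata) {
        // Two writes: a failure between them is the alignment issue.
        documents.put(key, content);
        metadataDocs.put(key, new HashMap<>(metadata));
    }

    public Optional<String> content(String key) {
        // The stored document is untouched, so no wrapper namespace is needed.
        return Optional.ofNullable(documents.get(key));
    }

    public Map<String, String> metadata(String key) {
        // A second lookup: this is the extra cost of keeping metadata apart.
        Map<String, String> m = metadataDocs.get(key);
        return m == null ? Collections.<String, String>emptyMap()
                         : Collections.unmodifiableMap(m);
    }
}
```

Callers see the same API either way, but the two put() methods differ in
exactly the ways described above: one write versus two, and a wrapped
record versus an untouched document.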

OTOH, I do agree that it's possible to implement metadata in either
method. It's just that once you commit to something that modifies
the basic core, it seems you're stuck with that methodology, so in
the end the API isn't really what's at issue -- you could write (as
you say) any number of APIs to that same metadata.

While you in another message state that databases are all about both
data and metadata, that's not necessarily the case. Xindice could be
released out the door without metadata built-in. Given a choice between
an inappropriate metadata solution and none, some may choose none. If
the solution provided is a series of optional plugins, appropriate to
different applications, people may then choose the solution that best
fits their needs. For example, if *neither* solution #1 nor #2 is
appropriate, say, somebody *requires* a JMI-compliant solution, then
even if we don't provide it they could build it themselves (so much
the better of course if we provided it). Someone else may want an
RDF-based solution, for use with RDF systems. Et cetera.
I don't think so. Think about SQL/JDBC, where you are given a set of
"standard" and basic metadata. If that's not enough for your
application, you just glue on some logic (more tables, triggers) so that
you get what you're missing. Here it's just the same: you get the basic
metadata (the ones that make sense in a document, like
creation/modification timestamps and so on), you get a "property-like"
way to set custom (simple) key/value metadata, and you have all the
power of XML in another custom section. So you have almost everything
you might need to store any kind of metadata, not to mention that you
can always write your own peculiar logic. If we try to catch a "one size
fits all" solution, we're not getting anywhere...

Yes, that's the way I designed XNode: the "standard" metadata is
creation and modification date, which are stored as attributes,
whereas the name-value pairs are stored as property elements in
the XNode <xnode:Header> element, thus its extensibility.
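
As a rough illustration of that layout, the sketch below builds such a
wrapper with the standard Java DOM API. Only the <xnode:Header> element
name comes from this thread. The namespace URI, the other element names,
the attribute names, and the choice to put the date attributes on the
header are all placeholders, not the actual XNode schema.

```java
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class XNodeSketch {
    // Placeholder namespace URI; the real XNode namespace is not given here.
    static final String NS = "urn:example:xnode";

    // Builds a wrapper document carrying dates as attributes and
    // name-value pairs as property elements inside the header.
    static Document wrap(String created, String modified,
                         Map<String, String> properties) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM parser available", e);
        }

        // Hypothetical root element name.
        Element root = doc.createElementNS(NS, "xnode:XNode");
        doc.appendChild(root);

        // "Standard" metadata: creation and modification dates as attributes.
        Element header = doc.createElementNS(NS, "xnode:Header");
        header.setAttribute("created", created);
        header.setAttribute("modified", modified);
        root.appendChild(header);

        // Extensible metadata: one property element per name-value pair.
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            Element property = doc.createElementNS(NS, "xnode:Property");
            property.setAttribute("name", entry.getKey());
            property.setAttribute("value", entry.getValue());
            header.appendChild(property);
        }

        // Hypothetical slot where the stored XML content would be attached.
        root.appendChild(doc.createElementNS(NS, "xnode:Content"));
        return doc;
    }
}
```

Because the properties are plain child elements of the header, adding a
new kind of metadata means adding another property and leaves the
wrapper's structure unchanged.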

I agree that extremes are to be avoided. But I don't see that wall
being hit very soon. I was (for example) able to create a simple
metadata API without messing *at all* with the internals of Xindice,
and I don't class myself as any sort of remarkable programmer.
Neither do I. This is why I really don't understand how you managed to
create an API implementation without touching the core. The idea of
(many) metadata is that they have to be updated automatically in
reaction to events in the database (such as creation, modification,
deletion). I see no other way but going into the core for that. How did
you solve this issue?

Tonight I'll wrap up javadocs for both the API and the implementation
and get them onto the web so this won't be so mysterious. Basically,
since all the metadata is stored in the wrapper, I'm simply creating
or modifying it at the same time as I create or modify the record. I
have an XNodeStoreImpl object that manages the process. It's 48K of
Java code (with docs) so it's not that complicated.
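
The general idea, stamping the metadata in the same call that writes the
record, can be sketched as follows. This is not the XNodeStoreImpl code:
StampingStore, RecordBackend, StampedRecord and InMemoryBackend are
hypothetical names, and the backend is an in-memory stand-in for a
database collection.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// A record together with its metadata, written as one unit.
final class StampedRecord {
    final String content;
    final Instant created;
    final Instant modified;

    StampedRecord(String content, Instant created, Instant modified) {
        this.content = content;
        this.created = created;
        this.modified = modified;
    }
}

// Hypothetical stand-in for the underlying database; not Xindice API.
interface RecordBackend {
    void write(String key, StampedRecord record);
    StampedRecord read(String key); // returns null if the key is absent
}

final class InMemoryBackend implements RecordBackend {
    private final Map<String, StampedRecord> records = new HashMap<>();

    public void write(String key, StampedRecord record) {
        records.put(key, record);
    }

    public StampedRecord read(String key) {
        return records.get(key);
    }
}

// Sits in front of the backend and maintains the timestamps itself,
// so the backend (the "core") needs no knowledge of metadata.
final class StampingStore {
    private final RecordBackend backend;
    private final Supplier<Instant> clock;

    StampingStore(RecordBackend backend, Supplier<Instant> clock) {
        this.backend = backend;
        this.clock = clock;
    }

    StampedRecord save(String key, String content) {
        Instant now = clock.get();
        StampedRecord existing = backend.read(key);
        // Keep the original creation time on update; set it on first save.
        Instant created = existing == null ? now : existing.created;
        StampedRecord updated = new StampedRecord(content, created, now);
        backend.write(key, updated);
        return updated;
    }
}
```

The obvious limitation is that only writes going through the managing
object get stamped; anything written straight to the backend bypasses
the metadata entirely.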

I'd hate to commit, this early in the discussion and design, to an
internal API
that we'd then be stuck supporting. Look at some of the really bad
decisions made in the DOM design that should be changed now, but
can't due to installed base. Yuck -- something to avoid.

Yes, I definitely hear that; it's a legitimate concern. But we have to
decide on that, sooner or later.

I take it you mean for 2.0. Yes, agreed, *if* there needs in the end
to be modifications to core. I didn't think I was performing any
magic in my design/implementation, but perhaps after posting the
API you can figure out if what I did was (a) unique and valuable,
(b) one of many possible ways, or (c) a bad idea, and if the
metadata approach chosen could take the same path.

[It must be the end of the day -- I don't feel I'm being all that
coherent...]

Murray

......................................................................
Murray Altheim                  <http://kmi.open.ac.uk/people/murray/>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK

           If you're the first person in a new territory,
           you're likely to get shot at.
                                                    -- ma
