Hi Sven

Firstly, I find that if you're working with lots of data in different
formats, it's best to use XML text for persistence and as the on-the-wire
format, and then an XML object model such as dom4j to work with it
programmatically.

More comments nested below...

From: "Sven Behrens" <[EMAIL PROTECTED]>
> I am looking for the ideal internal format of some Metadata in an
> information catalog system.
>
> The (Meta)dataobjects result from a relational database (well known
> schema) and describe environmental information. There is not really any
> business logic in these data objects. In the past I just needed to
> render HTML from these data objects. (However, these data objects are
> quite complex and are built by many dependent SQL queries. So during
> construction I need access to some attributes.)
>
> Now I need
> 1) to render XML from these data objects and
> 2) to render the exact same HTML as before from external XML provided by
> external sources
>
> So for me it seems that I can benefit from an internal XML format
> (considering that I am not really happy with the actual internal data
> format consisting of some kind of entity (no EJB) with atomic- and
> self-made-java.util.Vector-based collection attributes).

Agreed. The great thing about dom4j is that you can use XPath expressions to
extract information from your metadata really easily, without lots of nested
iterations, null checks and so on. So if you need to navigate three levels
down into your tree through 1-1, 1-N and N-1 relationships, it's really easy
with one line of XPath, rather than pages of Java collections or beans
navigation code.
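
To give a rough idea, here is a minimal sketch. It uses the XPath API that
ships with the JDK so that it runs without extra jars; in dom4j the
equivalent would be a selectNodes() call on the document. The XML and the
catalog/dataset/keyword element names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical metadata document; the element names are invented.
    static final String XML =
        "<catalog>"
      + "<dataset id='d1'><title>River levels</title>"
      + "<keywords><keyword>water</keyword><keyword>flood</keyword></keywords>"
      + "</dataset>"
      + "<dataset id='d2'><title>Air quality</title>"
      + "<keywords><keyword>air</keyword></keywords>"
      + "</dataset>"
      + "</catalog>";

    // Evaluates one XPath expression and returns the text of each matching node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath or XML", e);
        }
    }

    public static void main(String[] args) {
        // Three levels down (dataset, keywords, keyword) in a single expression.
        System.out.println(select(XML, "/catalog/dataset[@id='d1']/keywords/keyword"));
        // Going the other way: titles of the datasets tagged with a given keyword.
        System.out.println(select(XML, "/catalog/dataset[keywords/keyword='air']/title"));
    }
}
```

The predicate in square brackets replaces the loop-and-compare code you would
otherwise write by hand over nested collections.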


> In the near future we possibly provide a data maintenance tool, too.

As it happens, work is underway to build some UI editors and tools that work
with dom4j. It's a little too early to recommend using them yet, though ;-)


> One issue is, that due to distributed access (RMI) to many sources, I
> need a transport efficient format. (memory and speed is also an issue,
> of course)
>
> My questions:
>
> As far as I know, dom4j currently does not support serialization at all.
> When will this change and will the serialized format be tuned for
> efficiency? (I tested JDOM beta 6 and got about twice the volume I had
> from the already big XML-source...)

We're going to implement serialization really soon - I might even get it
working on the flight back to London from JavaOne.
What we'll actually do is just implement java.io.Externalizable and write out
the textual XML document instead of serializing the XML object model. It's
much faster and smaller to do that. Java serialization is still surprisingly
slow & big.
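
Roughly what I have in mind, as a sketch only. This is not dom4j code: it
uses the W3C DOM classes from the JDK, and the XmlHolder class and its
parse/toText/roundTrip helpers are hypothetical names made up for the
example. The point is just that writeExternal() emits XML text and
readExternal() parses it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical wrapper: serializes its document as XML text, not as an object graph.
class XmlHolder implements Externalizable {
    private Document document;

    public XmlHolder() { }  // Externalizable needs a public no-arg constructor
    XmlHolder(Document document) { this.document = document; }
    Document getDocument() { return document; }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    static String toText(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("could not write XML", e);
        }
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Length-prefixed UTF-8 text instead of the node-by-node object model.
        byte[] bytes = toText(document).getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        document = parse(new String(bytes, StandardCharsets.UTF_8));
    }

    // Serializes and deserializes in memory, as RMI would do across the wire.
    static XmlHolder roundTrip(XmlHolder holder) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (XmlHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

The receiving side pays for a parse, but the bytes on the wire are just the
XML text plus a small stream header.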


> Or would you recommend to create an XML-file and use this (e.g.
> compressed) for communication?

I'd use XML text files if I were you - they are human readable and are
easily editable and transformable (e.g. XSLT and so on) and they are much
quicker & smaller than serialization of XML object models.

If size is a concern, try GZipping the stream that carries the XML. That
will result in *much* smaller data than serialization.
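
A minimal sketch of the GZip idea, using only java.util.zip from the JDK.
The GzipXmlSketch class and its method names are made up for the example;
in real code you would wrap the socket or RMI stream rather than a byte
array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipXmlSketch {
    // Compresses XML text into a GZip byte array.
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZip trailer has been written.
        return buffer.toByteArray();
    }

    // Reverses compress(): GZip bytes back to XML text.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical, deliberately repetitive metadata document.
        StringBuilder xml = new StringBuilder("<catalog>");
        for (int i = 0; i < 200; i++) {
            xml.append("<dataset id='d").append(i)
               .append("'><title>Dataset ").append(i).append("</title></dataset>");
        }
        xml.append("</catalog>");

        byte[] compressed = compress(xml.toString());
        System.out.println("raw bytes:     " + xml.length());
        System.out.println("gzipped bytes: " + compressed.length);
    }
}
```

XML compresses well because the tag names repeat constantly, so the more
records you send in one document, the better the ratio tends to get.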


> Is this use case a typical one for the use of a lightweight XML-API?

Yes - I do this kind of thing a lot. XML is increasingly becoming the
standard way of working with data. Whether it's Google searches, NASDAQ quote
information, Babel Fish translations, Reuters news or RSS content
syndication, more and more data on the net is becoming available as XML.

So I'd definitely recommend going the XML direction for your metadata. Then
you can read, write & edit it easily, do powerful navigation with XPath, do
transformations with XSLT and still integrate seamlessly with DOM and SAX if
you need to.
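
For the HTML rendering side, an XSLT transform could look something like
the sketch below. It uses the javax.xml.transform API from the JDK; the
stylesheet, the catalog/dataset/title element names and the XsltSketch class
are all invented for illustration, so treat it as a starting point rather
than a recipe.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: renders each dataset title as an HTML list item.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/catalog'>"
      + "<ul><xsl:for-each select='dataset'>"
      + "<li><xsl:value-of select='title'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result text.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter html = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(html));
            return html.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<catalog>"
          + "<dataset><title>River levels</title></dataset>"
          + "<dataset><title>Air quality</title></dataset>"
          + "</catalog>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The same stylesheet works whether the XML was generated from your database
or supplied by an external source, which is what makes identical HTML from
both paths practical.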


> Do you think, Java-XML-Java Mapper tools (like Jato or Castor) would be
> more appropriate in this use case?

There are a variety of XML <-> Java data binding technologies such as
Castor, JAXB, Jato, Zeus, JOX, Quick and others. These should be considered
when you are working with the same fixed schema a lot and find that
performance is an issue - these technologies might help tune the object
model construction. Though I'd only go this route when you've time on your
hands and you're sure your bottleneck actually is the XML object model
construction. Often your bottleneck is somewhere else ;-)

So I'd always recommend starting out with an XML object model (such as
dom4j), then migrating to code-generated beans or XML <-> custom bean
binding if and when you find you need it. This is mostly useful in highly
available servers doing exactly the same thing with a fixed schema.

One thing you lose with this kind of XML <-> Java binding tool though,
which is *really* useful, is being able to use XPath expressions to work
easily with your XML document and XSLT to transform it from one format to
another.


James


_______________________________________________
dom4j-user mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/dom4j-user
