Woolly,
I agree with what Dan said. Another approach to customize your import
and export that you may find useful can be to use (and extend
accordingly) a couple of classes found in the contrib section of the
Jackrabbit SVN.
http://svn.apache.org/repos/asf/jackrabbit/trunk/contrib/jcr-ext/src/main/java/org/apache/jackrabbit/xml
In particular, you may want to look at the DocumentViewExportVisitor
and DocumentViewImportVisitor. You need to understand how SAX parsing
works a bit, but it should be easy enough to customize them to select
what attributes you want to see in your DocumentView export, and how
to map your XML into Jackrabbit nodes and properties.
Incidentally, those classes support multivalued properties
represented as space-separated attribute values.
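To give an idea of the kind of SAX customization involved, here is a minimal sketch that drops every attribute in a given namespace from a SAX event stream. It does not use the contrib visitors (their API is not shown in this thread); NamespaceAttributeFilter and strip are hypothetical names, and only the JDK's own SAX and JAXP classes are assumed. Since Session.exportDocumentView also has an overload that takes a ContentHandler, a filter along these lines could presumably sit between the export and a serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper for illustration; not part of Jackrabbit or its contrib classes.
class NamespaceAttributeFilter extends XMLFilterImpl {
    // Namespace URI bound to the "jcr" prefix in JCR 1.0.
    static final String JCR_NS = "http://www.jcp.org/jcr/1.0";

    private final String droppedNamespace;

    NamespaceAttributeFilter(XMLReader parent, String droppedNamespace) {
        super(parent);
        this.droppedNamespace = droppedNamespace;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Copy every attribute except those in the dropped namespace.
        AttributesImpl kept = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            if (!droppedNamespace.equals(atts.getURI(i))) {
                kept.addAttribute(atts.getURI(i), atts.getLocalName(i),
                        atts.getQName(i), atts.getType(i), atts.getValue(i));
            }
        }
        super.startElement(uri, localName, qName, kept);
    }

    // Parses the XML text through the filter and serializes the filtered events.
    static String strip(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(
                    new SAXSource(new NamespaceAttributeFilter(reader, JCR_NS),
                            new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling NamespaceAttributeFilter.strip on a document view export should leave ordinary attributes such as title in place and remove attributes such as jcr:primaryType. The xmlns:jcr declaration itself is left alone, because elements in that namespace may still need it.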
About the SystemView, it is really not meant to be human readable,
more like machine readable (hence the name SystemView), and can be a
bit misleading, because it breaks the node/element mapping that JCR
supports (the JSR-170 spec defines this as the VirtualDocument). In
other words, the XPath expression "/foo/bar" in system view becomes
something like /sv:node[@sv:name='foo']/sv:node[@sv:name='bar']/
sv:value, or something like that...
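To make the difference concrete, here is a small sketch that evaluates the equivalent XPath against a hand-written document view fragment and a hand-written, simplified system view fragment. The sample XML, the class name ViewXPathDemo and the title property are assumptions made up for illustration; a real system view export carries more properties (jcr:primaryType, for instance). Note that in the system view, values sit inside an sv:property element, so the full path gets one step longer still.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML samples are simplified and hand-written.
class ViewXPathDemo {
    // Namespace URI bound to the "sv" prefix in JCR 1.0.
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    // Document view: nodes map to elements, properties to attributes.
    static final String DOC_VIEW = "<foo><bar title=\"hello\"/></foo>";
    static final String DOC_PATH = "/foo/bar/@title";

    // System view: every node is an sv:node, every property an sv:property.
    static final String SYS_VIEW =
            "<sv:node xmlns:sv=\"" + SV_NS + "\" sv:name=\"foo\">"
          + "<sv:node sv:name=\"bar\">"
          + "<sv:property sv:name=\"title\" sv:type=\"String\">"
          + "<sv:value>hello</sv:value>"
          + "</sv:property>"
          + "</sv:node>"
          + "</sv:node>";
    static final String SYS_PATH =
            "/sv:node[@sv:name='foo']/sv:node[@sv:name='bar']"
          + "/sv:property[@sv:name='title']/sv:value";

    // Evaluates the expression against the XML text and returns the string value.
    static String evaluate(String expression, String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "sv".equals(prefix) ? SV_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return SV_NS.equals(uri) ? "sv" : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                String prefix = getPrefix(uri);
                return (prefix == null
                        ? Collections.<String>emptyList()
                        : Collections.singletonList(prefix)).iterator();
            }
        });
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both ViewXPathDemo.evaluate(DOC_PATH, DOC_VIEW) and ViewXPathDemo.evaluate(SYS_PATH, SYS_VIEW) select the same value, but only the first expression looks like the path of the node in the repository.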
In my current project, I have implemented a REST-style interface to the
JCR, so that the XSLT document() function can be used to access JCR
subtrees, and it is very important to preserve the mapping between
XPath in XML fragments (constructed using the
DocumentViewExportVisitor) and XPath in the JCR, so the
SystemView does not really apply.
Hope it helps.
Alessandro
On Jun 13, 2007, at 6:16 AM, Dan Connelly wrote:
woolly:
The term "DocumentView" is slightly misleading. It's more like a
Shredded And Annotated Document View.
The xml document will get shredded into its constituent element
nodes when you import it as "DocumentView". This import will not
store a single, coherent document in the Repository. WebDav
support in Jackrabbit, on the other hand, can be used to store the
document as coherent text. Customized, hybrid approaches are
possible to support structured content (partial shredding over
WebDav). It depends on your use case how much (or how little)
shredding you want.
The metadata gets added to raw shreds during DocumentView import to
indicate the Jackrabbit element node type structure. By default,
node type will be nt:unstructured on raw nodes (not having metadata
already). You can write a simple XSLT to strip out the metadata
when you export. For import you can work this in reverse and add
a custom structure using XSLT (but that may not be simple).
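A stripping stylesheet of this kind could look roughly like the sketch below: an identity transform plus one empty template that swallows every attribute in the jcr namespace. It is driven from Java through the standard javax.xml.transform API; the class name StripJcrMetadata and the inline stylesheet are assumptions for illustration, not something shipped with Jackrabbit.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration; not part of Jackrabbit.
class StripJcrMetadata {
    // Identity transform, plus an empty template that matches attributes
    // in the jcr namespace and therefore drops them from the output.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
          + " xmlns:jcr=\"http://www.jcp.org/jcr/1.0\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"@*|node()\">"
          + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"@jcr:*\"/>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the exported XML text and returns the result.
    static String strip(String exportedXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(exportedXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The more specific @jcr:* pattern takes precedence over the generic @* pattern, so no explicit priority is needed. The xmlns:jcr declaration stays in the output, since xsl:copy carries the element's namespace declarations along.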
It sounds like your use case (customized node editing) requires
some custom node types. This can work nicely if the set of
element tags is limited and fixed. You will probably also
need to add some custom xml processing (dom, sax or xslt).
What xml editor are you using? I think XML Spy has integration
features that would support partial shredding and customized
document views. (But I have never worked with this.)
-- Dan Connelly
woolly wrote:
Hi all,
Is it possible to import xml into a node, and then export that xml back out
to have the same xml-equivalent file? At the moment I'm trying:
    fis = new FileInputStream(inputFile);
    session.importXML(node.getPath(), fis,
            ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW);
    fis.close();
    // followed by....
    out = new FileOutputStream(outputFile);
    session.exportDocumentView(node.getPath(), out, true, false);
The difference between inputFile and outputFile seems to be that there are
some additional jcr specific attributes. Is this necessary?
What I'm really trying to do is manage an xml document (eventually many xml
documents), allow people to make changes to only certain parts of it,
versioning those parts and using other Jackrabbit features. Is this the kind
of thing that Jackrabbit was intended for? Or should I just load the xml
document in as a property of a node and deal with the other things myself?
Thanks for any help,
Phil.