Hi Karl,

Personally, I would choose the shortest way to make things work. ;-) And 
MarkLogic Server doesn't require you to pick just one of the three. You can 
mix them if you like.

If your current data follows a certain standard, it most likely does so for a 
reason. Perhaps you need to exchange data with other parties or applications. 
That is a very strong reason to preserve the content in its original format, 
whether MarkLogic Server handles it well or not. But thanks to namespaces and 
document properties in MarkLogic Server, it is quite easy to add information 
that is optimized for searching or user presentation, so that less optimally 
structured content works better in MarkLogic Server. You can store calculated 
data in document properties, add namespaced attributes to specific nodes to 
optimize certain things and filter them out when exchanging data with other 
systems, add meta information in a separate XML structure that is inserted 
into the existing data structure, or wrap the contents in a new root element, 
which allows additional information at root level. Document properties keep 
the added data out of the original content altogether, and with the last 
option it is very easy to separate the two again.
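
To make the last two options a bit more concrete, here is a rough sketch. It 
is plain Python rather than XQuery, so it only illustrates the idea and is not 
MarkLogic code. The namespace URI, the element names (envelope, info, content) 
and the sample record are all made up for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical namespace for the added, search-oriented information.
META_NS = "http://example.com/ns/meta"
ET.register_namespace("meta", META_NS)


def wrap(original: ET.Element, extra: dict) -> ET.Element:
    """Wrap a copy of the original in a new root that also carries metadata."""
    envelope = ET.Element(f"{{{META_NS}}}envelope")
    info = ET.SubElement(envelope, f"{{{META_NS}}}info")
    for name, value in extra.items():
        ET.SubElement(info, f"{{{META_NS}}}{name}").text = value
    content = ET.SubElement(envelope, f"{{{META_NS}}}content")
    content.append(copy.deepcopy(original))
    return envelope


def annotate(node: ET.Element, name: str, value: str) -> None:
    """Add a namespaced attribute to one node of the original structure."""
    node.set(f"{{{META_NS}}}{name}", value)


def export(envelope: ET.Element) -> ET.Element:
    """Recover the original format: unwrap and drop all added attributes."""
    original = copy.deepcopy(envelope.find(f"{{{META_NS}}}content")[0])
    for node in original.iter():
        for key in [k for k in node.attrib if k.startswith(f"{{{META_NS}}}")]:
            del node.attrib[key]
    return original


# Made-up sample record standing in for content in a standard format.
source = ET.fromstring(
    "<observation><code>bp-systolic</code>"
    "<value unit='mmHg'>120</value></observation>"
)
stored = wrap(source, {"normalized-value": "120"})
annotate(stored.find(".//value"), "numeric", "true")
exported = export(stored)
```

The stored document carries the extra information for searching, while 
export() gives back the content in its original shape for other systems. 
Because everything that was added lives in its own namespace, filtering it out 
again is a simple matter of matching on that namespace.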

Apart from that, it is just as likely that MarkLogic Server performs really 
well with the existing structure, provided that indexes and search expressions 
are chosen carefully. Unfortunately, you leave us in the dark as to why you 
think solution #2 should dominate entirely over the others. Perhaps you could 
elaborate on that first? And while you are at it, give us some hints about the 
big picture. What are you trying to achieve in general with MarkLogic Server?

Kind regards,
Geert

Drs. G.P.H. Josten
Consultant


http://www.daidalos.nl/
Daidalos BV
Source of Innovation
Hoekeindsehof 1-4
2665 JZ Bleiswijk
Tel.: +31 (0) 10 850 1200
Fax: +31 (0) 10 850 1199
KvK 27164984
The information sent in or with this email message originates from 
Daidalos BV and is intended solely for the addressee. If you have received 
this message unintentionally, we request that you delete it. No rights can be 
derived from this message.


> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Karl Erisman
> Sent: Monday, 23 November 2009 3:14
> To: [email protected]
> Subject: [MarkLogic Dev General] XML structure/schema design for MLS
>
> I have a general question about choosing an XML structure
> (schema design if using schemas) for use with MarkLogic.  My
> particular situation involves storing clinical data.  There
> are multiple opposing forces that could motivate choosing one
> schema structure over another.  The main ones are:
>
> (1) standards compliance: it would be nice if the internal
> storage format is compatible with existing standard schemas
> for clinical data in XML (to take advantage of existing tools
> that work against the standard schemas and to allow exchange
> with external systems without requiring transformation)
> (2) ease of handling in MLS, specifically *indexing* and *searching*
> (3) "clean" XML (structure that makes sense semantically to a
> human viewer)
>
> The more I experiment with cts:query and search:search, the
> more I tend to think that #2 should dominate entirely, to the
> point of ignoring the others.  As it turns out, some standard
> data formats are really awkward to work with in MLS.
>
> So, do others just organize their content specifically for
> MLS and run transformations when needed?  What does Mark
> Logic recommend?  What have your experiences been?
>
> Thank you,
> Karl
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
>
