This would be a great tool, and a good project for "someone".
But I fear, from my experience with various XML diffing tools, that it's a
lot harder than it looks.
Diffing flat files is a fairly well-established technique, but even
that is fraught with errors.
Diffing a hierarchical structure, especially when node identity does
not carry across documents, is very tricky to get "right". The first part is
defining exactly what "right" is.

For example, you write "Any change in the sequence is to be ignored."
That is definitely something that I wouldn't consider "right" for a
general-purpose tool
(I would want to know if doc 1 is "A,B" and doc 2 is "B,A").
And is that 2 modifications, or 2 deletes and 2 inserts?
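
To make that ambiguity concrete, here is a minimal Python sketch. It is not taken from any of the tools mentioned in this thread; the function names `positional_diff` and `unordered_diff` are made up for illustration, and child nodes are reduced to plain strings. It only shows that the same pair of documents can yield "2 modifications" or "no change at all" depending on how "right" is defined.

```python
from collections import Counter


def positional_diff(old, new):
    """Compare items slot by slot; a changed slot counts as one 'modify'."""
    ops = []
    for i in range(max(len(old), len(new))):
        if i >= len(old):
            ops.append(("insert", i, new[i]))
        elif i >= len(new):
            ops.append(("delete", i, old[i]))
        elif old[i] != new[i]:
            ops.append(("modify", i, old[i], new[i]))
    return ops


def unordered_diff(old, new):
    """Compare as multisets, so any change in sequence is ignored."""
    old_counts, new_counts = Counter(old), Counter(new)
    deleted = sorted((old_counts - new_counts).elements())
    inserted = sorted((new_counts - old_counts).elements())
    return deleted, inserted


doc1 = ["A", "B"]
doc2 = ["B", "A"]

# Slot-by-slot view: both positions changed, so two modifications.
print(positional_diff(doc1, doc2))

# Sequence-ignoring view: same items on both sides, so nothing to report.
print(unordered_diff(doc1, doc2))
```

Neither answer is wrong; they answer different questions, which is why the definition has to come first.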

I've used various XML Diffs in various tools and libraries and was never
quite happy with any of them.

But OTOH, having a public domain algorithm implemented in pure XQuery
would be *awesome*.

As an interesting coincidence, I just had to write a simple version of
diff, taking XML dumps of a MarkLogic directory and a filesystem
directory and generating a sequence of "insert", "update", "delete"
objects ...
but my solution is by no means general purpose.
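
For a rough idea of that kind of special-purpose diff, here is a hedged Python sketch. It is not the code I actually wrote. It assumes each dump has already been reduced to a mapping from path to some content fingerprint, and that the filesystem is the side to sync from; the name `diff_listings` and both of those choices are assumptions made only for this example.

```python
def diff_listings(filesystem, database):
    """Return the operations needed to make `database` match `filesystem`.

    Both arguments map a path to a content fingerprint (for example a hash).
    Treating the filesystem as the source of truth is an assumption here.
    """
    ops = []
    for path in sorted(filesystem.keys() | database.keys()):
        if path not in database:
            ops.append(("insert", path))   # only on the filesystem
        elif path not in filesystem:
            ops.append(("delete", path))   # only in the database
        elif filesystem[path] != database[path]:
            ops.append(("update", path))   # on both sides, content differs
    return ops


# Hypothetical listings, made up for the example.
fs = {"/a.xml": "h1", "/b.xml": "h2", "/c.xml": "h3"}
db = {"/b.xml": "h2", "/c.xml": "old", "/d.xml": "h4"}

for op in diff_listings(fs, db):
    print(op)
```

This only works because the path gives every item a stable identity on both sides, which is exactly what a general XML diff does not have.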


----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224




-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Narayanan,
Abishek (LNG-CON)
Sent: Tuesday, May 04, 2010 5:02 PM
To: General Mark Logic Developer Discussion
Cc: [email protected]
Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)

Hello Geert,

I am looking for an XML difference algorithm in XQuery
that is similar to what xdiff.jar does in Java. Given 2 XML
documents which follow the same schema, I expect the XQuery to report
which nodes/elements have been added and which
nodes/elements have been deleted or modified. Any change in the sequence is to be
ignored. I would say it's more of a logical difference that I expect
between the two XML documents.


For example, let's say XML A is the initial version of a document and XML B is
the updated version of XML A. I should be able to prepare a report showing
which XML tags have been added, which have been modified, and which have been
deleted.

 

Regards,

AN

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Geert
Josten
Sent: Monday, May 03, 2010 2:39 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)

Hi Abishek,

Can you give a bit more insight into what you would like to achieve with
this algorithm? There are plenty of tools readily available that can do
various kinds of XML diffing.

Kind regards,
Geert



drs. G.P.H. (Geert) Josten
Consultant


Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk

T +31 (0)10 850 1200
F +31 (0)10 850 1199

mailto:[email protected]
http://www.daidalos.nl/

KvK 27164984

Please consider the environment before printing this mail.
The information sent in or with this e-mail message originates from
Daidalos BV and is intended solely for the addressee. If you have
received this message unintentionally, we request that you delete it.
No rights can be derived from this message.

> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Narayanan, Abishek (LNG-CON)
> Sent: Monday, 3 May 2010 20:11
> To: [email protected]
> Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)
>
> Hello,
>
> I was looking at this mail chain (more than 2 years old),
> which talks about creating an XML diff algorithm. I was
> wondering if anyone can share more information about
> it. I am trying to achieve something similar.
>
>
>
> Thanks
>
> Abishek
>
>
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
