This would be a great tool, and a good project for "someone". But I fear, from my experience with various XML diffing tools, that it's a lot harder than it looks. Diffing flat files is a fairly well-established technique, but even that is fraught with errors. Diffing a hierarchical structure, especially when node identity does not carry across documents, is very tricky to get "right". The first part is defining exactly what "right" is.
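To make the "what is right" question concrete, here is a minimal Python sketch (not XQuery, and not any existing tool's algorithm) that compares two documents whose children are A,B and B,A under two different definitions. The function names `child_tags`, `ordered_diff`, and `unordered_diff` are made up for illustration, and only the tag names of the root's direct children are compared.

```python
from collections import Counter
from difflib import SequenceMatcher
import xml.etree.ElementTree as ET


def child_tags(xml_text):
    """Return the tag names of the root's direct children, in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


def ordered_diff(old, new):
    """Order-sensitive view: every non-equal edit operation between two tag lists."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (op, old[i1:i2], new[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


def unordered_diff(old, new):
    """Order-insensitive view: compare the two tag lists as multisets."""
    old_counts, new_counts = Counter(old), Counter(new)
    return {
        "deleted": sorted((old_counts - new_counts).elements()),
        "inserted": sorted((new_counts - old_counts).elements()),
    }


doc1 = child_tags("<r><A/><B/></r>")
doc2 = child_tags("<r><B/><A/></r>")

print(ordered_diff(doc1, doc2))    # at least one edit operation is reported
print(unordered_diff(doc1, doc2))  # nothing inserted, nothing deleted
```

The same pair of documents counts as changed under one definition and identical under the other, and even the order-sensitive result depends on how the matcher chooses to pair up elements.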
For example, you quote "Any change in the sequence is to be ignored." That is definitely something I wouldn't consider "right" for a general-purpose tool (I would want to know if doc 1 is "A,B" and doc 2 is "B,A"). And is that 2 modifications, or 2 deletes and 2 inserts?

I've used various XML diffs in various tools and libraries and was never quite happy with any of them. But OTOH, having a public domain algorithm implemented in pure XQuery would be *awesome*.

As an interesting coincidence, I just had to write a simple version of diff, taking XML dumps of a MarkLogic directory and a filesystem directory and generating a sequence of "insert", "update", "delete" objects ... but my solution is by no means general purpose.

----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Narayanan, Abishek (LNG-CON)
Sent: Tuesday, May 04, 2010 5:02 PM
To: General Mark Logic Developer Discussion
Cc: [email protected]
Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)

Hello Geert,

I am looking for an XML difference algorithm in XQuery similar to what xdiff.jar does in Java. Given 2 XML documents that follow the same schema, I expect the XQuery to report which nodes/elements have been added and which have been deleted or modified. Any change in the sequence is to be ignored. I would say it's more of a logical difference that I expect between the two XML documents. For example, if XML A is the initial version of a document and XML B is an updated version of XML A, I should be able to prepare a report showing which XML tags have been added, which have been modified, and which have been deleted.
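A rough Python sketch of the kind of added/modified/deleted report described above might look like the following. It is not what xdiff.jar does and not an XQuery solution; it rests on a big assumption, namely that every tracked element carries a stable key attribute (here a hypothetical `id`), which sidesteps the node-identity problem rather than solving it. The names `index_by_key` and `logical_diff` are invented for this example.

```python
import xml.etree.ElementTree as ET


def index_by_key(xml_text, key_attr="id"):
    """Map (tag, key) to the element's attributes and text, ignoring its position."""
    index = {}
    for elem in ET.fromstring(xml_text).iter():
        key = elem.get(key_attr)
        if key is None:
            continue  # assumption: elements without the key attribute are not tracked
        index[(elem.tag, key)] = (sorted(elem.attrib.items()), (elem.text or "").strip())
    return index


def logical_diff(old_xml, new_xml, key_attr="id"):
    """Report keyed elements that were added, deleted, or modified; order is ignored."""
    old = index_by_key(old_xml, key_attr)
    new = index_by_key(new_xml, key_attr)
    return {
        "added": sorted(k for k in new if k not in old),
        "deleted": sorted(k for k in old if k not in new),
        "modified": sorted(k for k in old if k in new and old[k] != new[k]),
    }


xml_a = '<r><item id="1">a</item><item id="2">b</item><item id="3">c</item></r>'
xml_b = '<r><item id="3">c</item><item id="2">B</item><item id="4">d</item></r>'
print(logical_diff(xml_a, xml_b))
```

Because elements are matched by key instead of position, moving item 3 to the front is not reported at all, which is the "sequence is to be ignored" behaviour. Without a natural key in the schema, the matching step becomes the hard part.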
Regards,
AN

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Geert Josten
Sent: Monday, May 03, 2010 2:39 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)

Hi Abishek,

Can you give a bit more insight into what you would like to achieve with this algorithm? There are plenty of tools readily available that can do various kinds of XML diffing.

Kind regards,
Geert

drs. G.P.H. (Geert) Josten
Consultant
Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk
T +31 (0)10 850 1200
F +31 (0)10 850 1199
mailto:[email protected]
http://www.daidalos.nl/
KvK 27164984

Please consider the environment before printing this mail.

The information sent in or with this e-mail message originates from Daidalos BV and is intended solely for the addressee. If you have received this message unintentionally, we ask you to delete it. No rights can be derived from this message.

> From: [email protected] [mailto:[email protected]] On Behalf Of Narayanan, Abishek (LNG-CON)
> Sent: Monday, 3 May 2010 20:11
> To: [email protected]
> Subject: Re: [MarkLogic Dev General] XML diff (Andrew_Redhead)
>
> Hello,
>
> I was looking at this mail chain (more than 2 yrs old), which talks about the creation of an XML diff algorithm. I was wondering if anyone can share more information about it. I am trying to achieve something similar.
>
> Thanks
>
> Abishek

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
