[xml] Nasty DTD parsing bug (IO buffering, perhaps?)
Hi,

Here is a DTD parsing bug in libxml2 (tested with 2.6.27). Download the following .tar.gz:

http://www.princexml.com/download/nasty-libxml2-dtd-bug.tar.gz

Unpack it and run:

$ xmllint --loaddtd bug.xml

You will get lots of error messages, the first one being:

nlm/references.ent:381: parser error : Comment not terminated

However, if you look at the file, you will see that this is nonsense: there are no unterminated comments on line 381. Even worse, if you delete *one character* from the references.ent file at *any point* before line 381, then everything works fine!

This appears to be some kind of IO buffering error, as the parser's behaviour seems to depend on how many characters are in the file before that point.

Best regards,

Michael

--
Print XML with Prince!
http://www.princexml.com
___
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml
Re: [xml] xsltproc: weird behaviour with parsing freedesktop.org XML shared-mime-info database (bug?)
On Tuesday, 06.02.2007, at 21:08 +0100, Daniel Leidert wrote:
> Hello,
>
> I observe a really weird behaviour here. See the attached stylesheet
> and apply it to the shared-mime-info database (normally
> $datadir/mime/packages/freedesktop.org.xml). If I process my own XML
> file, with a similar (but not the same) DTD, containing an identical
> glob-element, it works. Processing the shared-mime-info db does
> not give any output. What's the problem here? The shared-mime-info DB is
> a valid XML file. So what is happening here? Could you help to explain
> it to me? Maybe I'm just too dumb or I overlooked something.

D'oh!

[..]
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
[..]

Got it. My fault. I overlooked this.

Regards, Daniel
[xml] xsltproc: weird behaviour with parsing freedesktop.org XML shared-mime-info database (bug?)
Hello,

I observe a really weird behaviour here. See the attached stylesheet and apply it to the shared-mime-info database (normally $datadir/mime/packages/freedesktop.org.xml). If I process my own XML file, with a similar (but not the same) DTD, containing an identical glob-element, it works. Processing the shared-mime-info db does not give any output. What's the problem here? The shared-mime-info DB is a valid XML file. So what is happening here? Could you help to explain it to me? Maybe I'm just too dumb or I overlooked something.

Thanks and regards, Daniel

test.xsl Description: application/xslt
Re: [xml] Python documentation - any help welcome!
Hi John,

thanks for the comments,

>> If I had any suggestions it would be to intersperse working python code examples for common operations in with the explanatory prose.

I have been doing that throughout the docs - it's quite a bit easier to read on the Wiki, and I would welcome any edits... ;-)

http://mikekneller.com/wiki/index.php?title=Getting_started_with_Libxml2_and_Python_-_part_1

At the moment, it's basically a "part 1 of x", so the examples are pretty trivial, although in my experience useful for most simple uses of the library.

I agree with the idea of making it a sort of cookbook though. Any recipes are welcome here!

Cheers
Mike
Re: [xml] Python documentation - any help welcome!
Hi Mike:

I've used the libxml2 python bindings a fair bit and this is a good start on documenting them. There is a bit of a learning curve, but I think that has more to do with learning libxml2 and less with the python bindings. That said, it's still nice to see python-specific documentation.

If I had any suggestions it would be to intersperse working python code examples for common operations in with the explanatory prose. I think a lot of folks just quickly want to know how to do basic tasks, a sort of cookbook FAQ. E.g. how do I parse a doc and find all foobar elements and return a list of them? How do I build complex python objects by parsing an XML doc? How can I serialize python objects into XML? Etc. The examples can illustrate basic concepts in libxml2.

--
John Dennis <[EMAIL PROTECTED]>
Re: [xml] libxml + sax + schema validation
On Tue, Feb 06, 2007 at 03:33:57PM +0100, Jovan Kostovski wrote:
> Hi,
>
> I need to write a sax xml parser that will
> validate the contents against a xml schema file.
>
> Writing a sax parser wasn't too hard, but I have no
> clue how to implement the schema validation.
>
> Can anyone help me?
> Links to some examples would be great

There is an API to push a schema validation context on top of a SAX event stream; see how xmlSchemaValidateStream() is used in testSAX() in xmllint.c.

Daniel

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]    | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
[xml] libxml + sax + schema validation
Hi,

I need to write a SAX XML parser that will validate the contents against an XML schema file.

Writing a SAX parser wasn't too hard, but I have no clue how to implement the schema validation.

Can anyone help me? Links to some examples would be great.

BR, Jovan
Re: [xml] Python documentation - any help welcome!
Mike Kneller <[EMAIL PROTECTED]> writes:

> Hi,
>
> After struggling to get to grips with Libxml2 and Python, I figured
> that although I can't contribute much in the way of code, I can have
> a crack at getting some useful documentation up together.
>
> I have put the first part up on my Wiki, if anyone would care to
> review for accuracy - or help out where it is a bit light on
> examples?
>
> http://mikekneller.com/wiki/index.php?title=Getting_started_with_Libxml2_and_Python_-_part_1
>
> I realise that this is probably a bit n00b for most here, but I
> would like to bring together workable examples from the ground up,
> most of the other information I have read assumes a level of
> knowledge I just didn't have when I encountered the library for the
> first time.

Hey! I didn't see this till just now. I'm doing a *lot* with libxml2/libxslt and python. I'll take a look at your doc and let you know what I think.

--
Nic Ferrier
http://www.tapsellferrier.co.uk for all your tapsell ferrier needs
[xml] Python documentation - any help welcome!
Hi,

After struggling to get to grips with Libxml2 and Python, I figured that although I can't contribute much in the way of code, I can have a crack at getting some useful documentation up together.

I have put the first part up on my Wiki, if anyone would care to review for accuracy - or help out where it is a bit light on examples?

http://mikekneller.com/wiki/index.php?title=Getting_started_with_Libxml2_and_Python_-_part_1

I realise that this is probably a bit n00b for most here, but I would like to bring together workable examples from the ground up; most of the other information I have read assumes a level of knowledge I just didn't have when I encountered the library for the first time.

For reference, I'll post the text here.

Cheers
Mike

=== Getting started with Libxml2 and Python - Part 1 ===

Overview

Getting to grips with Libxml2 and Python can be a frustrating experience, particularly as in-depth, accurate Python documentation is hard to find on the Web. Many Python developers dislike the Libxml2 bindings, as they are 'un-Pythonic' and much too C-like. This, however, misses the point of Libxml2. The point is that this library is portable, mature, extremely full-featured and *very* fast.

In the process of writing this tutorial, I hung out in the #xml channel on irc.gnome.org, and subscribed to the xml@gnome.org mailing list - I was given a lot of help when things weren't obvious! Although there's not a massive amount of activity on IRC, or in the mailing list on a daily basis, I would definitely recommend spending some time browsing the archive - or using Google to search it when you have questions. Additionally, I have found the people in the Libxml2 community very helpful.

Manipulating XML using Libxml2 is fairly straightforward when you have a couple of working examples; however, that tends to be the problem in Python. Finding working examples tends to be a bit of a hit-and-miss affair.
The first place to look is in the examples folder in the documentation installed with your release (/usr/share/doc/libxml2-python-2.6.27/examples on my machine).

TODO: where are the examples on a number of distributions/platforms?

Also, take a moment to scan through libxml2.py itself - this is the Python wrapper and is a good place to look if you are hunting for a particular function. There is plenty of information in the wrapper, as all the docstrings have been populated; you can always get information with something like

    print(libxml2.parseFile.__doc__)

for any particular function. Also remember that you can list the available methods for any Python object by using the dir function. The most immediately useful objects are xmlCore, xmlNode and xmlDoc, so dir(libxml2.xmlCore) is your friend when working out what functions are available to you.

I'm going to assume that you know a bit about XML, at least enough to recognise an XML document when you see one, and hopefully enough about Python to know where to find the documentation!

[Installing Libxml2]

TODO: installation examples for a number of distros/platforms.

[Loading a document]

The first thing you want to do in XML will be to load a document of some sort. As a new Libxml2 user, this is where the confusion starts! It is worth remembering that, in general, the Python bindings are automatically generated - therefore there is an equivalent Python function for every C function, and sometimes this can lead to unnecessary, or apparently duplicated, Python functions.

The library contains a number of different functions we can use to load an XML document:

parseDoc, parseFile, parseMemory, readDoc, readFd, readFile, readMemory, recoverDoc and recoverFile

All of these functions return an xmlDoc object.
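To see which of the loaders listed above your installed bindings actually expose, the dir/__doc__ tip can be wrapped in a small helper. This is only a minimal sketch: describe_loaders and LOADER_NAMES are hypothetical names made up for illustration, not part of Libxml2, and the helper relies on nothing more than getattr and __doc__, so it works on any module-like object.

```python
import importlib

# Loader names exactly as listed in the tutorial text above.
LOADER_NAMES = [
    "parseDoc", "parseFile", "parseMemory",
    "readDoc", "readFd", "readFile", "readMemory",
    "recoverDoc", "recoverFile",
]


def describe_loaders(module, names=LOADER_NAMES):
    """Map each name to the first line of its docstring.

    Hypothetical helper: a name maps to None when the module has no
    callable attribute of that name, and to "" when the docstring is empty.
    """
    summary = {}
    for name in names:
        func = getattr(module, name, None)
        if not callable(func):
            summary[name] = None
            continue
        doc = (func.__doc__ or "").strip()
        summary[name] = doc.splitlines()[0] if doc else ""
    return summary


if __name__ == "__main__":
    # Assumption: the bindings are importable as "libxml2"; if they are not
    # installed, report that instead of failing.
    try:
        libxml2 = importlib.import_module("libxml2")
    except ImportError:
        print("libxml2 Python bindings are not installed")
    else:
        for name, doc in describe_loaders(libxml2).items():
            print(name, "->", doc if doc is not None else "(not available)")
```

Passing a class such as libxml2.xmlCore together with your own list of method names should work the same way, since the helper only looks attributes up by name.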
Examples for using each of these follow:

parseDoc(cur) - load an XML document from memory (a string)

    doc = libxml2.parseDoc(""" Hello world!""")

parseMemory(buffer, size) - load an XML document from memory

    doc = libxml2.parseMemory(xml, len(xml))

This function performs exactly the same job as parseDoc from a Python perspective.

parseFile(filename) - load an XML document from a file

    doc = libxml2.parseFile('test.xml')

readDoc(cur, URL, encoding, options) - load an XML document from memory (a string)

This version of the function allows you to specify options on a per-document basis. The parseDoc version uses the parser defaults (in practice, the parser global settings, which can also be modified using global functions). In most cases,

    doc = libxml2.readDoc('',None,None,0)

will be equivalent to

    doc = libxml2.parseDoc('')

When using XSL, I have found it better to force entities to be resolved before running the transform, in which case it is useful to use the following:

    doc = libxml2.readDoc( xml, None, libxm