Thank you all for your comments. I ended up saving the Word document as XML and then using a slightly modified version of the script from my original post. For those interested, there was also a problem with encodings.
Regards,
antoine
--
http://mail.python.org/mailman/listinfo/python-list