https://issues.apache.org/bugzilla/show_bug.cgi?id=49020

--- Comment #3 from Nick Burch <[email protected]> 2010-03-31 11:14:53 UTC ---
The bug is really with Excel here: it has generated a file containing invalid
(not well-formed) XML. An xlsx file is defined as being made up of XML
subparts, and the XML spec is very strict about matching tags.

For the long term, you should report a bug to Microsoft about this. They either
need to sanitise the user input and fix up the tags (eg <br> becomes <br />),
or they need to give up and escape the whole tag contents for the bits where
iffy data could get added (eg put this textbox within a CDATA section).
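
To illustrate the CDATA option, here is a rough sketch of how a writer could
wrap untrusted text so that stray tags like <br> are treated as character data.
This is not what Excel does, and CdataEscaper and wrap are made-up names for
illustration. The only subtle part is that the text itself may contain "]]>",
which has to be split across two CDATA sections.

```java
// Hypothetical helper, not part of POI or Excel: wraps arbitrary text in CDATA.
final class CdataEscaper {
    private CdataEscaper() {}

    /**
     * Wraps text in a CDATA section so markup-like content is not parsed as tags.
     * Any "]]>" inside the text is split so it cannot end the section early.
     */
    static String wrap(String text) {
        // "]]>" becomes "]]" + end of section + start of new section + ">"
        String safe = text.replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }
}
```

With this, wrap("line one<br>line two") gives
<![CDATA[line one<br>line two]]>, which an XML parser reads as plain text
rather than as an unclosed element.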

Short term, you could just comment out the code that reads in the vmlDrawing
section of the file, and ensure that you don't touch the drawing records.

Medium term, we should get a list of the problem bits that Excel gets wrong,
such as <br> (but perhaps others). Then, we need to write an XML input wrapper
that cleans these up before they get passed to the XML processor for loading.
Something like this is quite nasty, though it's possible some other project out
there has already done it, and we can just re-use what they do.
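
As a very rough sketch of what such a wrapper might look like (not an existing
POI class; VmlTagCleaner, clean and wrap are invented names, and <br> is the
only problem tag confirmed so far), something along these lines could sit
between the package part's stream and the XML parser. It assumes the part is
UTF-8 and small enough to buffer in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical input wrapper: rewrites unclosed <br> tags as <br/>.
final class VmlTagCleaner {
    // Matches <br> or <br > in any case, but leaves <br/> and <br></br> alone.
    private static final Pattern UNCLOSED_BR =
            Pattern.compile("<br\\s*>(?!\\s*</br\\s*>)", Pattern.CASE_INSENSITIVE);

    private VmlTagCleaner() {}

    /** Returns the markup with every unclosed br tag made self-closing. */
    static String clean(String xml) {
        return UNCLOSED_BR.matcher(xml).replaceAll("<br/>");
    }

    /** Buffers the whole stream (assumed UTF-8), cleans it, and returns a new stream. */
    static InputStream wrap(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        String cleaned = clean(new String(buf.toByteArray(), StandardCharsets.UTF_8));
        return new ByteArrayInputStream(cleaned.getBytes(StandardCharsets.UTF_8));
    }
}
```

A regex pass like this is crude: it knows nothing about CDATA sections or
comments and would rewrite a <br> inside them too, which is part of why a
proper solution is nasty. It would need extending once we have the full list
of things Excel gets wrong.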

