Hello Jim,

On Wed, Feb 03, 2021 at 02:25:16PM -0500, Jim Jagielski wrote:
> Funny that you bring this up... I'm been tracking down some bugs and they
> all seem to be XML related... :-)
[...]
> I feel that making AOO more fragile by trying to work around cases where
> invalid and/or non-compliant XML is encountered is just wrong. We should
> either ignore the error (catch it) or raise an exception.

Could you please explain more precisely what you mean by "ignore the error (catch it)"? I only understand the word "catch" as it applies to exceptions...

> Invalid data shouldn't be tolerated. Additionally, trying to be
> "lenient" is an easy vector for vulnerabilities.

I agree as far as the XML generated by AOO is concerned, that is, during the export of XML data. If the save operation fails, an exception should be thrown and the user should be warned, so they can retry saving in another format, or copy and paste into another document... I am sure they would rather try many times than silently risk losing their data.

Once the data is on disk, being able to recover a corrupt file would be IMHO very useful, because users get very upset if their data is suddenly lost because of a bug, and honestly, so would I. As long as the errors consist of a missing end tag or a duplicated attribute, I believe we should be able to handle them with few security implications (as far as I can imagine -- I am not a security expert).

In a previous thread on this ML, "High priority issues", Peter also pointed out this bug:

https://bz.apache.org/ooo/show_bug.cgi?id=126927

It can be generalized as "warn the user if there are problems in the input". I would very much like to have such a system in place, for example with "warnings" recorded during XML parsing and eventually displayed. Users worried by unexpected warning messages during load and save operations would come back to us (on user forums, mailing lists, BugZilla etc.); this could help them (and us) address issues better, and maybe even identify problems before they cause data loss.

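To make the idea of recorded warnings a bit more concrete, here is a minimal Python sketch. It is not AOO code: the names LoadResult, close_open_tags and load are hypothetical, and the repair step is a deliberately naive toy that only appends missing end tags. It does not handle duplicated attributes, comments or CDATA. The point is only that every problem ends up in a list that can be shown to the user, and that any repaired text is still checked by the strict parser.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional

# Naive tag matcher (assumption: no comments, CDATA, or '>' inside attributes).
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)[^<>]*?(/?)>")


@dataclass
class LoadResult:
    """Hypothetical container: the parsed tree (if any) plus recorded warnings."""
    root: Optional[ET.Element] = None
    warnings: List[str] = field(default_factory=list)


def close_open_tags(text: str) -> str:
    """Toy repair: append end tags for elements still open at end of input."""
    stack = []
    for closing, name, self_closing in TAG.findall(text):
        if self_closing:
            continue
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    return text + "".join(f"</{name}>" for name in reversed(stack))


def load(text: str, allow_lenient: bool = False) -> LoadResult:
    """Parse strictly first; optionally retry once on repaired text."""
    result = LoadResult()
    try:
        result.root = ET.fromstring(text)
        return result
    except ET.ParseError as err:
        line, column = err.position
        result.warnings.append(
            f"strict parsing failed at line {line}, column {column}: {err}"
        )
    if not allow_lenient:
        return result
    try:
        # The repaired text goes through the same strict parser again.
        result.root = ET.fromstring(close_open_tags(text))
        result.warnings.append(
            "document loaded after adding missing end tags; content may be incomplete"
        )
    except ET.ParseError as err:
        result.warnings.append(f"lenient retry failed: {err}")
    return result


if __name__ == "__main__":
    outcome = load("<doc><p>text", allow_lenient=True)
    for message in outcome.warnings:
        print("warning:", message)
```

A caller would display outcome.warnings after every load, including when the document opened successfully, so that a recovered file is never presented as if nothing had happened.
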
If you really do not like the "lenient" parsing, we could make it an optional second attempt: if the "strict" parsing fails, the user could be offered the chance to retry with the lenient algorithm, after being warned about the possible security implications.

I hope I have explained myself clearly. I believe these topics may already have been discussed in the past; if you want to avoid reopening decisions that are already settled, please send me the pointers.

Best regards,
-- 
Arrigo

http://rigo.altervista.org