Hello Jim,

On Wed, Feb 03, 2021 at 02:25:16PM -0500, Jim Jagielski wrote:

> Funny that you bring this up... I'm been tracking down some bugs and they
> all seem to be XML related...

:-)

[...]
> I feel that making AOO more fragile by trying to work around cases where
> invalid and/or non-compliant XML is encountered is just wrong. We should
> either ignore the error (catch it) or raise an exception.

Could you please clarify what you mean by "ignore the error
(catch it)"? I only know the word "catch" in the context of exceptions...

> Invalid data shouldn't be tolerated. Additionally, trying to be
> "lenient" is an easy vector for vulnerabilities.

I agree as far as the XML generated by AOO is concerned, that is,
during the export of XML data.

If the save operation fails, an exception should be thrown and the
user should be warned, so they can retry saving in another format, or
copy and paste into another document... I am sure they would rather
try many times than silently risk losing their data.

Once the data is on disk, being able to recover a corrupt file would
IMHO be very useful, because users get very upset if their data is
suddenly lost because of a bug, and honestly, so would I.

As long as the errors consist of a missing end tag or a duplicated
attribute, I believe we should be able to handle them with few
security implications (as far as I can tell; I am not a security
expert).

In a previous thread on this ML, "High priority issues", Peter pointed
out this bug too: https://bz.apache.org/ooo/show_bug.cgi?id=126927
which can be generalized as "warn the user if there are problems in the
input". I would very much like to have such a system in place, for
example with "warnings" recorded during the XML parsing, and
eventually displayed.
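
To make the idea of recorded warnings more concrete, here is a minimal
sketch in Python (not AOO code). The names LoadReport and
load_with_warnings are hypothetical, invented only for this
illustration; the parser is the strict one from the Python standard
library, which rejects both a missing end tag and a duplicated
attribute.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class LoadReport:
    """Hypothetical container for problems found while loading a file."""
    warnings: list = field(default_factory=list)

    def warn(self, message):
        self.warnings.append(message)


def load_with_warnings(xml_text, report):
    """Parse xml_text strictly; record a warning instead of failing silently.

    Returns the root element, or None if the input is not well-formed.
    """
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple.
        line, column = exc.position
        report.warn(f"XML problem at line {line}, column {column}: {exc}")
        return None


report = LoadReport()
root = load_with_warnings('<doc attr="1" attr="2"/>', report)
for message in report.warnings:
    # In a real application this would go to a dialog or a log.
    print(message)
```

The point is only that the problem is kept somewhere the user (or a
bug report) can see it, rather than being swallowed.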

Users would be alarmed by unexpected warning messages emitted during
load and save operations, and would come back to us (on the user
forums, mailing lists, Bugzilla, etc.); this could help them (and
us) better address issues, maybe even identify problems before they
cause data loss.

If you really do not like the "lenient" parsing, we could try to make
it an optional second attempt. If the "strict" parsing fails, the
user could be offered a retry with the lenient algorithm, together
with a warning about possible security implications.

I hope I have explained myself clearly. I believe that these topics may
have already been discussed in the past; if you want to avoid
reopening decisions that are already settled, please send me some
pointers.

Best regards,
-- 
Arrigo

http://rigo.altervista.org
