https://bugs.documentfoundation.org/show_bug.cgi?id=150023

--- Comment #4 from Mike Kaganski <[email protected]> ---
Just a side remark:

sax_parser::header starts with

    skip_bom();
    skip_space_and_control();
    if (!has_char() || cur_char() != '<')
        throw sax::malformed_xml_error("xml file must begin with '<'.", offset());

where skip_space_and_control is documented as "Skip all characters that are
0-32 in ASCII range", and one would assume that after an optional BOM, "space
and control" characters are tolerated before the opening '<'.

But skip_bom is implemented so that, if the first UTF-8 code unit is not ' '
or '<', the BOM *must* be present and the '<' *must* immediately follow it. So
the skip_bom implementation prevents skip_space_and_control from doing its job
after a BOM, and that is basically why skip_bom needed its own explicit check
for a space, partially duplicating the skip_space_and_control functionality.
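
To make the consequence concrete, here is a small model of the check taken
literally from the description above. It is not the parser's actual source:
the function name, the std::string input and the handling of empty input are
assumptions made for illustration only.

```cpp
#include <string>

// Model of the skip_bom check as described in this comment (hypothetical
// helper, not the real implementation). Returns true if the input would
// get past the check.
bool described_skip_bom_accepts(const std::string& s)
{
    if (s.empty())
        return true; // assumption: nothing to check on empty input

    if (s[0] == ' ' || s[0] == '<')
        return true; // no BOM required in these two cases

    // Otherwise the UTF-8 BOM (EF BB BF) must be present, and '<' must
    // follow it directly.
    return s.size() >= 4 && s.compare(0, 3, "\xEF\xBB\xBF") == 0 && s[3] == '<';
}
```

Under this model, a BOM followed by a newline and then '<' is rejected, even
though skip_space_and_control would have skipped the newline.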

Just checking the three BOM bytes without any other checks would seem
reasonable to me :)
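
A rough sketch of what I mean, as free functions that return offsets rather
than advancing the parser's cursor. All names and signatures here are
hypothetical stand-ins, not the existing API:

```cpp
#include <cstddef>
#include <string>

// Skip a UTF-8 BOM (EF BB BF) if present; do nothing else.
std::size_t skip_bom_only(const std::string& s)
{
    return s.compare(0, 3, "\xEF\xBB\xBF") == 0 ? 3 : 0;
}

// Skip every byte in the 0-32 range, as skip_space_and_control is
// documented to do.
std::size_t skip_space_and_control(const std::string& s, std::size_t pos)
{
    while (pos < s.size() && static_cast<unsigned char>(s[pos]) <= 0x20)
        ++pos;
    return pos;
}

// Offset of the opening '<', or npos if the header is malformed.
std::size_t find_root_start(const std::string& s)
{
    std::size_t pos = skip_space_and_control(s, skip_bom_only(s));
    return (pos < s.size() && s[pos] == '<') ? pos : std::string::npos;
}
```

With the BOM check reduced to this, whitespace is handled in one place only,
both with and without a BOM in front of it.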
