[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

himajin100...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=144093

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |regis.perdr...@gmail.com

--- Comment #17 from Kevin Suo ---
*** Bug 141672 has been marked as a duplicate of this bug. ***
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=94982
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #16 from Kevin Suo ---
Resolved in LibreOffice master via Kohei's upgrade of LibreOffice's orcus version to 0.17.0 in commit eb07a0e76.

Marking as RESOLVED FIXED.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |145509

Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=145509
[Bug 145509] [META] Bugs Related to Liborcus
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #15 from Kevin Suo ---
https://gerrit.libreoffice.org/c/core/+/124573
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #14 from Kevin Suo ---
This issue has been fixed upstream in Orcus:
https://gitlab.com/orcus/orcus/-/issues/143

However, the orcus version currently in use is still old. We need to either prepare a patch within LibreOffice or upgrade orcus to the pending 0.17.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gitlab.com/orcus/orcus/-/issues/143
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #13 from Kevin Suo ---
And the old patch adding UTF-8 support seems to have addressed only names like the following:

4

in which the first char is still an ASCII [a-zA-Z] alpha (M, N). That is why the test in that patch can pass.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #12 from Kevin Suo ---
The problem may be in include/orcus/sax_parser.hpp:

    template<typename _Handler, typename _Config>
    void sax_parser<_Handler,_Config>::element()
    {
        assert(cur_char() == '<');
        std::ptrdiff_t pos = offset();
        char c = next_char_checked();

        switch (c)
        {
            case '/':
                element_close(pos);
                break;
            case '!':
                special_tag();
                break;
            case '?':
                declaration(nullptr);
                break;
            default:
                if (!is_alpha(c) && c != '_')
                    throw sax::malformed_xml_error("expected an alphabet.", offset());
                element_open(pos);
        }
    }

The default clause checks whether the current char is alpha. However, for tags made of complex characters, e.g. CJK, this is not true, because the char may be one byte of a multi-byte character sequence. In my testing the value of such a c is < 0. In such a case, it should continue reading until it finds the closing ">" of the tag.

See my patch for the other bug at https://gerrit.libreoffice.org/c/core/+/123727
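The observation in comment 12 can be illustrated with a minimal sketch. It is not the orcus code: is_ascii_alpha, strict_name_start, relaxed_name_start and first_name_byte are hypothetical helpers written for this illustration, and the "relaxed" rule is only one possible assumption about how such a check might be loosened, not the upstream fix. Whether plain char is signed is implementation-defined in C++, so the sketch tests the high bit through unsigned char instead of relying on c < 0.

```cpp
#include <string>

// Hypothetical stand-in for an ASCII-only alphabet test (not the orcus code).
inline bool is_ascii_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// True for any byte that belongs to a UTF-8 multi-byte sequence (high bit set).
// The cast avoids depending on whether plain char is signed.
inline bool is_non_ascii_byte(char c)
{
    return (static_cast<unsigned char>(c) & 0x80u) != 0;
}

// Strict check, mirroring the condition quoted in comment 12.
inline bool strict_name_start(char c)
{
    return is_ascii_alpha(c) || c == '_';
}

// Relaxed check (an assumption for illustration): also let non-ASCII bytes through.
inline bool relaxed_name_start(char c)
{
    return strict_name_start(c) || is_non_ascii_byte(c);
}

// Returns the first byte after '<', or '\0' if the input is too short.
inline char first_name_byte(const std::string& xml)
{
    return xml.size() > 1 ? xml[1] : '\0';
}
```

For a tag whose name begins with a CJK character, first_name_byte returns the lead byte of the UTF-8 sequence, which the strict check rejects and the relaxed check accepts. The relaxed check does not validate that the bytes form well-formed UTF-8 or a legal XML name; it only shows why the first-byte test is where the parse stops.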
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #11 from himajin100...@gmail.com ---
>So this issue may be fixed after we upgrade orcus to the

When I posted comment 4, I thought so, but I was wrong.

LibreOffice master has a patch already merged in April,
https://gerrit.libreoffice.org/c/core/+/114892
which is similar to the upstream so-called fix,
https://gitlab.com/orcus/orcus/-/commit/2c2215e94bd8fce4b9a93e986339aa6ae06d2cba
so I thought the bug was supposed to be fixed.

I tested on the latest master, and unfortunately the bug was still reproducible. I continued the investigation on my own, and finally found the culprit, as indicated in comment 9.

--
sax_parser<_Handler,_Config>::element()
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L225

calls

sax_parser<_Handler,_Config>::element_open(std::ptrdiff_t begin_pos)
at
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L245
IF NO EXCEPTION IS THROWN,
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L255

and then calls

parser_base::element_name(parser_element& elem, std::ptrdiff_t begin_pos)
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L255
https://gitlab.com/orcus/orcus/-/blob/master/src/parser/sax_parser_base.cpp#L394

which calls

parser_base::name(std::string_view& str)
https://gitlab.com/orcus/orcus/-/blob/master/src/parser/sax_parser_base.cpp#L333
--

The patch mainly focuses on parser_base::name(std::string_view& str), but the culprit comes even before that. THE EXCEPTION WAS THROWN.
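The call order described in comment 11 can be sketched with a toy model. Everything here is hypothetical: toy_parser and reaches_name are invented for this illustration and only imitate the order element() -> element_open() -> name() named in the comment; they do not reproduce the orcus classes or signatures. The point is that a fix inside name() has no effect when the guard in element() throws first.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Toy model of the call order only; not the orcus parser.
struct toy_parser
{
    std::string buf;
    std::size_t pos = 0;
    bool name_reached = false;

    explicit toy_parser(std::string s) : buf(std::move(s)) {}

    // Stand-in for the name-reading step that the earlier patch touched.
    void name() { name_reached = true; }

    void element_open() { name(); }

    void element()
    {
        // buf[pos] is expected to be '<'; look at the byte that follows it.
        const char c = buf.at(++pos);
        const bool ascii_alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        if (!ascii_alpha && c != '_')
            throw std::runtime_error("expected an alphabet.");
        element_open();
    }
};

// Reports whether name() was ever reached for the given input.
inline bool reaches_name(const std::string& xml)
{
    toy_parser p(xml);
    try
    {
        p.element();
    }
    catch (const std::exception&)
    {
        // Parsing stopped before name() could run.
    }
    return p.name_reached;
}
```

In this model a name that starts with an ASCII letter and continues with multi-byte characters reaches name(), while a name that starts with a multi-byte character never does. That matches the two attachments in the thread: the file whose names start with ASCII alpha is not reproducible, and the file whose names start with non-ASCII characters is.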
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #10 from Kevin Suo ---
>orcus
>* sax parser
>* utf-8 names are now allowed as element names.

So this issue may be fixed after we upgrade orcus to the ?
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=96499
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #9 from himajin100...@gmail.com ---
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L244
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L252

There may be more is_alpha() checks of this kind in non-element-related code.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #8 from himajin100...@gmail.com ---
Created attachment 175793
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175793&action=edit
NonReproducible XML file. Names start with ascii-alpha
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #7 from himajin100...@gmail.com ---
Created attachment 175792
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175792&action=edit
Reproducible XML file. Names start with (non-(ascii-alpha))
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Roman Kuznetsov <79045_79...@mail.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |79045_79...@mail.ru
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #6 from Roman Kuznetsov <79045_79...@mail.ru> ---
Confirmed in

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 17d3cacfb9675268e709cfc95771ad4ce8bde75a
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL

So let's hope an orcus library update can solve it =)
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Michael Warner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=141672
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #5 from himajin100...@gmail.com ---
Slightly different error message:

warn:sc.orcus:13376:15364:sc/source/filter/orcus/xmlcontext.cxx:191: Malformed XML error: malformed_xml_error: expected an alphabet. (offset=39)

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 5b2848413883565c48d312c96daf8fbca25405d8
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: default; VCL: win
Locale: ja-JP (ja_JP); UI: en-US
Calc: CL
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

himajin100...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ko...@libreoffice.org

--- Comment #4 from himajin100...@gmail.com ---
https://opengrok.libreoffice.org/xref/core/sc/source/ui/xmlsource/xmlsourcedlg.cxx?r=3b8e53f6#191

possible cause:
https://gitlab.com/orcus/orcus/-/blob/master/CHANGELOG

>orcus
>* sax parser
>* utf-8 names are now allowed as element names.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #3 from Anton Shevtsov ---
Tried 7.0.6.2 from my distro repo and 7.2.1.2 from the official site.
The problem still exists. The XML is not imported (the Import button is disabled).

Version: 7.2.1.2 / LibreOffice Community
Build ID: 87b77fad49947c1441b67c559c339af8f3517e22
CPU threads: 12; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: ru-RU (ru_RU.UTF-8); UI: en-US
Calc: threaded

Version: 7.0.6.2
Build ID: 00(Build:2)
CPU threads: 12; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: ru-RU (ru_RU.UTF-8); UI: ru-RU
Calc: threaded
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #2 from Michael Warner ---
I recall another bug about this being written about 5 months back. I can't remember enough details at the moment to find it, but it was traced to a dependent library and was fixed.

You don't say what 7.x version you tried, but please try downloading the latest version from https://www.libreoffice.org/download/libreoffice-fresh/ and see if the problem is still there.
[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #1 from Anton Shevtsov ---
Created attachment 175720
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175720&action=edit
xml example file