[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-10 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

himajin100...@gmail.com changed:

   What|Removed |Added

   See Also||https://bugs.documentfoundation.org/show_bug.cgi?id=144093

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-04 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

 CC||regis.perdr...@gmail.com

--- Comment #17 from Kevin Suo  ---
*** Bug 141672 has been marked as a duplicate of this bug. ***


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-03 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

   See Also||https://bugs.documentfoundation.org/show_bug.cgi?id=94982


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-03 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #16 from Kevin Suo  ---
Resolved in LibreOffice master via Kohei's upgrade of the orcus version to
0.17.0 in commit eb07a0e76.

Mark as RESOLVED FIXED.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-02 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

 Blocks||145509


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=145509
[Bug 145509] [META] Bugs Related to Liborcus

[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-11-02 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #15 from Kevin Suo  ---
https://gerrit.libreoffice.org/c/core/+/124573


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-25 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #14 from Kevin Suo  ---
This issue has been fixed upstream in Orcus
https://gitlab.com/orcus/orcus/-/issues/143

However, the orcus version currently bundled is still old. We need to either
prepare a patch within LibreOffice or upgrade orcus to the pending 0.17.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-23 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

   See Also||https://gitlab.com/orcus/orcus/-/issues/143


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-19 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #13 from Kevin Suo  ---
The old patch adding UTF-8 support seems to have addressed only names like
the following:


   4


in which the first character is still an ASCII [a-zA-Z] letter (M, N). That is
why the test in that patch passes.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-19 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #12 from Kevin Suo  ---
The problem may be in:
include/orcus/sax_parser.hpp

template<typename _Handler, typename _Config>
void sax_parser<_Handler,_Config>::element()
{
    assert(cur_char() == '<');
    std::ptrdiff_t pos = offset();
    char c = next_char_checked();
    switch (c)
    {
        case '/':
            element_close(pos);
            break;
        case '!':
            special_tag();
            break;
        case '?':
            declaration(nullptr);
            break;
        default:
            if (!is_alpha(c) && c != '_')
                throw sax::malformed_xml_error("expected an alphabet.", offset());
            element_open(pos);
    }
}

The default clause checks whether the current char is an ASCII letter. However,
for tags written in complex scripts (e.g. CJK) this is not true, because the
char may be one byte of a multi-byte character sequence. In my testing the
value of such a c is < 0. In that case, the parser should continue reading
until it finds the closing ">" of the tag.

See my patch for the other bug at
https://gerrit.libreoffice.org/c/core/+/123727
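
To illustrate the point about the first byte, here is a minimal sketch. It is
not the orcus code: is_ascii_alpha(), may_start_name() and first_byte() are
hypothetical stand-ins written for this example, and the relaxed check is only
one possible approach (a strict XML name-start check would be narrower than
"any byte with the high bit set"). Whether plain char is signed is
implementation-defined; on platforms where it is, the lead byte of a UTF-8
multi-byte sequence shows up as a negative value, which matches the "c < 0"
observation above.

```cpp
#include <string_view>

// ASCII-only letter test, modeled on the is_alpha(c) check quoted above.
// Hypothetical stand-in, not the actual orcus implementation.
inline bool is_ascii_alpha(char c)
{
    return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
}

// Hypothetical relaxed check: additionally accept any byte with the high
// bit set, i.e. a byte that belongs to a multi-byte UTF-8 sequence.
inline bool may_start_name(char c)
{
    const auto u = static_cast<unsigned char>(c);
    return is_ascii_alpha(c) || c == '_' || u >= 0x80;
}

// First byte of an element name, or '\0' for an empty name.
inline char first_byte(std::string_view name)
{
    return name.empty() ? '\0' : name.front();
}
```

With these helpers, a name such as "Name" passes the ASCII-only test, while a
name whose first byte is 0xE5 (the start of a three-byte UTF-8 sequence) fails
it and is accepted only by the relaxed check.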


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #11 from himajin100...@gmail.com ---
>So this issue may be fixed after we upgrade orcus to the 

When I posted comment 4, I thought so, but I was wrong.

LibreOffice master already has a patch, merged in April,
https://gerrit.libreoffice.org/c/core/+/114892
which is similar to the upstream so-called fix:
https://gitlab.com/orcus/orcus/-/commit/2c2215e94bd8fce4b9a93e986339aa6ae06d2cba

So I thought the bug should already be fixed. I tested on the latest master,
and unfortunately the bug was still reproducible.

I continued investigating on my own and finally found the culprit, as indicated
in comment 9.

--
sax_parser<_Handler,_Config>::element()
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L225
calls

sax_parser<_Handler,_Config>::element_open(std::ptrdiff_t begin_pos)
at
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L245
IF NO EXCEPTION IS THROWN,

https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L255

and then calls 
parser_base::element_name(parser_element& elem, std::ptrdiff_t begin_pos)
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L255
https://gitlab.com/orcus/orcus/-/blob/master/src/parser/sax_parser_base.cpp#L394
,which calls

parser_base::name(std::string_view& str)
https://gitlab.com/orcus/orcus/-/blob/master/src/parser/sax_parser_base.cpp#L333
--

The patch mainly focuses on parser_base::name(std::string_view& str),
but the culprit comes even before that: THE EXCEPTION WAS THROWN.
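
The ordering problem can be sketched with a simplified model. Everything below
is hypothetical and written only for illustration: parse_open_tag() stands in
for element(), read_name() stands in for parser_base::name(), and neither
reproduces the orcus sources. The point is only that a name reader that
tolerates UTF-8 bytes does not help if the caller rejects the first byte
before the reader is ever invoked.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// ASCII-only letter test (hypothetical stand-in for is_alpha()).
inline bool is_ascii_letter(char c)
{
    return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z');
}

// Stand-in for the name reader: copies bytes up to the end of the name
// and does not care whether they are ASCII.
inline std::string read_name(std::string_view s)
{
    std::size_t n = 0;
    while (n < s.size() && s[n] != '>' && s[n] != ' ' && s[n] != '/')
        ++n;
    return std::string(s.substr(0, n));
}

// Stand-in for the element entry point: the first-byte check runs
// before read_name() is called.
inline std::string parse_open_tag(std::string_view s)
{
    if (s.size() < 2 || s[0] != '<')
        throw std::runtime_error("expected '<'");
    const char c = s[1];
    if (!is_ascii_letter(c) && c != '_')
        throw std::runtime_error("expected an alphabet.");
    return read_name(s.substr(1));
}

// Convenience wrapper: true if the tag is accepted, false if it throws.
inline bool parses(std::string_view s)
{
    try
    {
        parse_open_tag(s);
        return true;
    }
    catch (const std::runtime_error&)
    {
        return false;
    }
}
```

In this model a tag whose name starts with an ASCII letter and continues with
UTF-8 bytes is accepted, while a tag whose name starts with a UTF-8 byte is
rejected before read_name() runs.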


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #10 from Kevin Suo  ---

>orcus 
>* sax parser
>* utf-8 names are now allowed as element names.

So this issue may be fixed after we upgrade orcus to the ?


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Kevin Suo  changed:

   What|Removed |Added

   See Also||https://bugs.documentfoundation.org/show_bug.cgi?id=96499


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #9 from himajin100...@gmail.com ---
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L244
https://gitlab.com/orcus/orcus/-/blob/master/include/orcus/sax_parser.hpp#L252

There may be more is_alpha() checks like this in non-element-related code.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #8 from himajin100...@gmail.com ---
Created attachment 175793
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175793&action=edit
NonReproducible XML file. Names start with ascii-alpha


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-17 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #7 from himajin100...@gmail.com ---
Created attachment 175792
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175792&action=edit
Reproducible XML file. Names start with (non-(ascii-alpha))


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-14 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Roman Kuznetsov <79045_79...@mail.ru> changed:

   What|Removed |Added

 CC||79045_79...@mail.ru
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #6 from Roman Kuznetsov <79045_79...@mail.ru> ---
Confirmed in

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 17d3cacfb9675268e709cfc95771ad4ce8bde75a
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render:
Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL

So let's hope that updating the orcus library solves it =)


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-14 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

Michael Warner  changed:

   What|Removed |Added

   See Also||https://bugs.documentfoundation.org/show_bug.cgi?id=141672


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-14 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #5 from himajin100...@gmail.com ---
I get a slightly different error message:

warn:sc.orcus:13376:15364:sc/source/filter/orcus/xmlcontext.cxx:191: Malformed
XML error: malformed_xml_error: expected an alphabet. (offset=39)

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 5b2848413883565c48d312c96daf8fbca25405d8
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: default; VCL: win
Locale: ja-JP (ja_JP); UI: en-US
Calc: CL


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-14 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

himajin100...@gmail.com changed:

   What|Removed |Added

 CC||ko...@libreoffice.org

--- Comment #4 from himajin100...@gmail.com ---
https://opengrok.libreoffice.org/xref/core/sc/source/ui/xmlsource/xmlsourcedlg.cxx?r=3b8e53f6#191

possible cause:

https://gitlab.com/orcus/orcus/-/blob/master/CHANGELOG

>orcus 
>* sax parser
>* utf-8 names are now allowed as element names.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-13 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #3 from Anton Shevtsov  ---
I tried 7.0.6.2 from my distro repo and 7.2.1.2 from the official site.
The problem still exists: the XML is not imported (the Import button is disabled).

Version: 7.2.1.2 / LibreOffice Community
Build ID: 87b77fad49947c1441b67c559c339af8f3517e22
CPU threads: 12; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: ru-RU (ru_RU.UTF-8); UI: en-US
Calc: threaded

Version: 7.0.6.2
Build ID: 00(Build:2)
CPU threads: 12; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: ru-RU (ru_RU.UTF-8); UI: ru-RU
Calc: threaded


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-13 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #2 from Michael Warner  ---
I recall another bug about this being written about 5 months back. I can't
remember enough details at the moment to find it, but it was traced to a
dependent library and was fixed. 

You don't say what 7.x version you tried, but please try downloading the latest
version from https://www.libreoffice.org/download/libreoffice-fresh/ and see if
the problem is still there.


[Libreoffice-bugs] [Bug 145117] XML Source not imported if tags is non-ASCII symbols

2021-10-13 Thread bugzilla-daemon
https://bugs.documentfoundation.org/show_bug.cgi?id=145117

--- Comment #1 from Anton Shevtsov  ---
Created attachment 175720
  --> https://bugs.documentfoundation.org/attachment.cgi?id=175720&action=edit
xml example file
