https://bugs.documentfoundation.org/show_bug.cgi?id=167871
Tex2002ans <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Tex2002ans+LibreOffice@gmai
                   |                            |l.com

--- Comment #4 from Tex2002ans <[email protected]> ---
Created attachment 202907
  --> https://bugs.documentfoundation.org/attachment.cgi?id=202907&action=edit
Find.and.Replace.-.Regular.Expression.-.SOFT.HYPHEN.-.U+00AD.odt

I attached a sample document with 3 examples:

- no hyphen
- HYPHEN-MINUS
- SOFT HYPHEN
--- Comment 0's bug definitely happens in 3rd sentence!

- - -

I confirm this happens in:

Version: 25.8.1.1 (X86_64)
Build ID: 54047653041915e595ad4e45cccea684809c77b5
CPU threads: 8; OS: Windows 11 X86_64 (build 22631); UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

... but it looks like the:

- SOFT HYPHEN (U+00AD)

is handled strangely.

Currently, LO can "find" the text, but does not seem to "capture" it into a
regex Group.

But I think the root cause of this bug is...

If "Regular Expression" mode is ON:

- `[:alpha:]` SHOULD NOT match the SOFT HYPHEN character.
- `[:alpha:]` SHOULD ONLY match alphabetic characters.
- SOFT HYPHEN should be treated as a...
--- "punctuation mark", roughly equivalent to "a HYPHEN" (U+002D)!

- - -

STEPS TO REPRODUCE

0. Open attached document.
1. Edit > Find and Replace (Ctrl+H).
2. Expand "Other Options", then make sure these 2 checkboxes are ON:
- Regular Expressions
- Diacritic-sensitive
3. In the 2 boxes, type:
- Find: \b"([:alpha:]+)"\b
- Replace: „$1“
4. Press the "Replace All" button.

ACTUAL

After pressing "Find All" and/or "Replace All":

- 2 hits
--- 1st line turned into „antäuschen“
--- 3rd line turned into „$1“
----- = BUG

EXPECTED

After pressing "Find All" and/or "Replace All":

- 1 hit
--- 1st line turned into „antäuschen“

- - -

NOTES on Comment 3:

Hmmm... very strange. I can get the SOFT HYPHEN to match with a period.
For example, do Step 3:

- Find: \b"(.+?)"\b
- Replace: „$1“

and this will retain the SOFT HYPHEN + any inner text, while flipping the
quotes.

But if you do this:

- Find: \b"an.
- Replace: „ZZZ

LO will act like the SOFT HYPHEN isn't even there and match/replace both:

- "ant
   - 4 characters
- "an-t
   - 5 characters
   - '-' = invisible SOFT HYPHEN position

Hmmmm... so something weird is definitely going on with the SOFT HYPHEN and
regex.

It could be because SOFT HYPHEN is a weirdly unique character, acting as
"punctuation" AND "a format code" AND is "invisible" at the same time.

-- 
You are receiving this mail because:
You are the assignee for the bug.
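
For comparison, here is a rough sketch of the EXPECTED behaviour using Python's `re` module instead of LibreOffice. Everything in it is an assumption made for illustration: the three sample strings are stand-ins reconstructed from the comment (not copied from the attachment), `[^\W\d_]` stands in for `[:alpha:]` because Python's `re` has no POSIX classes, and the `\b` anchors are dropped because Python's `\b` may follow different rules than LibreOffice's regex engine. It only shows how another engine treats U+00AD; it says nothing about where the LibreOffice bug actually lives.

```python
import re
import unicodedata

SOFT_HYPHEN = "\u00ad"

# Assumed stand-ins for the three sample lines (not copied from the attachment).
SAMPLES = [
    '"antäuschen"',                # no hyphen
    '"an-täuschen"',               # HYPHEN-MINUS (U+002D)
    f'"an{SOFT_HYPHEN}täuschen"',  # SOFT HYPHEN (U+00AD)
]

# Letters only: \w minus digits and underscore, standing in for [:alpha:].
LETTERS_ONLY = re.compile(r'"([^\W\d_]+)"')
# Any inner text, like the "(.+?)" search from the notes on Comment 3.
ANY_INNER = re.compile(r'"(.+?)"')


def flip_quotes(pattern: re.Pattern, text: str) -> str:
    """Swap the straight quotes for low/high quotes around capture group 1."""
    return pattern.sub(lambda m: f"„{m.group(1)}“", text)


def letters_only_hits(samples: list[str]) -> list[str]:
    """Return the samples whose whole quoted word consists of letters only."""
    return [s for s in samples if LETTERS_ONLY.fullmatch(s)]


if __name__ == "__main__":
    print(unicodedata.name(SOFT_HYPHEN))      # SOFT HYPHEN
    print(unicodedata.category(SOFT_HYPHEN))  # Cf
    print(SOFT_HYPHEN.isalpha())              # False

    # Letters-only search: only the first sample is a hit.
    print(len(letters_only_hits(SAMPLES)))

    # The dot does match the soft hyphen, and the capture keeps it.
    print(repr(flip_quotes(ANY_INNER, SAMPLES[2])))

    # '"an.' consumes the soft hyphen itself: 4 characters, not 5.
    print(len(re.match(r'"an.', SAMPLES[2]).group(0)))
```

In this sketch the letters-only pattern produces 1 hit, and the `"an.` search stops after the soft hyphen (4 characters) rather than skipping over it to the `t`, which is the behaviour LO differs from in the notes above. Note that Python reports U+00AD as general category Cf (format), not as punctuation, so "not alphabetic" holds either way.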
