https://bugs.freedesktop.org/show_bug.cgi?id=55292

--- Comment #49 from Owen Genat <[email protected]> ---
(In reply to comment #48)
> after many tests is clear that what causes the autocorrect problem is that
> we have troubles to handle at the same time autocorrect entries with "--"
> and "---" 
> 
> this is exactly Bug 67364 - FORMATTING: Autocorrect no longer functions
> correctly when replacing two hyphens if also an entry with three hyphens
> exists

I initially found this somewhat disheartening, but I now think it may lead us
to a better solution. There is going to have to be a compromise, as you
indicate, but I think you did the right thing in closing bug 67364. IMO it is
the same issue.

> anyway I have an idea... why we don't stop "fighting" with this "---" corner
> case and start using a unique key combination for em-dash?
> 
> my proposal would be to have:
> 
> .*--.*  for en-dash
> .*__.*  for em-dash

I agree we need a different pattern, but use of low line (U+005F) is NOT a good
idea because this character is used extensively in basic text forms, e.g.
"Name: __________". We do not want these converting to em-dashes. I have tested
this and only the first pair of low lines gets converted, e.g. "—_____", in a
similar manner to the 3-hyphen problem. This is where the fun with characters
begins.
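
The first-pair effect can be mimicked with a toy model. This is only a sketch,
NOT LibreOffice's actual AutoCorrect code: it assumes a rule fires once per
completed word and rewrites only the first match. The function names are
hypothetical.

```python
# Toy model only: assumes one replacement per word, first match only.
# This is not how LibreOffice implements AutoCorrect.
EM_DASH = "\u2014"


def autocorrect_word(word: str, trigger: str, replacement: str) -> str:
    """Replace the first occurrence of trigger in word; leave the rest."""
    return word.replace(trigger, replacement, 1)


def autocorrect_line(line: str, trigger: str, replacement: str) -> str:
    """Apply the rule to each space-separated word independently."""
    words = line.split(" ")
    return " ".join(autocorrect_word(w, trigger, replacement) for w in words)


form_field = "Name: " + "_" * 10
print(autocorrect_line(form_field, "__", EM_DASH))
```

Under these assumptions the form field comes out as "Name: " followed by one
em-dash and the remaining eight low lines. The exact number left over depends
on how many were typed.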

The main argument for "--" to en-dash and "---" to em-dash comes from TeX and
some wiki notation, but there really is no consensus about this and I think
others need to understand that. Some publishers, universities, and wikis use
"--" for em-dash. A 2-character pattern is more restrictive. These would seem
to be a simple compromise:

--- for en-dash
=== for em-dash

... but they too have problems. Using a different character for each dash,
together with a consistent 3-char AutoCorrect rule, gives more room to avoid
conflicts with mathematical notation such as == or ++ (=== is however used in
JavaScript ... <boo hiss>). Unfortunately both are likely to conflict with the
AutoCorrect for types of horizontal rule:

https://help.libreoffice.org/Common/Drawing_Lines_in_Text#Automatic_lines_in_Writer

Another compromise is to use mixed character combinations:

-_- for en-dash
-+- for em-dash

These avoid conflict with the AutoCorrect for types of horizontal rule, but
have no precedent in existing usage. I feel we may end up needing to use HTML
notation as a compromise:

&ndash for en-dash
&mdash for em-dash

Not necessarily pretty or as easy to type, but the result after autocorrection
is the same and there is GOOD precedent for this type of AutoCorrect rule. Like
the low line (U+005F) example, it also requires only two rules for effective
conversion in all use cases. It is also relatively easy to remember.
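
As a rough way to compare the candidate pairs, here is a small sketch that
flags any trigger contained inside another trigger of the same set. It assumes
the "--" / "---" trouble stems from that kind of overlap, which is my reading
of the bug rather than a confirmed diagnosis. The function name and the
candidate table are hypothetical.

```python
from itertools import permutations


def overlapping_triggers(triggers):
    """Return (inner, outer) pairs where inner occurs inside a different outer."""
    return [(a, b) for a, b in permutations(triggers, 2) if a != b and a in b]


# Candidate trigger pairs taken from this thread (en-dash, em-dash).
candidates = {
    "hyphens": ["--", "---"],
    "mixed": ["-_-", "-+-"],
    "html": ["&ndash", "&mdash"],
}

for name, triggers in candidates.items():
    print(name, overlapping_triggers(triggers))
```

Only the hyphen pair is reported as overlapping; the mixed and HTML-style pairs
come back empty. The check says nothing about clashes with other AutoCorrect
features such as the horizontal rules.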

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
Libreoffice-bugs mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/libreoffice-bugs
