https://bugs.documentfoundation.org/show_bug.cgi?id=154799
Bug ID: 154799
Summary: The ODF partitioning of scripts/languages into
"Western", "RTL + CTL" and "Asian" is invalid
Product: LibreOffice
Version: 7.5.1.2 release
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: LibreOffice
Assignee: [email protected]
Reporter: [email protected]
As we all know, languages are not scripts (see also bug 154793), and some
languages may be written either with a "Western" script or with one which would
be considered RTL/CTL. Example: Turkish.
So, let's focus on scripts for a moment. What makes a script have "complex text
layout"?
Well, I couldn't quite find an answer in the ODF spec (see here for example:
https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part3-schema/OpenDocument-v1.3-os-part3-schema.html#property-style_script-type
). Following Frank Oberle's text (linked to in bug 92655), I looked at the
Wikipedia definition:
"Complex text layout (CTL) or complex text rendering is the typesetting of
writing systems in which the shape or positioning of a grapheme depends on its
relation to other graphemes. The term is used in the field of software
internationalization, where each grapheme is a character."
Well, cursive Latin script is certainly like that. Actually, even non-cursive
German is like that, due to ligatures and digraphs: consecutive s's forming an
ß, Serbo-Croatian lj, and so on. Yes, these are not very _common_, but having
them means one must assume they can occur, making the layout "complex". Also,
in Greek, you have non-final sigma, σ, and final-form sigma, ς.
On the other hand, if you consider non-cursive Hebrew - there are only a few
letters which have special forms: 5 out of 22 are like the Greek sigma, with a
regular form and a final form. The rest only have one form. So why is Hebrew in
a separate category from Greek?
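
To make the comparison concrete, here is a minimal sketch using only Python's
standard library. It assumes the Unicode Hebrew letter range U+05D0..U+05EA and
relies on the word "FINAL" appearing in the Unicode names of final forms; the
variable names are illustrative only.

```python
import unicodedata

# Hebrew letters occupy U+05D0..U+05EA; final forms have "FINAL" in their names.
hebrew = [chr(cp) for cp in range(0x05D0, 0x05EA + 1)]
finals = [c for c in hebrew if "FINAL" in unicodedata.name(c)]
base_letters = [c for c in hebrew if c not in finals]

print(len(base_letters), "base letters,", len(finals), "final forms")

# Greek has the same kind of positional variant: str.lower() picks the
# final-form sigma only at the end of a word.
word = "ΟΔΟΣ".lower()
print(word, unicodedata.name(word[-1]))
```

Both scripts show the same phenomenon: a small, fixed set of letters whose
shape depends on position in the word.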
One could argue "well, Hebrew script is written from right-to-left" - but then,
I'd call that Euro-centric bias. Why is right-to-left "complex" and
left-to-right "simple"? Because somebody wrote LTR-only implementations before
thinking of RTL? Surely that's not a valid reason.
So, perhaps one could claim that Hebrew is not complex (CTL), it's just RTL.
But that raises another question: Why group CTL with RTL scripts? Again, it seems
like the rationale is basically "scripts we thought about later so we put them
in another box". That doesn't fly.
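
As a rough illustration that direction is a property separate from shaping
complexity, the sketch below looks up the Unicode bidirectional class of a few
sample characters with Python's standard unicodedata module. The choice of
samples is mine; Devanagari is included as an example of a script generally
regarded as needing complex shaping while being written left-to-right.

```python
import unicodedata

# One sample letter per script (the selection is illustrative only).
samples = {
    "Latin a": "a",
    "Greek sigma": "\u03c3",
    "Hebrew alef": "\u05d0",
    "Arabic beh": "\u0628",
    "Devanagari ka": "\u0915",
}

# "L" = left-to-right, "R" = right-to-left, "AL" = right-to-left Arabic letter.
bidi = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

for label, cls in bidi.items():
    print(f"{label}: {cls}")
```

The direction class says nothing about whether glyph shapes depend on their
neighbours, so a single "RTL + CTL" bucket mixes two independent properties.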
Another argument might be: "This is what Microsoft Office does" - and
historically, perhaps that's how this made its way into StarOffice/OpenOffice
and the ODF. But - MSO makes many choices, some right and some wrong; this one
is wrong.
As for Asian languages - what sets them apart? If it's mostly/solely the
possibility of writing vertically - Latin script actually has that (see bug
154756 and links therein). Is there something else justifying their being a
separate group?
---
Note: Bug 42123 also made this claim, but more in the context of requests such
as bug 151215, plus the argument that "complexity" is subjective.