https://bugs.documentfoundation.org/show_bug.cgi?id=59448
--- Comment #7 from [email protected] ---

Here is a high-level description of a proposed change for 5.2 to handle Khmer line breaking. The new algorithm is as follows:

Inserting a ZWSP (U+200B) introduces a line break opportunity directly following the ZWSP, but it also inhibits any dictionary-based breaks up to 3 clusters before and after the ZWSP. A cluster is defined as a base + medials. A medial is any general category M character, and also a coeng+base sequence. Likewise, inserting a WJ (U+2060) inhibits a line break opportunity at that point and up to 3 clusters before and after.

For normal string boundaries (change of script, spaces, etc.) there is a potential 3-cluster inhibition before and after the boundary. But the inhibiting behaviour is only sustained if there is another boundary before or after it whose potential or actual inhibition range overlaps the first boundary's 3-cluster potential range. So, for example, if there is a space followed by 5 clusters and then another space, the two potential inhibition ranges overlap, the inhibition becomes actual, and there will be no dictionary breaks in that run. But if the run is extended to 6 clusters, the ranges don't overlap, they collapse, and dictionary breaking can occur anywhere in the run.

A further change is that there is a class of characters (etc., repeat) that should never be broken off from the word they follow. Such breaks are inhibited.

This change is being implemented as a patch against the ICU library in LibreOffice, where we can test it and play with it using real data and real projects. If there is a user consensus that this works well, we will propose it as a change to ICU. The change has been written to be generic so that it could, potentially, be used for other scripts should that be found to be beneficial.

While we are at it, there are plans to review the Khmer breaking dictionary.

Watch this space for a patch/gerrit commit so that you can go and play.
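
To make the cluster and overlap rules above concrete, here is a minimal Python sketch. It is not the ICU patch; the function names `clusters` and `inhibited_gaps` and the `window` parameter are hypothetical, chosen only for illustration. It also makes two assumptions that the description does not settle: a break falling exactly 3 clusters from a boundary is treated as allowed, and a "gap" `k` means the position after the first `k` clusters of a run.

```python
import unicodedata

COENG = "\u17d2"  # KHMER SIGN COENG


def clusters(text):
    """Split text into clusters: a base followed by its medials.

    A medial is any general category M character, or a coeng+base sequence.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch
            # A coeng pulls the following base into the same cluster.
            if ch == COENG and i + 1 < len(text):
                out[-1] += text[i + 1]
                i += 1
        else:
            out.append(ch)
        i += 1
    return out


def inhibited_gaps(n, left_actual=False, right_actual=False, window=3):
    """Return the gaps (1..n-1) in a run of n clusters where dictionary
    breaks are inhibited.

    left_actual / right_actual: True if that end of the run is a ZWSP or WJ
    (inhibition always applies); False for a normal boundary such as a space
    or a change of script (inhibition is only potential).
    """
    left = set(range(1, min(window, n)))
    right = set(range(max(n - window + 1, 1), n))
    # The two windows overlap when the run is shorter than 2 * window clusters.
    overlap = n < 2 * window
    inhibited = set()
    if left_actual or overlap:
        inhibited |= left
    if right_actual or overlap:
        inhibited |= right
    return inhibited
```

Under these assumptions, `inhibited_gaps(5)` covers every gap in a 5-cluster run between two spaces, while `inhibited_gaps(6)` is empty, which matches the 5-versus-6 example in the description.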
-- You are receiving this mail because: You are the assignee for the bug.
_______________________________________________ Libreoffice-bugs mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/libreoffice-bugs
