Dear Jonathan,

In 1995 I organized a conference at Technion University in Haifa, and I had a famous linguist and member of the Hebrew Academy, Uzzi Ornan (he is still alive and well, at the age of 97!), who gave a presentation on the hyphenation of Hebrew.
When I saw your message I placed the video of the presentation on YouTube: https://youtu.be/VmEHaS_f_eE

Basically, the problem with Hebrew is that it is sometimes used in abjad mode and sometimes in phonographic mode. In abjad mode, the eye has to see the whole word to analyze it morphologically, and breaking the word can hinder reading a lot. In phonographic mode (for example, for foreign words), hyphenation is possible without slowing down reading too much.

The problem, of course, is that hyphenation patterns apply everywhere, and it is hard to distinguish abjad words from phonographic ones. It is not impossible, but it is difficult, and it requires a large hyphenated corpus.

Are you ready to do some work? For example, if I supply you with the list of Hebrew Wiktionary entries (there are about 22 thousand of them), are you willing to mark those that are abjad with an asterisk and give the hyphenation for those that are phonographic?

Examples:

פּוֹלִיפוֹנִיָה is phonographic: it is a Greek word, and it can be hyphenated as פּוֹ-לִי-פוֹ-נִיָה
סֵפֶר is abjad: it should not be hyphenated

Let me know what you are planning/willing to do.

Cheers
Yannis

> On 28 Feb 2021, at 10:52, Yonatan Zilpa <[email protected]> wrote:
>
> Dear Arthur,
> Thanks a lot for your willingness to help. Hebrew is an RTL (Right to Left)
> language. Thus the default English hyphenation pattern doesn't work in
> Hebrew. First I would like to adjust the lines in such a way that long words
> would be split between lines automatically. Second I would like to write a
> hyphenation pattern file for Hebrew language to hyphenate well known
> hyphenated words.
>
> Kind regards,
> Jonathan Zilpa
>
> On Saturday, 27 Feb 2021 at 23:09, Arthur Rosendahl <[email protected]> wrote:
> Dear Jonatan Zilpa,
>
> On Sun, Jan 31, 2021 at 07:08:48PM +0200, Yonatan Zilpa wrote:
> > Dear Mojca Miklavec / Arthur Reutenauer,
> > I would like to write a hyphenation pattern for Hebrew for XeLaTex /
> > Polyglossia.
> > May you please help me by providing guidance on how to do this.
>
> We can help you with that, but note that Mojca and I are only
> responsible for distributing the patterns, we may not be the most
> knowledgeable ones about hyphenation in one particular language. I do,
> however, know that Hebrew is not normally hyphenated, so can you
> give a little more details on what you’d like to do?
>
> Best,
>
> Arthur Rosendahl (né Reutenauer)

<http://www.imt-atlantique.fr/>
Yannis HARALAMBOUS
Professor
Computer Science Department
UMR CNRS 6285 Lab-STICC
<http://perso.telecom-bretagne.eu/yannisharalambous/>
<https://twitter.com/y_haralambous>
<https://www.linkedin.com/in/yannis-haralambous-5529073?trk=hp-identity-name>
Technopôle Brest-Iroise CS 83818
29238 Brest Cedex 3, France
A school of IMT <http://www.imt.fr/>

- We're going always.
- 'We're going always.'
- Totally.
- That's not actually a sentence.
- Well it's got a verb in it.
(Doctor Who)
