> Here it is 
> http://extensions.services.openoffice.org/en/project/dict-be-official

  Thanks.

> The file itself is in cp1251 and needs conversion to UTF-8
> iconv -f cp1251 -t UTF-8 < ./hyph_be_BY.dic > ./hyph_be_BY.txt
> + some hand editing to put the content inside \patterns{}

  Thanks, I know how to do that :-)

> According to the comment on line 1414, the intention behind including such
> awkward patterns was to prohibit hyphenation if any part is composed solely
> of consonants.

  There’s something odd anyway.  I still suspect the actual list of
patterns does not reflect the intention of the author.

> Ok, I'll ask.

  Thanks.  I don’t mind being copied on the conversation, even if it is
in Belarusian.  You should contact Sviatlana Liasovich as well, since
she’s mentioned as having made corrections; in fact I think it would be
accurate to consider her as the sole author of the OpenOffice file,
since I can’t discern any trace of the original patterns.

>>   That’s correct, but actually I would just write
>> 
>>      д2ж
>>      д2з
>>      .пад3
>> 
>>   Using lower numbers to begin with makes it easier to refine later.
>> 
>>   That being said, is пад really always a prefix?
> 
> This would make life too easy :) In some words it is a part of the root and 
> is hyphenated differently.
> E.g.: па-да-ру-нак, па-дзел, вы-па-дак, па-да-плё-ка.

  OK, that’s what I suspected :-)  In that case it’s probably safer to
stick to

        д2ж
        д2з
        .па2д3ж
        .па2д3з

and input падзел as an exception: \hyphenation{па-дзел}.

  You need an even number after .па because of patterns of the type CVn,
with n an odd number to allow a break; the OpenOffice patterns have C8V3,
but I would recommend CV1.

> Hyphenation right before й or ў is prohibited at all times, no exceptions. So 
> 8 will be just right, I believe.

  That sounds right.  It’s of course all right to use 8 when a break is
really prohibited, but the current files use far too many of them.

        Best,

                Arthur
