Re: [tex-hyphen] Unicode patterns for Unicode in Pdflatex/Babel?

Daniel Stender Fri, 17 Jun 2011 03:33:31 -0700

Hi guys,

thanks for the rich comments! You guessed right, it's about Sanskrit,
but it's too early to present anything (Dominik already knows).


The other person, who asked me whether I could find out about running
custom Unicode patterns with Pdflatex/ucs.sty/Babel as well, is about to
switch over to Xelatex/Polyglossia now, so this is no longer a pressing
issue. Thanks again for the comments anyway.

Greetings,
Daniel Stender

On 12.06.2011 23:27, Mojca Miklavec wrote:
> I'm not sure if you subscribed to the list or not. Arthur forgot to CC
> you in case you aren't.
> 
> Mojca
> 
> ---------- Forwarded message ----------
> From: Arthur Reutenauer <[email protected]>
> Date: Sun, Jun 12, 2011 at 20:38
> Subject: Re: [tex-hyphen] Unicode patterns for Unicode in Pdflatex/Babel?
> To: "About TeX hyphenation patterns." <[email protected]>
> Cc: "[email protected]" <[email protected]>
> 
>  Hi Daniel,
> 
>> I am trying to make myself clear: I would like to know if it wouldn't
>> be possible to employ a custom Unicode hyphenation rules/pattern file
>> also for Pdflatex/Babel when the text there is Unicode, too
> 
>  That shouldn't be necessary: the patterns that are presented to
> pdfTeX are encoded in some *font* encoding, distinct from the input
> encoding that you use in your document.  The inputenc and fontenc
> packages take care of mapping the code positions from the input
> encoding to the font encoding, be it UTF-8 or an 8-bit encoding.  As
> far as the patterns are concerned, they're always encoded in UTF-8 in
> the different hyph-<lang>.tex, and are converted on the fly when input
> by pdfTeX, at format generation time, to whatever font encoding is
> appropriate for the language at hand.  The inputenc / fontenc packages
> then do the job for you, and you can use any encoding you wish in your
> document.
> 
> However, I'm going to venture a wild guess and assume that the
> language you're interested in is Sanskrit, a language which actually
> has patterns disabled for pdfTeX, because we couldn't determine what
> font encoding was appropriate when the patterns were submitted: for
> the vast majority of languages that had patterns when Mojca and I took
> over work on hyphenation files three years ago, there was one single
> 8-bit encoding, that was used by both the pattern file and the Babel
> support files.  Several languages, though, have been added in the
> meantime, including Sanskrit, that had no dedicated 8-bit encoding that we
> could use(*).  We thus decided to make them available for
> Unicode-aware TeX engines only; hence, you don't have access to them
> from pdfTeX.  But if you have a reason to want to use them, we'll
> gladly make them available as well.  That won't be a problem at all;
> we simply never considered the issue because we didn't think it would
> come up -- Mojca, what do you think?
> 
> Arthur
> 
> (*) Note that packages to typeset Devanagari in TeX, as well as
> several other Indic scripts, have existed for a long time, but they
> didn't have any hyphenation patterns attached.  These have only been
> added recently, by different contributors and after Mojca found out
> that OpenOffice shipped many pattern files for modern Indian
> languages.  All the files were encoded using UTF-8.
>
> ---------- Forwarded message ----------
> From: Mojca Miklavec <[email protected]>
> Date: Sun, Jun 12, 2011 at 23:25
> Subject: Re: [tex-hyphen] Unicode patterns for Unicode in Pdflatex/Babel?
> To: "About TeX hyphenation patterns." <[email protected]>
> 
> As Arthur pointed out, you cannot simply load a hyph-foo.tex file with
> UTF-8 patterns without loading other macros to handle UTF-8 *before*
> the patterns themselves. (You may try and see what happens.)
> 
> But we tried to make sure that all patterns that can reasonably work
> in pdfTeX do work. Indic languages, Sanskrit, "Ethiopic", Lao etc.
> were disabled in pdfTeX because we had no idea how to handle them.
> 
> I would really like to know what language Daniel is referring to
> before any further discussion.
> 
> If it is really about Sanskrit ... I have no idea how one can type it
> in UTF-8 in pdfTeX. All the examples about Devanagari that I found
> were using ASCII and were pointing to Velthuis' package claiming:
> 
> % Hyphenation
> % ~~~~~~~~~~~
> % The responsibility for hyphenating Devanagari text is taken over
> % completely by the preprocessor, devnag.c. The preprocessor inserts
> % discretionary hyphenation points (\-) in all the places it thinks are
> % appropriate.
> 
> or whatever other older package.
> 
> I would need more input to provide any reasonable answer.
> 
> Mojca

-- 
http://www.danielstender.com/granthinam/
GPG key ID: 1654BD9C
