Hi guys, thanks for the detailed comments! You guessed right, it's about Sanskrit, but it's too early to present anything (Dominik already knows).
The other person, who asked me whether I could find out something about
running custom Unicode patterns with pdfLaTeX/ucs.sty/Babel as well, is
about to switch over to XeLaTeX/Polyglossia now, so this is no longer an
important issue. Thanks again for the comments anyway.

Greetings,
Daniel Stender

On 12.06.2011 23:27, Mojca Miklavec wrote:
> I'm not sure if you subscribed to the list or not. Arthur forgot to CC
> you in case you aren't.
>
> Mojca
>
> ---------- Forwarded message ----------
> From: Arthur Reutenauer <[email protected]>
> Date: Sun, Jun 12, 2011 at 20:38
> Subject: Re: [tex-hyphen] Unicode patterns for Unicode in Pdflatex/Babel?
> To: "About TeX hyphenation patterns." <[email protected]>
> Cc: "[email protected]" <[email protected]>
>
> Hi Daniel,
>
>> I am trying to make myself clear: I would like to know if it wouldn't be
>> possible to employ a custom
>> Unicode hyphenation rules/pattern file also for Pdflatex/Babel when the text
>> there is Unicode, too
>
> That shouldn't be necessary: the patterns that are presented to
> pdfTeX are encoded in some *font* encoding, distinct from the input
> encoding that you use in your document. The inputenc and fontenc
> packages take care of mapping the code positions from the input
> encoding to the font encoding, be it UTF-8 or an 8-bit encoding. As
> far as the patterns are concerned, they're always encoded in UTF-8 in
> the different hyph-<lang>.tex, and are converted on the fly when input
> by pdfTeX, at format generation time, to whatever font encoding is
> appropriate for the language at hand. The inputenc / fontenc packages
> then do the job for you, and you can use any encoding you wish in your
> document.
>
> However, I'm going to venture a wild guess and assume that the
> language you're interested in is Sanskrit, a language which actually
> has patterns disabled for pdfTeX, because we couldn't determine what
> font encoding was appropriate when the patterns were submitted: for
> the vast majority of languages that had patterns when Mojca and I took
> over work on hyphenation files three years ago, there was one single
> 8-bit encoding, that was used by both the pattern file and the Babel
> support files. Several languages, though, have been added in the
> meantime, including Sanskrit, that had no dedicated 8-bit encoding that
> we could use(*). We thus decided to make them available for
> Unicode-aware TeX engines only; hence, you don't have access to them
> from pdfTeX. But if you have a reason to want to use them, we'll
> gladly make them available as well. That won't be a problem at all;
> we only never considered the issue because we didn't think it would
> come up -- Mojca, what do you think?
>
> Arthur
>
> (*) Note that packages to typeset Devanagari in TeX, as well as
> several other Indic scripts, have existed for a long time, but they
> didn't have any hyphenation patterns attached. These have only been
> added recently from different contributors, and when Mojca found out
> that OpenOffice shipped many pattern files for modern Indian
> languages. All the files were encoded using UTF-8.
>
> ---------- Forwarded message ----------
> From: Mojca Miklavec <[email protected]>
> Date: Sun, Jun 12, 2011 at 23:25
> Subject: Re: [tex-hyphen] Unicode patterns for Unicode in Pdflatex/Babel?
> To: "About TeX hyphenation patterns." <[email protected]>
>
> As Arthur pointed out, you cannot simply load a hyph-foo.tex file with
> UTF-8 patterns without loading other macros to handle UTF-8 *before*
> the patterns themselves. (You may try and see what happens.)
>
> But we tried to make sure that all patterns that can reasonably work
> in pdfTeX do work. Indic languages, Sanskrit, "Ethiopic", Lao etc.
> were disabled in pdfTeX because we had no idea how to handle them.
>
> I would really like to know what language Daniel is referring to
> before any further discussion.
>
> If it is really about Sanskrit ... I have no idea how one can type it
> in UTF-8 in pdfTeX. All the examples about Devanagari that I found
> were using ASCII and were pointing to Velthuis' package claiming:
>
> % Hyphenation
> % ~~~~~~~~~~~
> % The responsibility for hyphenating Devanagari text is taken over
> % completely by the preprocessor, devnag.c. The preprocessor inserts
> % discretionary hyphenation points (\-) in all the places it thinks are
> % appropriate.
>
> or whatever other older package.
>
> I would need more input to provide any reasonable answer.
>
> Mojca

-- 
http://www.danielstender.com/granthinam/
GPG key ID: 1654BD9C
