Re: [NTG-context] [XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns. / GREEK

2006-06-15 Thread Hans Hagen
Peter Heslin wrote:
 Hans Hagen writes:

   
 ctxtools --pat [en nl agr ...]
 ctxtools --pat --utf [en nl agr ...]

 the greek conversions were done with the help of greek language users
 on the context list, so in case of trouble i cc there; bugs need to
 be fixed indeed
 

 Thanks for the tips.  I have taken a closer look at the Greek patterns,
 and it seems as though they have not only small problems, but also major
 problems.  (They will fail to find most hyphenation points before
 accented vowels.)  I will try to come up with a patch, but I don't know
 any Ruby, so it will be an interesting challenge -- the changes required
 go beyond tweaking the existing code.

 The characters in the file lang-agr.pat are precomposed, Unicode
 normalization form C.  But I'd like to support both normalization forms
 C and D, if possible, in the same pattern file.  Is that goal compatible
 with Context?
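
For readers less familiar with the two forms: precomposed characters correspond to normalization form C, while form D splits each accented letter into a base letter plus combining marks. Below is a minimal Python sketch of how a single pattern could be written out in both spellings. The helper name both_forms and the toy pattern are illustrative assumptions and are not taken from lang-agr.pat. Whether the engine then treats a combining mark as a letter for hyphenation purposes is a separate question that this sketch does not address.

```python
import unicodedata


def both_forms(pattern: str) -> list[str]:
    """Return the NFC and NFD spellings of a pattern, without duplicates."""
    forms = {unicodedata.normalize(form, pattern) for form in ("NFC", "NFD")}
    return sorted(forms)


# Toy pattern (hypothetical): a break point before alpha with tonos, U+03AC.
toy = "1\u03ac"
variants = both_forms(toy)

# Show the code points of each variant so the difference is visible.
for variant in variants:
    print([f"U+{ord(ch):04X}" for ch in variant])
```

A pattern containing only unaccented letters yields a single variant, so only the accented patterns would be doubled in such a file.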

   
this is more related to (xe)tex than to context; i leave that to the 
greek experts on the context list

Hans

-- 

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
 tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-

___
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context


Re: [NTG-context] [XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns.

2006-06-13 Thread Jonathan Kew
On 13 Jun 2006, at 8:25 am, Hans Hagen wrote:

 On a more general level, if both ConTeXt and XeTeX are engaged in
 converting legacy TeX hyphenation patterns to utf-8, should they be
 coordinated in order to avoid duplication of effort?


 anyone can use the patterns; of course bugs need to be sorted out, but
 given my experiences with pattern maintenance i will not drop them from
 context; too much has gone wrong in the past; but you can consider them
 to be generic so indeed we can avoid duplication of work.

Indeed I have no desire to duplicate work. :)

My main concern at this point relates to packaging and co-ordination  
between the different macro packages that load patterns; we can't  
expect latex users to be dependent on having context installed, or  
vice versa. Patterns belong in a base tex installation, where they  
can be available to any higher-level macro package.

This needs to be sorted out among a wider group than this mailing list.

JK



Re: [NTG-context] [XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns.

2006-06-13 Thread Hans Hagen
Peter Heslin wrote:
 A little while ago, I said that I hoped to convert Dimitrios Filippou's
 ancient Greek hyphenation patterns (the elhyphen package) to utf-8, in
 order to use them with xetex.  Before thinking about starting this work,
 I decided to look to see if anyone else had done it, and I came across
 something interesting in ConTeXt, which is not a package I normally use.

 There appears to be a whole subdirectory in the ConTeXt distribution
 that is full of utf-8 hyphenation patterns, including Filippou's ancient
 Greek ones, but also including German, French, etc.  They are in the
 file: http://www.pragma-ade.com/context/current/cont-tmf.zip, in the
 tex/context/patterns directory.

 Can anyone who knows about ConTeXt explain about where these patterns
 come from and how it is that context manages to use these patterns?  (I
 thought that non-xetex TeX could only use single-byte encoded patterns.)
   
some time ago i decided to ship patterns with context because

(1) there is no sound infrastructure in the tex world for managing patterns
(2) i need encoding neutral patterns [most patterns are ec only]
(3) i want control over what gets loaded in context
(4) i wanted to get rid of the yearly disappearing, renamed and changed
patterns
(5) i wanted patterns that were not, in a sense, hard wired latex
patterns

 If there is a script that was used to convert these from the source to
 utf-8, is it available?  A quick glance at the ancient greek patterns
 (in the file lang-agr.pat) shows that there is a bug in the conversion
 that I'd like to report and fix.
   
ctxtools --pat [en nl agr ...]
ctxtools --pat --utf [en nl agr ...]

the greek conversions were done with the help of greek language users
on the context list, so in case of trouble i cc there; bugs need to
be fixed indeed

in ctxtools.rb you can grep for 'agr' and see which conversions take
place for greek
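
As a rough illustration of the general kind of conversion involved (this is not the ctxtools.rb logic, which is not shown in this thread): legacy pattern files often write 8-bit characters in TeX's ^^xx notation, and those have to become real Unicode characters before a UTF-8 engine can use them. The Python sketch below assumes the bytes can be decoded as ISO-8859-1. That is only a stand-in, since real pattern files may use EC or a transliteration and would need their own mapping. The function name expand_hat_notation and the sample pattern are hypothetical.

```python
import re

# TeX's ^^xx notation: two lowercase hex digits naming one 8-bit character.
HAT = re.compile(r"\^\^([0-9a-f]{2})")


def expand_hat_notation(pattern: str, encoding: str = "latin-1") -> str:
    """Replace each ^^xx with the character that byte denotes in `encoding`.

    The default encoding is an assumption made for this sketch only.
    """
    def to_char(match: re.Match) -> str:
        return bytes([int(match.group(1), 16)]).decode(encoding)

    return HAT.sub(to_char, pattern)


# Toy legacy pattern (hypothetical), not taken from any real pattern file.
legacy = "1^^e4"
converted = expand_hat_notation(legacy)

# Encoding the result as UTF-8 gives the bytes that would be written to disk.
utf8_bytes = converted.encode("utf-8")
```

Passing a different codec name as the second argument covers other single-byte encodings that Python knows. Encodings it does not know would need an explicit lookup table instead.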

more info can be found in:

http://www.pragma-ade.com/general/manuals/mpattern.pdf

(also published in tugboat)

there is a file lang-all.xml in the context distribution

 On a more general level, if both ConTeXt and XeTeX are engaged in
 converting legacy TeX hyphenation patterns to utf-8, should they be
 coordinated in order to avoid duplication of effort?

   
anyone can use the patterns; of course bugs need to be sorted out, but
given my experiences with pattern maintenance i will not drop them from
context; too much has gone wrong in the past; but you can consider them
to be generic so indeed we can avoid duplication of work.

Hans



Re: [NTG-context] [XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns.

2006-06-13 Thread Hans Hagen
Jonathan Kew wrote:
 On 13 Jun 2006, at 8:25 am, Hans Hagen wrote:

   
 On a more general level, if both ConTeXt and XeTeX are engaged in
 converting legacy TeX hyphenation patterns to utf-8, should they be
 coordinated in order to avoid duplication of effort?


   
 anyone can use the patterns; of course bugs need to be sorted out, but
 given my experiences with pattern maintenance i will not drop them from
 context; too much has gone wrong in the past; but you can consider them
 to be generic so indeed we can avoid duplication of work.
 

 Indeed I have no desire to duplicate work. :)

 My main concern at this point relates to packaging and co-ordination  
 between the different macro packages that load patterns; we can't  
 expect latex users to be dependent on having context installed, or  
 vice versa. Patterns belong in a base tex installation, where they  
 can be available to any higher-level macro package.
   
well, the problem is that until now, most pattern files were basically
latex oriented files; the same is kind of true with fonts: changes in
related files and names take place, are synced with latex, and then
bite context users; i've kind of given up on that

 This needs to be sorted out among a wider group than this mailing list
   
well, installing the lang-* pat files only is an option, as is adding
tex/context/patterns to the xetex input path variable (although i
believe that the tree is searched anyway);

also, given what people have to install nowadays, installing the context
package is not that big a burden (xetex binaries and associated libs
are pretty big themselves anyway)

Hans
