Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-12-08 Thread Jan Tosovsky
On 2013-12-01 Khaled Hosny wrote:
 On Sun, Dec 01, 2013 at 11:21:30AM +0200, Khaled Hosny wrote:
 
 Interestingly, after I patched Sorts Mill (a FontForge fork) to avoid
 duplicates[1] I ended up with a ‘dotlessi.sc’ glyph, as it turns out
 the font has a dotlessi → regular smallcap i later on, so that is 
 where FontLab gets the glyph name, too.
 
 I’ll try to port this patch to LuaTeX later.

Thanks for handling this! 

When can I expect this fix in the luatex.dll that the first-setup script
updates on my local machine?

Will it be in a following minor 0.77 update or in 0.80 later this year?

Jan

Btw, FontForge could be patched as well ;-)

___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___

Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-12-08 Thread Khaled Hosny
On Sun, Dec 08, 2013 at 09:19:26PM +0100, Jan Tosovsky wrote:
 On 2013-12-01 Khaled Hosny wrote:
  On Sun, Dec 01, 2013 at 11:21:30AM +0200, Khaled Hosny wrote:
  
  Interestingly, after I patched Sorts Mill (a FontForge fork) to avoid
  duplicates[1] I ended up with a ‘dotlessi.sc’ glyph, as it turns out
  the font has a dotlessi → regular smallcap i later on, so that is 
  where FontLab gets the glyph name, too.
  
  I’ll try to port this patch to LuaTeX later.
 
 Thanks for handling this! 
 
 When can I expect this fix in luatex.dll updated on my local machine
 using the first-setup script?

I pushed the patch to LuaTeX trunk, so it should be in the next release,
but no idea about which or when.

Regards,
Khaled

Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-12-01 Thread Khaled Hosny
On Wed, Nov 27, 2013 at 11:35:01PM +0100, Jan Tosovsky wrote:
 On 2013-11-27 Jan Tosovsky wrote:
  On 2013-11-27 Hans Hagen wrote:
   On 11/27/2013 10:20 PM, Jan Tosovsky wrote:
    On 2013-11-27 Hans Hagen wrote:
    On 11/27/2013 9:53 PM, Jan Tosovsky wrote:
    On 2013-11-27 Hans Hagen wrote:
    On 11/27/2013 8:44 PM, Jan Tosovsky wrote:
   
    during my attempts to patch the Palatino's dotless 'i' I found
    that this font is parsed incorrectly by ConTeXt.
   
    Comparing index/name info of individual glyphs in the font
    software and resulting pala.tma file there is the following
    difference:
   
    Index | Name - font    | Name - tma
    1110  | dotlessi.smcp  | i.sc (1)
    1170  | i.smcp         | i.sc (2)
   
    The first one should have IMHO a different name.
    The same name for two glyphs might be dangerous.
   
  
   the fact that there are two i.sc in the font is suspicious ... best
   check the font in fontforge ... one never know what kind of things
   other programs do
  
  Hmm, FontForge glyphs naming corresponds to what we can observe in the
  ConTeXt (doubled i.sc). My previous analysis was based on FontLab. I am
  confused now...
 
 Actually, there are no names of these glyphs available in the font so they
 are calculated(!)

Right, the font (like many MS fonts) uses a version 3 ‘post’ table, which
includes no glyph names at all. Software that needs glyph names (e.g.
LuaTeX, since you can’t embed a font in PDF without glyph names, else
printers would go nuts) has to generate them. Some software will use dumb
names (glyph1 etc., based on the glyph ID); others will try to guess more
sensible names from the OpenType layout tables.
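
As a rough illustration of the point above, the ‘post’ table version can be
read directly from the font file. This is only a sketch using the Python
standard library: the function name post_table_version is made up for
illustration, and it assumes a plain single-font sfnt file (not a .ttc
collection).

```python
import struct


def post_table_version(data: bytes) -> float:
    """Return the 'post' table version of an sfnt font given its raw bytes."""
    # sfnt header: uint32 version, uint16 numTables, then three uint16 fields.
    num_tables = struct.unpack_from(">H", data, 4)[0]
    for i in range(num_tables):
        # Each table record: 4-byte tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i
        )
        if tag == b"post":
            # The table starts with a 16.16 fixed-point version number.
            raw = struct.unpack_from(">I", data, offset)[0]
            return raw / 65536.0
    raise ValueError("font has no 'post' table")
```

Going by the description above, calling this on the bytes of the font under
discussion would be expected to give 3.0, i.e. a table without glyph names.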

 Each of two programs uses a different method. FontLab method is based on
 layout tables - GPOS, GSUB, GDEF (it somehow detects that both glyps
 differs). The FontForge method is unclear and seems to be buggy.

FontForge uses the layout tables, too, but this font has a catch: it has
two i → some glyph substitutions in the ‘smcp’ feature, one to a dotted
small cap for Turkish (under the TRK tag) and a regular one. FontForge
just names the resulting glyph ‘i.sc’ in both cases, since it does not
seem to check for duplicates, assuming that only one such substitution
can happen per feature. LuaTeX uses (a subset of) FontForge internally,
so you get the same bug.
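
The effect described above can be sketched in a few lines. This is not
FontForge's code, just a hypothetical model: naive_names and the
(source name, target glyph index) pairs are assumptions for illustration,
with the glyph indices taken from the table quoted earlier in the thread.

```python
from collections import Counter


def naive_names(substitutions):
    """Name every 'smcp' target after its source glyph, with no duplicate check."""
    return {target: source + ".sc" for source, target in substitutions}


# Two substitutions from 'i' in the same feature (Turkish and default).
subs = [("i", 1110), ("i", 1170)]
names = naive_names(subs)

# Report every generated name that ended up on more than one glyph.
duplicates = [name for name, count in Counter(names.values()).items() if count > 1]
print(names)       # both glyphs are called 'i.sc'
print(duplicates)
```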

It is not clear to me how FontLab arrived at the dotlessi name from the
GSUB table, but I need to look into the font a bit more closely.

Regards,
Khaled

Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-12-01 Thread Khaled Hosny
On Sun, Dec 01, 2013 at 11:21:30AM +0200, Khaled Hosny wrote:
 It is not clear to me how FontLab arrived at the dotlessi name from the
 GSUB table, but I need to look into the font a bit more closely.

Interestingly, after I patched Sorts Mill (a FontForge fork) to avoid
duplicates[1], I ended up with a ‘dotlessi.sc’ glyph. It turns out the
font also has a dotlessi → regular small-cap i substitution later on, so
that is where FontLab gets the glyph name, too.
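
One way such a duplicate check could produce ‘dotlessi.sc’ is sketched below.
This is not the actual patch in [1]; the function name_smallcaps, the order of
the substitutions, and the mapping of glyph indices to substitutions are all
assumptions made for illustration.

```python
def name_smallcaps(substitutions):
    """Derive '.sc' names from (source name, target glyph index) pairs.

    A name is only assigned when the glyph is still unnamed and the
    candidate name is not yet used by another glyph.
    """
    names = {}    # glyph index -> generated name
    used = set()  # names already handed out
    for source, target in substitutions:
        candidate = source + ".sc"
        if target in names or candidate in used:
            continue  # glyph already named, or the name would be a duplicate
        names[target] = candidate
        used.add(candidate)
    return names


subs = [
    ("i", 1170),         # assumed: Turkish (TRK) dotted small cap
    ("i", 1110),         # assumed: regular small cap i
    ("dotlessi", 1110),  # later substitution reaching the same glyph
]
print(name_smallcaps(subs))
```

With these assumed inputs the second glyph is skipped at first, because
‘i.sc’ is taken, and only picks up a name from the later dotlessi
substitution.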

I’ll try to port this patch to LuaTeX later.

Regards,
Khaled

[1] 
https://bitbucket.org/sortsmill/sortsmill-tools/commits/a7fdc1cd13d94659fe90848d0fe2878bbdd54d60

Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-28 Thread Hans Hagen

On 11/27/2013 11:35 PM, Jan Tosovsky wrote:

On 2013-11-27 Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 10:20 PM, Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 9:53 PM, Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 8:44 PM, Jan Tosovsky wrote:


during my attempts to patch the Palatino's dotless 'i' I found
that this font is parsed incorrectly by ConTeXt.

Comparing index/name info of individual glyphs in the font
software and resulting pala.tma file there is the following
difference:

Index | Name - font    | Name - tma
1110  | dotlessi.smcp  | i.sc (1)
1170  | i.smcp         | i.sc (2)

The first one should have IMHO a different name.
The same name for two glyphs might be dangerous.




the fact that there are two i.sc in the font is suspicious ... best
check the font in fontforge ... one never know what kind of things
other programs do


Hmm, FontForge glyphs naming corresponds to what we can observe in the
ConTeXt (doubled i.sc). My previous analysis was based on FontLab. I am
confused now...


Actually, there are no names of these glyphs available in the font so they
are calculated(!)
Each of two programs uses a different method. FontLab method is based on
layout tables - GPOS, GSUB, GDEF (it somehow detects that both glyps


that is okay to make names unique, although there can still be multiple 
variants so in fact i.smcp and i.ss01.4 are valid names then, but .smcp 
and .onum are not understood by name parsers (for adobe glyph names)



differs). The FontForge method is unclear and seems to be buggy. But we


some kind of numbering would make more sense i.1 or so


should blame rather the font itself as it is the primary cause of these
problems (= missing glyph names).


indeed
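
To make the remark about suffixes concrete: in the Adobe glyph naming
convention, everything from the first period onwards is a suffix that is
dropped when a name is mapped back to a character. A minimal sketch of that
split follows; split_glyph_name is a hypothetical helper, not part of any
tool mentioned in this thread.

```python
def split_glyph_name(name):
    """Split a glyph name into (base, suffix) at the first period."""
    if name.startswith("."):
        return name, ""  # names such as '.notdef' have no base to split off
    base, _, suffix = name.partition(".")
    return base, suffix


for name in ("i.sc", "i.smcp", "i.ss01.4", "dotlessi"):
    print(name, "->", split_glyph_name(name))
```

A numbered variant such as i.1 splits the same way, so the base name stays
recoverable whichever suffix scheme is used.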

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-


[NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Jan Tosovsky
Dear All,

during my attempts to patch the Palatino's dotless 'i' I found that this
font is parsed incorrectly by ConTeXt.

Comparing index/name info of individual glyphs in the font software and
resulting pala.tma file there is the following difference:

Index | Name - font    | Name - tma
1110  | dotlessi.smcp  | i.sc (1)
1170  | i.smcp         | i.sc (2)

(2) - this is a composite character which consists of dotlessi.smcp and dot.

The first one should have IMHO a different name, e.g. dotlessi.sc (to keep
conventions). The same name for two glyphs might be dangerous.

Regards, Jan



Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Hans Hagen

On 11/27/2013 8:44 PM, Jan Tosovsky wrote:

Dear All,

during my attempts to patch the Palatino's dotless 'i' I found that this
font is parsed incorrectly by ConTeXt.

Comparing index/name info of individual glyphs in the font software and
resulting pala.tma file there is the following difference:

Index | Name - font    | Name - tma
1110  | dotlessi.smcp  | i.sc (1)
1170  | i.smcp         | i.sc (2)

(2) - this is a composite character which consist of dotlessi.smcp and dot.

The first one should have IMHO a different name, e.g. dotlessi.sc (to keep
conventions). The same name for two glyphs might be dangerous.


the font pala.ttf has two entries i.sc and i see no reference to *.smcp

(mtxrun --script --save pala.ttf)

naming of glyphs is somewhat fuzzy and not always consistent in fonts 
but i fear there is not much we can do here (apart from using palatino 
nova instead)


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-


Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Jan Tosovsky
On 2013-11-27 Hans Hagen wrote:
 On 11/27/2013 8:44 PM, Jan Tosovsky wrote:
 
  during my attempts to patch the Palatino's dotless 'i' I found that
  this font is parsed incorrectly by ConTeXt.
 
  Comparing index/name info of individual glyphs in the font software
  and resulting pala.tma file there is the following difference:
 
  Index | Name - font    | Name - tma
  1110  | dotlessi.smcp  | i.sc (1)
  1170  | i.smcp         | i.sc (2)
 
  (2) - this is a composite character which consist of dotlessi.smcp
  and dot.
 
  The first one should have IMHO a different name, e.g. dotlessi.sc (to
  keep conventions). The same name for two glyphs might be dangerous.
 
 the font pala.ttf has two entries i.sc and i see no reference to *.smcp

The version of my Palatino is 5.0 (I run on Win7)
It is located at c:/windows/fonts/pala.ttf  

There is no 'i.sc' glyph available according to the font software, only
those .smcp, listed in the smcp6 table.

As there are only .sc names in the TMA file, I suppose there is some kind of
name normalization. But not very precise...

 (mtxrun --script --save pala.ttf)

This returns an error:
c:/windows/fonts/pala.ttf:1: unexpected symbol

Jan



Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Hans Hagen

On 11/27/2013 9:53 PM, Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 8:44 PM, Jan Tosovsky wrote:


during my attempts to patch the Palatino's dotless 'i' I found that
this font is parsed incorrectly by ConTeXt.

Comparing index/name info of individual glyphs in the font software
and resulting pala.tma file there is the following difference:

Index | Name - font    | Name - tma
1110  | dotlessi.smcp  | i.sc (1)
1170  | i.smcp         | i.sc (2)

(2) - this is a composite character which consist of dotlessi.smcp
and dot.

The first one should have IMHO a different name, e.g. dotlessi.sc (to
keep conventions). The same name for two glyphs might be dangerous.


the font pala.ttf has two entries i.sc and i see no reference to *.smcp


The version of my Palatino is 5.0 (I run on Win7)
It is located at c:/windows/fonts/pala.ttf


i checked on windows 8


There is no 'i.sc' glyph available according to the font software, only
those .smcp, listed in the smcp6 table.

As there are only .sc names in the TMA file, I suppose there is some kind of
name normalization. But not very precise...


(mtxrun --script --save pala.ttf)


mtxrun --script font --save pala.ttf


This returns an error:
c:/windows/fonts/pala.ttf:1: unexpected symbol

Jan





--

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-


Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Jan Tosovsky
On 2013-11-27 Hans Hagen wrote:
 On 11/27/2013 9:53 PM, Jan Tosovsky wrote:
  On 2013-11-27 Hans Hagen wrote:
  On 11/27/2013 8:44 PM, Jan Tosovsky wrote:
 
  during my attempts to patch the Palatino's dotless 'i' I found that
  this font is parsed incorrectly by ConTeXt.
 
  Comparing index/name info of individual glyphs in the font software
  and resulting pala.tma file there is the following difference:
 
  Index | Name - font    | Name - tma
  1110  | dotlessi.smcp  | i.sc (1)
  1170  | i.smcp         | i.sc (2)
 
  The first one should have IMHO a different name, e.g. dotlessi.sc
  (to keep conventions). The same name for two glyphs 
  might be dangerous.
 
  the font pala.ttf has two entries i.sc and i see no reference to
  *.smcp
 
  There is no 'i.sc' glyph available according to the font software,
  only those .smcp, listed in the smcp6 table.
 
  As there are only .sc names in the TMA file, I suppose there is some
  kind of name normalization. But not very precise...

 mtxrun --script font --save pala.ttf

I can confirm your observations. In this lua export there is no .smcp, but
doubled i.sc records. Strange. There must be really some kind of
normalization there...

It would be nice to review the corresponding part of the code as it is IMHO
potentially dangerous. 

I felt obliged to report it :-)

Jan



Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Hans Hagen

On 11/27/2013 10:20 PM, Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 9:53 PM, Jan Tosovsky wrote:

On 2013-11-27 Hans Hagen wrote:

On 11/27/2013 8:44 PM, Jan Tosovsky wrote:


during my attempts to patch the Palatino's dotless 'i' I found that
this font is parsed incorrectly by ConTeXt.

Comparing index/name info of individual glyphs in the font software
and resulting pala.tma file there is the following difference:

Index | Name - font    | Name - tma
1110  | dotlessi.smcp  | i.sc (1)
1170  | i.smcp         | i.sc (2)

The first one should have IMHO a different name, e.g. dotlessi.sc
(to keep conventions). The same name for two glyphs
might be dangerous.


the font pala.ttf has two entries i.sc and i see no reference to
*.smcp


There is no 'i.sc' glyph available according to the font software,
only those .smcp, listed in the smcp6 table.

As there are only .sc names in the TMA file, I suppose there is some
kind of name normalization. But not very precise...


mtxrun --script font --save pala.ttf


I can confirm your observations. In this lua export there is no .smcp, but
doubled i.sc records. Strange. There must be really some kind of
normalization there...


the fact that there are two i.sc in the font is suspicious ... best 
check the font in fontforge ... one never knows what kind of things other 
programs do



It would be nice to review the corresponding part of the code as it is IMHO
potentially dangerous.


afaik no magic there

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-


Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Jan Tosovsky
On 2013-11-27 Hans Hagen wrote:
 On 11/27/2013 10:20 PM, Jan Tosovsky wrote:
  On 2013-11-27 Hans Hagen wrote:
  On 11/27/2013 9:53 PM, Jan Tosovsky wrote:
  On 2013-11-27 Hans Hagen wrote:
  On 11/27/2013 8:44 PM, Jan Tosovsky wrote:
 
  during my attempts to patch the Palatino's dotless 'i' I found 
  that this font is parsed incorrectly by ConTeXt.
 
  Comparing index/name info of individual glyphs in the font
  software and resulting pala.tma file there is the following 
  difference:
 
  Index | Name - font    | Name - tma
  1110  | dotlessi.smcp  | i.sc (1)
  1170  | i.smcp         | i.sc (2)
 
  The first one should have IMHO a different name, e.g. dotlessi.sc
  (to keep conventions). The same name for two glyphs
  might be dangerous.
 
 
 the fact that there are two i.sc in the font is suspicious ... best
 check the font in fontforge ... one never know what kind of things
 other programs do

Hmm, FontForge's glyph naming corresponds to what we can observe in
ConTeXt (doubled i.sc). My previous analysis was based on FontLab. I am
confused now...

Jan



Re: [NTG-context] [***SPAM***] Incorrect internal font processing

2013-11-27 Thread Jan Tosovsky
On 2013-11-27 Jan Tosovsky wrote:
 On 2013-11-27 Hans Hagen wrote:
  On 11/27/2013 10:20 PM, Jan Tosovsky wrote:
   On 2013-11-27 Hans Hagen wrote:
   On 11/27/2013 9:53 PM, Jan Tosovsky wrote:
   On 2013-11-27 Hans Hagen wrote:
   On 11/27/2013 8:44 PM, Jan Tosovsky wrote:
  
   during my attempts to patch the Palatino's dotless 'i' I found
   that this font is parsed incorrectly by ConTeXt.
  
   Comparing index/name info of individual glyphs in the font
   software and resulting pala.tma file there is the following
   difference:
  
   Index | Name - font    | Name - tma
   1110  | dotlessi.smcp  | i.sc (1)
   1170  | i.smcp         | i.sc (2)
  
   The first one should have IMHO a different name.
   The same name for two glyphs might be dangerous.
  
 
  the fact that there are two i.sc in the font is suspicious ... best
  check the font in fontforge ... one never know what kind of things
  other programs do
 
 Hmm, FontForge glyphs naming corresponds to what we can observe in the
 ConTeXt (doubled i.sc). My previous analysis was based on FontLab. I am
 confused now...

Actually, there are no names for these glyphs in the font, so they
are calculated(!)
Each of the two programs uses a different method. The FontLab method is
based on layout tables - GPOS, GSUB, GDEF (it somehow detects that the two
glyphs differ). The FontForge method is unclear and seems to be buggy. But
we should rather blame the font itself, as it is the primary cause of these
problems (= missing glyph names).

Jan
