Re: [ft-devel] GF's cmap fails
> Well, the glyphs aren't `loaded'; you are rather collecting the file
> offsets in an array.

Yes.

> > It loads glyphs in the increasing order of their character code.
>
> Not necessarily: The order of `char_loc' commands could be arbitrary,
> say,
>
>   Character 3: dx 3801088 (58), width 728179 (57.65427), loc 11007
>   Character 4: dx 3604480 (55), width 699053 (55.34819), loc 11226
>   Character 0: dx 3407872 (52), width 655362 (51.88892), loc 10368
>   Character 1: dx 4521984 (69), width 873816 (69.18521), loc 10521
>   Character 2: dx 4259840 (65), width 815562 (64.57289), loc 10736
>   Character 5: dx 4063232 (62), width 786434 (62.2), loc 11365

Yes.

> > And as the `indices table' is formed in such a way, it will remain
> > unsorted,
>
> You have to do sorting again!  I wrote in my last e-mail that you need
> *two* *separate* arrays so that you can replace the `linear searching'
> with accessing a pre-sorted array instead.

Oh yes!  I get it.  Thanks.

Parth

___
Freetype-devel mailing list
Freetype-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/freetype-devel
Re: [ft-devel] GF's cmap fails
>> GF provides a natural order of glyphs within the font file, we call
>> this `glyph indices'.  Each glyph index is associated with a file
>> offset.  For each glyph, GF assigns a character code to it.  We
>> thus have immediately a mapping from glyph indices to character
>> codes.
>
> My changes to `cmap' were done to load glyphs in the order they
> appear in the font file,

This is OK.

> as the `gf driver' uses the offsets taken from the `char_loc'
> commands in the `gf' file to get the metrics,

Yes.

> now when the driver goes in the loop to get all the `char_loc'
> values ( the chardx and chardy values ), it loads the glyphs in the
> order they appear in the `char_loc' command,

Well, the glyphs aren't `loaded'; you are rather collecting the file
offsets in an array.

> which is different from the order they appear in the font file.

Yes.

> It loads glyphs in the increasing order of their character code.

Not necessarily: The order of `char_loc' commands could be arbitrary,
say,

  Character 3: dx 3801088 (58), width 728179 (57.65427), loc 11007
  Character 4: dx 3604480 (55), width 699053 (55.34819), loc 11226
  Character 0: dx 3407872 (52), width 655362 (51.88892), loc 10368
  Character 1: dx 4521984 (69), width 873816 (69.18521), loc 10521
  Character 2: dx 4259840 (65), width 815562 (64.57289), loc 10736
  Character 5: dx 4063232 (62), width 786434 (62.2), loc 11365

> Now, what I did is, I took the offsets and sorted them and then in
> the same order loaded the `glyph indices' thus fulfilling the goal.

Yes.

> And as the `indices table' is formed in such a way, it will remain
> unsorted,

You have to do sorting again!  I wrote in my last e-mail that you need
*two* *separate* arrays so that you can replace the `linear searching'
with accessing a pre-sorted array instead.

>> [For simplicity, I'm ignoring the still missing artificial glyphs
>> that must be inserted at glyph indices 0 and 1, as mentioned in a
>> previous e-mail.]
>
> This is in the pipeline and will be added soon :-)

Very good.


    Werner
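The point about arbitrarily ordered `char_loc' commands can be
illustrated with a small, self-contained C sketch.  It is not taken
from the GF driver; the type `CharLoc' and the function
`sort_by_file_order' are hypothetical names chosen for illustration.
The sketch sorts the collected entries by file offset, so that the
position of an entry in the sorted array can serve as its glyph index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one `char_loc' command: the character */
/* code and the file offset (`loc') of the glyph it points to.   */
typedef struct  CharLoc_
{
  unsigned short  code;
  unsigned long   loc;

} CharLoc;

/* qsort comparator: ascending file offset. */
static int
compare_loc( const void*  a,
             const void*  b )
{
  unsigned long  x = ( (const CharLoc*)a )->loc;
  unsigned long  y = ( (const CharLoc*)b )->loc;

  return ( x > y ) - ( x < y );
}

/* Reorder the entries so that entry i describes the i-th glyph */
/* in the font file, regardless of the `char_loc' order.        */
static void
sort_by_file_order( CharLoc*  locs,
                    size_t    n )
{
  assert( locs || n == 0 );

  qsort( locs, n, sizeof ( CharLoc ), compare_loc );
}
```

Applied to the six sample entries quoted above, the sorted array is in
ascending `loc' order; mapping character codes back to positions in
this array is a separate step that needs its own table.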
Re: [ft-devel] GF's cmap fails
> > After some debugging, I found out that, I was using binary search on
> > the encodings array in the `char_index' function, although it wasn't
> > sorted (so foolish of me :( ).  Now, fixed!  Thanks.
>
> In the commit message, you write
>
>   Use `linear search' instead of `binary search' in the encoding table
>   as it will always be unsorted.
>
> This really baffles me.  I think you have an error in reasoning
> somewhere, still not really understanding how a cmap works.  I'll
> retry.
>
> GF provides a natural order of glyphs within the font file, we call
> this `glyph indices'.  Each glyph index is associated with a file
> offset.  For each glyph, GF assigns a character code to it.  We thus
> have immediately a mapping from glyph indices to character codes.

Really sorry for the late reply, I have been involved in some personal
issues.

My changes to `cmap' were done to load glyphs in the order they appear
in the font file, as the `gf driver' uses the offsets taken from the
`char_loc' commands in the `gf' file to get the metrics, now when the
driver goes in the loop to get all the `char_loc' values ( the chardx
and chardy values ), it loads the glyphs in the order they appear in
the `char_loc' command, which is different from the order they appear
in the font file.  It loads glyphs in the increasing order of their
character code.

Now, what I did is, I took the offsets and sorted them and then in the
same order loaded the `glyph indices' thus fulfilling the goal.

And as the `indices table' is formed in such a way, it will remain
unsorted, i.e.

  encoding[0]  maps to char code 65 and glyph `A'
  encoding[1]  maps to char code 66 and glyph `B'
  encoding[2]  maps to char code 67 and glyph `C'
  ...
  encoding[57] maps to char code 0 and glyph `Γ'
  encoding[58] maps to char code 1 and glyph `Δ'
  encoding[59] maps to char code 2 and glyph `Θ'

> [For simplicity, I'm ignoring the still missing artificial glyphs that
> must be inserted at glyph indices 0 and 1, as mentioned in a previous
> e-mail.]

This is in the pipeline and will be added soon :-)

> Taking `cmr10.600gf' as an example, this yields the following.
>
>   glyph index    file offset    character code
>   ---------------------------------------------
>        0              34           65 (A)
>        1             247           66 (B)
>       ...            ...            ...
>       25            5813           90 (Z)
>       26            6004           97 (a)
>       ...            ...            ...
>       51           10239          122 (z)
>       52           10368            0 (Γ)
>       53           10521            1 (Δ)
>       54           10736            2 (Θ)
>       ...            ...            ...
>
> What's needed for a cmap, however, is a mapping from character codes
> to glyph indices.  In other words, you simply have to reverse the
> above mapping.
>
>   character code    file offset    glyph index
>   ---------------------------------------------
>       0 (Γ)           10368            52
>       1 (Δ)           10521            53
>       2 (Θ)           10736            54
>       ...              ...            ...
>      65 (A)              34             0
>      66 (B)             247             1
>       ...              ...            ...
>      90 (Z)            5813            25
>       ...              ...            ...
>      97 (a)            6004            26
>       ...              ...            ...
>
> And this mapping is of course sorted!
>
> You should thus have two arrays.
>
> 1. A table of file offsets, ordered by glyph index:
>
>      file_offsets[num_glyphs] = { 34, 247, ... };
>
>    This array goes into `GF_Face'.
>
> 2. A table of glyph indices, ordered by character code:
>
>      glyph_indices[num_chars] = { 52, 53, 54, ... };
>
> This eventually leads to
>
>   typedef struct  GF_CMapRec_
>   {
>     FT_CMapRec  root;
>     FT_UShort   num_chars;
>     FT_UShort*  glyph_indices;
>
>   } GF_CMapRec, *GF_CMap;
>
> (or something similar), and you do a binary search in `glyph_indices'.

With this scheme, character codes are looked up in the same order as
before, i.e., in the order of their charcodes as given by the
`char_loc' commands.  I think I will have to change the `gf_load_font'
function according to the cmap scheme above.

Thank you

Parth
Re: [ft-devel] GF's cmap fails
> After some debugging, I found out that, I was using binary search on
> the encodings array in the `char_index' function, although it wasn't
> sorted (so foolish of me :( ).  Now, fixed!  Thanks.

In the commit message, you write

  Use `linear search' instead of `binary search' in the encoding table
  as it will always be unsorted.

This really baffles me.  I think you have an error in reasoning
somewhere, still not really understanding how a cmap works.  I'll
retry.

GF provides a natural order of glyphs within the font file, we call
this `glyph indices'.  Each glyph index is associated with a file
offset.  For each glyph, GF assigns a character code to it.  We thus
have immediately a mapping from glyph indices to character codes.

[For simplicity, I'm ignoring the still missing artificial glyphs that
must be inserted at glyph indices 0 and 1, as mentioned in a previous
e-mail.]

Taking `cmr10.600gf' as an example, this yields the following.

  glyph index    file offset    character code
  ---------------------------------------------
       0              34           65 (A)
       1             247           66 (B)
      ...            ...            ...
      25            5813           90 (Z)
      26            6004           97 (a)
      ...            ...            ...
      51           10239          122 (z)
      52           10368            0 (Γ)
      53           10521            1 (Δ)
      54           10736            2 (Θ)
      ...            ...            ...

What's needed for a cmap, however, is a mapping from character codes
to glyph indices.  In other words, you simply have to reverse the
above mapping.

  character code    file offset    glyph index
  ---------------------------------------------
      0 (Γ)           10368            52
      1 (Δ)           10521            53
      2 (Θ)           10736            54
      ...              ...            ...
     65 (A)              34             0
     66 (B)             247             1
      ...              ...            ...
     90 (Z)            5813            25
      ...              ...            ...
     97 (a)            6004            26
      ...              ...            ...

And this mapping is of course sorted!

You should thus have two arrays.

1. A table of file offsets, ordered by glyph index:

     file_offsets[num_glyphs] = { 34, 247, ... };

   This array goes into `GF_Face'.

2. A table of glyph indices, ordered by character code:

     glyph_indices[num_chars] = { 52, 53, 54, ... };

This eventually leads to

  typedef struct  GF_CMapRec_
  {
    FT_CMapRec  root;
    FT_UShort   num_chars;
    FT_UShort*  glyph_indices;

  } GF_CMapRec, *GF_CMap;

(or something similar), and you do a binary search in `glyph_indices'.
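The reversal of the mapping can be sketched in plain C as follows.
This is only a sketch under stated assumptions, not the driver's code:
`CMapEntry', `build_cmap', and `lookup_glyph' are hypothetical names,
standard C types stand in for the FreeType ones, and the table stores
(character code, glyph index) pairs so that a binary search by
character code also works if the codes are not contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cmap entry: one (character code, glyph index) pair. */
typedef struct  CMapEntry_
{
  unsigned short  code;
  unsigned short  glyph_index;

} CMapEntry;

/* qsort comparator: ascending character code. */
static int
compare_codes( const void*  a,
               const void*  b )
{
  unsigned short  x = ( (const CMapEntry*)a )->code;
  unsigned short  y = ( (const CMapEntry*)b )->code;

  return ( x > y ) - ( x < y );
}

/* codes[i] is the character code of glyph index i (file order). */
/* On return, cmap[0..n-1] is sorted by character code.          */
static void
build_cmap( const unsigned short*  codes,
            size_t                 n,
            CMapEntry*             cmap )
{
  size_t  i;

  assert( codes && cmap );

  for ( i = 0; i < n; i++ )
  {
    cmap[i].code        = codes[i];
    cmap[i].glyph_index = (unsigned short)i;
  }

  qsort( cmap, n, sizeof ( CMapEntry ), compare_codes );
}

/* Binary search by character code.  Returns 1 and stores the */
/* glyph index on success, 0 if the code has no glyph.        */
static int
lookup_glyph( const CMapEntry*  cmap,
              size_t            n,
              unsigned short    code,
              unsigned short*   glyph_index )
{
  size_t  lo = 0, hi = n;

  while ( lo < hi )
  {
    size_t  mid = lo + ( hi - lo ) / 2;

    if ( cmap[mid].code == code )
    {
      *glyph_index = cmap[mid].glyph_index;
      return 1;
    }

    if ( cmap[mid].code < code )
      lo = mid + 1;
    else
      hi = mid;
  }

  return 0;
}
```

The sorting happens once, when the font is opened; every later lookup
is then a binary search instead of a linear scan over an unsorted
table.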
> Although there is one more error, i.e even after displaying all the
> glyphs in the font file `ftview' is displaying extra `AAA...'s, any
> reason why this is happening?

This is expected.  `FT_Get_Char_Index' returns value 0 for character
codes that don't have an associated glyph, and currently glyph `A' has
index 0.


    Werner
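The effect described in the message above can be reproduced with a
short C sketch.  It is a simplified model, not the GF driver:
`char_index' is a hypothetical function, the linear scan is used only
to keep the example short, and the `reserved' parameter stands for
however many artificial glyphs are placed in front of the real ones
(the thread mentions glyph indices 0 and 1).  The only convention
taken from FreeType is that glyph index 0 means `no glyph for this
character code'.

```c
#include <assert.h>
#include <stddef.h>

/* codes[i] is the character code of the i-th real glyph in file  */
/* order.  `reserved' is the number of artificial glyphs inserted */
/* before the real ones (hypothetical parameter).                 */
static unsigned int
char_index( const unsigned short*  codes,
            size_t                 num_glyphs,
            unsigned int           reserved,
            unsigned short         code )
{
  size_t  i;

  assert( codes || num_glyphs == 0 );

  for ( i = 0; i < num_glyphs; i++ )
    if ( codes[i] == code )
      return (unsigned int)i + reserved;

  /* Same convention as `FT_Get_Char_Index': 0 means `no glyph'. */
  return 0;
}
```

With `reserved' set to 0, a character code without a glyph and the
code of the first glyph in the file both yield index 0, which is why
unmapped codes are rendered as `A'.  With reserved leading indices,
index 0 no longer refers to a real glyph.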
Re: [ft-devel] GF's cmap fails
> >> ./ftview -e "" 20 cmr10.600gf
> >>
> >> only shows `A' glyphs.  [...]
> >
> > Ok.  Currently the GF's cmap is like, the first glyph in the file is
> > indexed to 0, and so on.  So in cmr10.600gf, `ABCD...' appear first
> > so they are now indexed from `0,1,2..'
>
> This is correct.
>
> > what happened previously is the glyphs were indexed according to
> > their charcode values extracted from the `char_loc' command and this
> > was the order `ΓΔΘΛΞΠ...'.  I have properly set the encoding values,
>
> Obviously not.
>
> > [...]  Other options with `-e' except `-e "" ' are giving proper
> > output.
>
> Not at all.  They always show the font without any cmap applied,
> because there isn't a cmap with the tag you are specifying at the
> command line.  As soon as you will have implemented a Unicode cmap,
> `-e unic' will work also.
>
> > Any specific reason why `-e "" ' is failing so that I can fix it?
>
> Set a break point at `FT_Get_Char_Index' and check the return value.
> For example, the function returns 0 if argument `charcode' is zero.
> This is wrong, of course, since it should return value 52 (which is
> the glyph index of glyph `Γ').  Character code 1 should be mapped to
> glyph index 53, code 2 to index 54, etc., etc.

After some debugging, I found out that, I was using binary search on
the encodings array in the `char_index' function, although it wasn't
sorted (so foolish of me :( ).  Now, fixed!  Thanks.

Although there is one more error, i.e even after displaying all the
glyphs in the font file `ftview' is displaying extra `AAA...'s, any
reason why this is happening?

Thank you

Parth
Re: [ft-devel] GF's cmap fails
>> ./ftview -e "" 20 cmr10.600gf
>>
>> only shows `A' glyphs.  [...]
>
> Ok.  Currently the GF's cmap is like, the first glyph in the file is
> indexed to 0, and so on.  So in cmr10.600gf, `ABCD...' appear first
> so they are now indexed from `0,1,2..'

This is correct.

> what happened previously is the glyphs were indexed according to
> their charcode values extracted from the `char_loc' command and this
> was the order `ΓΔΘΛΞΠ...'.  I have properly set the encoding values,

Obviously not.

> [...]  Other options with `-e' except `-e "" ' are giving proper
> output.

Not at all.  They always show the font without any cmap applied,
because there isn't a cmap with the tag you are specifying at the
command line.  As soon as you will have implemented a Unicode cmap,
`-e unic' will work also.

> Any specific reason why `-e "" ' is failing so that I can fix it?

Set a break point at `FT_Get_Char_Index' and check the return value.
For example, the function returns 0 if argument `charcode' is zero.
This is wrong, of course, since it should return value 52 (which is
the glyph index of glyph `Γ').  Character code 1 should be mapped to
glyph index 53, code 2 to index 54, etc., etc.


    Werner
Re: [ft-devel] GF's cmap fails
> [commit 401ce90 -> parthw-cleaned, origin/parthw-cleaned)]
>
> Parth,
>
>
> calling
>
>   ./ftview -e "" 20 cmr10.600gf
>
> only shows `A' glyphs.  This is incorrect.  It should rather start
> with `ΓΔΘΛΞΠ...' since `-e ""' invokes the font's internal cmap (that
> is, the only cmap that GF currently implements).

Ok.  Currently the GF's cmap is like, the first glyph in the file is
indexed to 0, and so on.  So in cmr10.600gf, `ABCD...' appear first so
they are now indexed from `0,1,2..' what happened previously is the
glyphs were indexed according to their charcode values extracted from
the `char_loc' command and this was the order `ΓΔΘΛΞΠ...'.  I have
properly set the encoding values, but I don't know why the `-e ""'
option is failing.  Other options with `-e' except `-e "" ' are giving
proper output.

Any specific reason why `-e "" ' is failing so that I can fix it?

Thank you

Parth
[ft-devel] GF's cmap fails
[commit 401ce90 -> parthw-cleaned, origin/parthw-cleaned)]

Parth,


calling

  ./ftview -e "" 20 cmr10.600gf

only shows `A' glyphs.  This is incorrect.  It should rather start
with `ΓΔΘΛΞΠ...' since `-e ""' invokes the font's internal cmap (that
is, the only cmap that GF currently implements).


    Werner