There is a bunch of code that loads the Unicode mapping; where there is
none, or there are holes, it guesses the mapping, and where that fails it
maps to the "private" Unicode space.  This code is in the Gfx8BitFont
constructor.  I have recently made a number of modifications (which I can
share on request) to ensure that each character code is mapped uniquely,
and is optionally mapped only to a single Unicode character (i.e. the fl
ligature is not mapped to f and l, but to the single code point U+FB02).
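
To make the idea concrete, here is a rough standalone sketch of that
fallback order (loaded mapping, then a guess, then the private-use range)
with the uniqueness check.  This is not the actual Gfx8BitFont code or my
patch: buildUniqueMap, guessFromGlyphName, and the tiny glyph-name table
are names I made up for illustration, and I am assuming the guess is made
from the glyph name.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-in for a glyph-name lookup; a real table is far larger.
static std::optional<char32_t> guessFromGlyphName(const std::string &name) {
    static const std::map<std::string, char32_t> known = {
        {"A", U'A'}, {"f", U'f'}, {"l", U'l'},
        {"fl", U'\uFB02'},  // keep the ligature as one code point, not "f" + "l"
    };
    auto it = known.find(name);
    if (it == known.end()) return std::nullopt;
    return it->second;
}

// Build a one-to-one map from 8-bit character codes to Unicode.
// `loaded` holds mappings read from the font (holes are allowed);
// `glyphNames` holds the glyph name for each code (may be empty).
// Codes are processed in order, so the first code to claim a Unicode
// value keeps it and later duplicates are moved to the private-use range.
std::array<char32_t, 256> buildUniqueMap(
        const std::map<uint8_t, char32_t> &loaded,
        const std::array<std::string, 256> &glyphNames) {
    std::array<char32_t, 256> out{};
    std::set<char32_t> used;
    char32_t nextPrivate = 0xE000;  // start of the BMP Private Use Area

    // Return the next private-use code point that is not taken yet.
    auto takePrivate = [&]() {
        while (used.count(nextPrivate)) ++nextPrivate;
        return nextPrivate;
    };

    for (int code = 0; code < 256; ++code) {
        std::optional<char32_t> u;
        auto it = loaded.find(static_cast<uint8_t>(code));
        if (it != loaded.end()) {
            u = it->second;                          // 1. mapping from the font
        } else {
            u = guessFromGlyphName(glyphNames[code]);  // 2. guess
        }
        if (!u || used.count(*u)) {
            u = takePrivate();                       // 3. private-use fallback
        }
        used.insert(*u);
        out[code] = *u;
    }
    return out;
}
```

With at most 256 codes the private-use range (U+E000 to U+F8FF) cannot run
out, so every code ends up with a distinct value.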

If the Unicode mapping functionality is useful elsewhere in the codebase,
then we may want to factor it out of the Gfx8BitFont constructor, or
perhaps it would be good enough to construct a font and use the resulting
Unicode mapping.  Maybe I'm missing the whole point, but I just wanted to
let you know about this.

--josh

On 11/14/11 2:37 PM, "Max Filippov" <[email protected]> wrote:

>> There are changes in that code added in xpdf 3.02, I'm not sure they
>> fix your issue though, but you might want to take a look, see:
>> 
>> 
>>http://cgit.freedesktop.org/~carlosgc/poppler-xpdf3merge/tree/ALL_DIFF#n6462
>
>Thanks for the answer.
>I've taken a look at the patch and 17 lines below the place you've
>spotted there's a comment:
>
>  //~ this currently drops all non-Latin1 characters
>
>which is 100% accurate, non-latin-1 symbols are replaced with question
>marks.
>So, I'm still searching for guidance before I possibly reinvent a wheel (:
>
>Thanks.
>-- Max
>_______________________________________________
>poppler mailing list
>[email protected]
>http://lists.freedesktop.org/mailman/listinfo/poppler
>
