On Monday, 20 August 2007 at 11:05 -0700, Carl Worth wrote:
> On Sun, 19 Aug 2007 22:46:16 +0200, Laurent Aguerreche wrote:
> > But the real problem is that it is impossible to recognize:
> > - "ﬁ" as "fi" too
> > - "ﬀ" as "ff" too
> > Would it be possible to add a new parameter to pdftotext to make it
> > ignore ligatures but still export in UTF-8?
> 
> It's quite preferable to have the ligatures in your PDF file.

When it is correctly rendered it is great!

> The bug to fix is that poppler should expand the ligatures to their
> normalized forms when extracting the text.
> 
> That bug was first reported here:
> 
>       Text extraction should expand ligatures to their normal form
>       https://bugs.freedesktop.org/show_bug.cgi?id=7002

Ok.

So from Tracker's point of view, it means that it won't have to convert
a "ﬀ" ligature character in a user's search input to "ff"; users will
only have to input "ff".
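
To make the conversion concrete, here is a minimal sketch of ligature
expansion. It assumes Unicode compatibility normalization (NFKC) is an
acceptable stand-in for "normal form"; the helper name expand_ligatures
is hypothetical, and this is not how Poppler or Tracker implement it.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Expand compatibility characters (e.g. U+FB00, U+FB01) to plain letters.

    Assumption: NFKC is good enough for search indexing. It also rewrites
    other compatibility characters, not only ligatures.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FB03 is the single-character "ffi" ligature.
    print(expand_ligatures("o\ufb03ce"))
```

Applying the same function to both the extracted text and the search
query would let either form match, at the cost of NFKC also folding
characters that are not ligatures.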

But on the Poppler side, is bug #7002 close to being fixed?  ;-)


Laurent.

> -Carl

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler