On Monday 20 August 2007 at 11:05 -0700, Carl Worth wrote:
> On Sun, 19 Aug 2007 22:46:16 +0200, Laurent Aguerreche wrote:
> > But the real problem is that it is impossible to recognize:
> > - "ﬁ" as "fi" too
> > - "ﬀ" as "ff" too
> > Would it be possible to add a new parameter to pdftotext to make it
> > ignore ligatures but still export in UTF-8?
>
> It's quite preferable to have the ligatures in your PDF file.

When it is correctly rendered it is great!

> The bug to fix is that poppler should expand the ligatures to their
> normalized forms when extracting the text.
>
> That bug was first reported here:
>
>   Text extraction should expand ligatures to their normal form
>   https://bugs.freedesktop.org/show_bug.cgi?id=7002

OK. So from Tracker's point of view, this means Tracker won't have to
convert a "ﬀ" character in a user's search input to "ff"; users will
only have to type "ff".

But on the Poppler side, is bug #7002 close to being fixed? ;-)

Laurent.

> -Carl
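For illustration only, here is a minimal sketch of what "expanding ligatures to their normalized forms" could look like on text that has already been extracted. It assumes Python's standard unicodedata module and Unicode NFKC normalization; it is not Poppler's or Tracker's code, and the helper name expand_ligatures is hypothetical.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Hypothetical helper: map compatibility characters such as the
    ligatures U+FB01 (fi) and U+FB00 (ff) to their plain-letter forms."""
    # NFKC applies compatibility decomposition, then canonical composition.
    return unicodedata.normalize("NFKC", text)


# Text as it might come out of a PDF extractor, with ligatures intact.
extracted = "di\ufb00erent \ufb01le"
print(expand_ligatures(extracted))  # different file
```

Note that NFKC is broader than ligature expansion: it also folds other compatibility characters (full-width forms, superscript digits, and so on), which may or may not be wanted in a search index.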
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
