On Monday 12 May 2008, Ross Moore wrote:
> Hi Albert,
>
> On 04/05/2008, at 9:07 AM, Albert Astals Cid wrote:
> > On Sunday 04 May 2008, Albert Astals Cid wrote:
> >> As Ross's PDF showed, we have a maximum limit of 8 chars for the
> >> representation of a glyph, so even if there's a char that identifies
> >> itself as \rightarrow, pdftotext only gives \rightar
> >>
> >> I'm fixing this hardcoded limit with the attached patch. As side
> >> effects we get a speed boost, as I stop copying things when calling
> >> CharCodeToUnicode::mapToUnicode, and lower memory usage, as each
> >> CharCodeToUnicodeString now uses only the exact memory needed, not a
> >> fixed 8 like before.
> >>
> >> I'm attaching the patch for further review. If no one disagrees
> >> I'll commit on Sunday 11.
>
> No disagreement from me.
> I've applied the patch, and the earlier ones related to Annotations,
> etc.
>
> All the utils/pdfto* work much better (no Bus Error) with my
> example PDFs, except for pdfimages (which generates image files
> of size 8 bytes!)

That wasn't working before either; please open a bug about it on
bugs.freedesktop.org so we can work on it in the future.

> Thanks very much for your work on this.
>
> However, there are still some problems with the actual text strings
> extracted using pdftotext.

You mean the problems with "Introduction A flat stable plane (��, ℒ)" etc.?
Adobe Reader can't get text from there either, so I'm not sure it's
completely our fault.

Albert
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
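
For readers following the thread, the 8-char limit described in the quoted patch note can be illustrated with a small sketch. This is not the poppler code and not the patch itself: `FixedMapping`, `ExactMapping`, `makeFixed`, `makeExact`, `lookup` and `toAscii` are hypothetical names made up for the illustration, and the real CharCodeToUnicode internals may look quite different. It only shows why a fixed 8-slot buffer turns \rightarrow into \rightar, and how storing the exact length and handing back a pointer avoids both the truncation and the copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Unicode = std::uint32_t;

// Hypothetical "before" layout: every mapping reserves 8 slots,
// whether the glyph needs 1 code point or 11.
constexpr std::size_t kMaxLen = 8;
struct FixedMapping {
    Unicode u[kMaxLen];
    std::size_t len;
};

// Hypothetical "after" layout: each mapping owns exactly what it needs.
struct ExactMapping {
    std::vector<Unicode> u;
};

// Anything past the 8th code point is silently dropped.
FixedMapping makeFixed(const std::u32string &s) {
    FixedMapping m{};
    m.len = s.size() < kMaxLen ? s.size() : kMaxLen;
    for (std::size_t i = 0; i < m.len; ++i) {
        m.u[i] = static_cast<Unicode>(s[i]);
    }
    return m;
}

// Stores the whole sequence, however long it is.
ExactMapping makeExact(const std::u32string &s) {
    return ExactMapping{std::vector<Unicode>(s.begin(), s.end())};
}

// Returns the length and points *out at the stored data,
// instead of copying into a caller-supplied buffer.
std::size_t lookup(const ExactMapping &m, const Unicode **out) {
    *out = m.u.data();
    return m.u.size();
}

// Helper for printing; assumes every code point is plain ASCII.
std::string toAscii(const Unicode *u, std::size_t n) {
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<char>(u[i]));
    }
    return out;
}
```

With these definitions, passing the 11-character name \rightarrow through `makeFixed` keeps only the first 8 characters, while `makeExact` followed by `lookup` returns all 11 without copying them.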
