Re: Getting text strings as individual characters in some files

Tilman Hausherr Tue, 27 Oct 2015 11:46:06 -0700

On 27.10.2015 at 19:25, Joel Hirsh wrote:

I am doing text extraction with PDFTextStripper, overriding writeString to
get the individual strings.


I have some files that in 1.8 gave the strings that I would expect, but in
2.0 each character is coming to writeString as a separate string.

In one such file, the first page extracts as expected, but pages 2 and on
get the strings broken up into characters.

In another file, everything is broken up.

Is this considered a bug? Or is there any control over what might be
causing that?


The best would be to
1) verify that it still happens with the current snapshot
2) if yes, please open an issue.

This also applies to your follow-up message. We fixed several problems in
the last few days, but it is quite possible that there are still some, so
we need the file.
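
For step 1, a minimal sketch of one way to check which pages are affected
when re-testing against a snapshot. FragmentationCheck is a hypothetical
helper, not part of PDFBox. It assumes you call record() from your
overridden writeString and startPage() from an overridden startPage, and it
only counts strings; it does not touch the PDF itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of PDFBox. */
public class FragmentationCheck {
    // One entry per page: {total strings seen, single-character strings seen}.
    private final List<int[]> pages = new ArrayList<>();

    /** Call when a new page begins (assumed: from an overridden startPage). */
    public void startPage() {
        pages.add(new int[2]);
    }

    /** Call with each string handed to the overridden writeString. */
    public void record(String text) {
        if (pages.isEmpty()) {
            startPage(); // tolerate callers that never signal the first page
        }
        int[] counts = pages.get(pages.size() - 1);
        counts[0]++;
        // Count code points so a surrogate pair is treated as one character.
        if (text.codePointCount(0, text.length()) == 1) {
            counts[1]++;
        }
    }

    /**
     * Returns the 1-based numbers of pages that received more than one
     * string and where every string was a single character.
     */
    public List<Integer> fragmentedPages() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            int[] counts = pages.get(i);
            if (counts[0] > 1 && counts[0] == counts[1]) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```

Running the same file through 1.8 and the snapshot and comparing the
reported page numbers would give a concrete description to put in the issue.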


Tilman
