I had the latest revision. When I compiled everything from the command
line and started PDFReader from there, everything looked fine. Because
of bad experiences with the Eclipse Maven plug-ins, I set up the PDFBox
project in Eclipse by hand, and in that setup the characters are drawn
on top of each other. I don't know yet where the difference lies.

While going through this experiment, I noticed that it's currently not
that easy to compile PDFBox and just run PDFReader: you first have to
write a batch script that sets up the right classpath. The instructions
on [1] are also incorrect, as the PDFBox JAR doesn't have a Class-Path
manifest entry (which is actually a good thing). I guess we could add
Ant targets to run the various command-line tools, as Batik does. That
would make it easier for people to evaluate PDFBox quickly. Maybe I'll
have time to look into this at some point (no promises just yet).
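
To illustrate what such a launcher (batch script or Ant target) has to
do, here is a minimal sketch in Python. It only assembles the classpath
and the java command line. The directory names, the main class name and
the helper names build_classpath and build_command are assumptions made
for illustration, not something taken from the PDFBox build.

```python
import os
from pathlib import Path


def build_classpath(classes_dir, lib_dir):
    """Join the compiled-classes directory and every JAR found in lib_dir."""
    entries = [str(Path(classes_dir))]
    lib = Path(lib_dir)
    if lib.is_dir():
        # Sort for a stable, reproducible classpath order.
        entries += sorted(str(jar) for jar in lib.glob("*.jar"))
    # os.pathsep is ';' on Windows and ':' elsewhere, as the JVM expects.
    return os.pathsep.join(entries)


def build_command(main_class, classes_dir, lib_dir, args=()):
    """Return the argument list for launching main_class with that classpath."""
    classpath = build_classpath(classes_dir, lib_dir)
    return ["java", "-cp", classpath, main_class, *args]
```

Passing the result of build_command("org.apache.pdfbox.PDFReader",
"target/classes", "lib", ["some.pdf"]) to subprocess.call would then
start the viewer, assuming that class name and layout match the
checkout. An Ant target would do the same thing with a classpath
definition and a forked java task.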

[1] http://pdfbox.apache.org/commandlineutilities/PDFReader.html

On 01.10.2010 17:01:34 Andreas Lehmkühler (JIRA) wrote:
> What version are you using? The latest trunk version (1003396) includes
> a fix for the extraction/rendering of text and one of the key issues
> was the handling of the TJ operator. See PDFBOX-828 for further details. 
> After applying your proposed patch to the latest trunk everything seems
> to be fine. I can't see any problem with the TJ operator. I'm attaching
> the result of PDFToImage.




Jeremias Maerki
