Hi,

On 04.11.2011 20:34, Zhang, Lisheng wrote:
Hi Mike,

Thanks very much. I tested, and the result is the same. From the source
code, it seems that the suppressDuplicateOverlappingText parameter has
no effect if I call PDFTextStripper.getText(..) directly. I will
check further to see whether I can use the processEncodedText(..) method.

Which version of PDFBox did you use (Tika has not used PDFBox 1.5 yet)?
According to [1], Tika 0.10 uses PDFBox 1.6, which includes some performance-related improvements.

Best regards, Lisheng
<SNIP>


BR
Andreas Lehmkühler
[1] http://www.apache.org/dist/tika/CHANGES-0.10.txt
