Not able to read/highlight the text correctly using PDFBox 2.0.13

CM Reddy Mon, 29 Jul 2019 19:13:18 -0700

Hi All,

We extended the algorithm at the following link so that it highlights text with PDFBox 2.x.

Link: https://gist.github.com/joelkuiper/331a399961941989fec8
It was originally written for PDFBox 1.8.x.

For some documents, it fails to highlight the given text. On debugging, we found that it could not match the text on that page because of the characters "ffi" in the search string.


Complete search string is:

"efficiently. Fast trigger mechanisms are needed to curate events of interest online and\nsensitive statistical tools are needed to extract as much"

The above string is definitely present in the PDF file. However, we can highlight the substring that remains after removing the first characters from the search string.
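
A minimal sketch of one possible cause, assuming (not confirmed for this document) that the extracted page text contains the single ligature character U+FB03 where the search string has the three letters "ffi". It uses only the JDK's java.text.Normalizer, not PDFBox; the class and method names (LigatureMatch, fold, containsFolded) are hypothetical and do not come from the gist.

```java
import java.text.Normalizer;

class LigatureMatch {
    // NFKC normalization replaces compatibility characters with their
    // plain equivalents, e.g. U+FB03 (ffi ligature) becomes "ffi".
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    // Compares page text and query after folding both sides.
    static boolean containsFolded(String pageText, String query) {
        return fold(pageText).contains(fold(query));
    }

    public static void main(String[] args) {
        // Assumed shape of the extracted text: "effi" stored as 'e' + U+FB03.
        String pageText = "e\uFB03ciently. Fast trigger mechanisms are needed";
        String query = "efficiently. Fast trigger";

        System.out.println(pageText.contains(query));        // prints "false"
        System.out.println(containsFolded(pageText, query)); // prints "true"
    }
}
```

If this is what is happening, it would also be consistent with the substring matching once the leading characters are dropped. Note that folding changes the string length (one character becomes three), so offsets found in the folded text would not map one-to-one onto the original character positions used for highlighting.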


Thanks in advance.
- CM
