Hi All,
We have extended the text-highlighting algorithm at the following link
to work with PDFBox 2.x.
Link: https://gist.github.com/joelkuiper/331a399961941989fec8
It was originally written for PDFBox 1.8.x.
For some documents, it fails to highlight the given text. While debugging,
we found that it could not match the text on that page because of the
characters "ffi" in the search string.
The complete search string is:
"efficiently. Fast trigger mechanisms are needed to curate events of
interest online and\nsensitive statistical tools are needed to extract
as much"
The string above is definitely present in the PDF file, yet the match
fails. However, if we remove the first characters from the search string,
the remaining substring is highlighted.
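One possible cause, which we have not confirmed: the PDF may store "ffi"
as the single ligature character U+FB03, so the page text never contains
the three separate letters. The sketch below is plain Java with no PDFBox
calls; the class name LigatureNormalizer and the sample strings are made
up for illustration. It shows how NFKC normalization with
java.text.Normalizer would make such text match:

```java
import java.text.Normalizer;

class LigatureNormalizer {

    // NFKC compatibility normalization expands ligature code points such as
    // U+FB03 (LATIN SMALL LIGATURE FFI) into the plain letters "ffi".
    static String normalize(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Hypothetical page text: ASSUMES the PDF encodes "ffi" as U+FB03.
        String pageText = "e\uFB03ciently. Fast trigger mechanisms are needed";
        String searchText = "efficiently. Fast trigger";

        // Raw comparison: the ligature is one character, so there is no match.
        System.out.println(pageText.contains(searchText));

        // Comparison after normalizing both sides to the same form.
        System.out.println(normalize(pageText).contains(normalize(searchText)));
    }
}
```

Note that normalization changes the string length (one character becomes
three), so offsets in the normalized text would have to be mapped back to
the original characters before computing highlight positions.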
Thanks in advance.
- CM