Hi, I have a set of PDFs that I want to process, underlining specific words. I plan to draw each underline as a filled rectangle with a small height beneath the word or phrase.
These input PDFs come from different sources, so their fonts cannot be guaranteed. I use PDFTextStripper to get a series of text position objects. Each text position object gives me attributes such as the x-coordinate, y-coordinate, font, font type, font size, width, height and x-scale. For example, consider the following:

String[75.0,278.8 fs=10.0 xscale=1.0 height=7.0000005 space=5.830001 width=108.87001]Primary Diagnosis: elder

This shows some of the attributes of the text position object for the string "Primary Diagnosis: elder". Other attributes, such as the font and font size, are also available for this string.

The problem arises when I have to underline just the word "Diagnosis". For this I need the starting x-coordinate of "Diagnosis" and its width.

To sum up: for any PDF, whatever fonts it uses, how can I calculate the width of an arbitrary substring?
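To make the question concrete, here is a rough sketch of the calculation I have in mind. It assumes I can somehow get one x-coordinate and one width per character of the string (for instance from the individual TextPosition objects that PDFTextStripper passes to writeString, via getXDirAdj() and getWidthDirAdj()), and that each character maps to exactly one such entry, which may not hold for ligatures. SubstringBounds and locate are hypothetical names of my own, not part of PDFBox.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox. It assumes the caller already has
// one x-coordinate and one width per character of the text.
final class SubstringBounds {

    // Returns {startX, width} for the first occurrence of word in text,
    // or null if the word is empty or does not occur.
    static float[] locate(String text, float[] glyphX, float[] glyphWidth, String word) {
        if (text.length() != glyphX.length || text.length() != glyphWidth.length) {
            throw new IllegalArgumentException("expected one x and one width per character");
        }
        int start = text.indexOf(word);
        if (word.isEmpty() || start < 0) {
            return null;
        }
        int last = start + word.length() - 1;
        float startX = glyphX[start];                 // left edge of the first character
        float endX = glyphX[last] + glyphWidth[last]; // right edge of the last character
        return new float[] { startX, endX - startX };
    }

    public static void main(String[] args) {
        // Made-up positions for illustration only, not taken from a real PDF.
        String text = "ab cd";
        float[] x = { 10f, 15f, 20f, 23f, 28f };
        float[] w = { 5f, 5f, 3f, 5f, 6f };
        System.out.println(Arrays.toString(locate(text, x, w, "cd"))); // prints [23.0, 11.0]
    }
}
```

Measuring from the left edge of the first character to the right edge of the last one, rather than summing the character widths, should keep any character spacing or kerning inside the word in the result. What I am unsure about is whether the per-character values are reliable across arbitrary fonts.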
