Hi Aaron,

You’ll probably have to track down the change that caused your code to stop working. Are you working against the 2.0 trunk? There have been a number of recent changes that affect graphics state and text extraction.
If you are working against the trunk, try checking out the latest version and setting a conditional breakpoint in processTextPosition where you expect the red colour, then see whether it gets hit. If it doesn’t, it could be a new bug in PDFBox or some internal quirk of how you’re detecting red, in which case you might want to share the relevant line(s) of code.

Cheers

-- John

On 25 Jul 2014, at 13:15, -A <[email protected]> wrote:

> Hi again, everyone-
>
> Finishing up this program I am working on and heading back to the testing
> phase - and suddenly my program is not detecting red text within PDFs. The
> old method was just to override the TextStripper class and implement a
> containsRed method that basically loops through every page and processes
> the stream. I overrode the processTextPosition method to check for red
> stroking colors at the given position.
>
> This was working. I also had to use a plain TextStripper class, as my
> extended version for some reason would error out getting all of the text
> from the file. Just wanted to give some background: in the PDF class
> that I created I am using two TextStrippers (I thought they might be
> conflicting). One to get all of the text, the other to see if there is red
> within the text.
>
> I am trying to debug this, but I have stepped through the entire file's text
> positions to some actual red text - and it just shows up in the IDE as
> System Grey, I believe (or some variant of that).
>
> It is perfectly plausible that I changed something inadvertently - but by
> chance would any of you have any clue as to why it may not be seeing the
> red text now?
>
>
> Thank you for your time!
>
> Sincerely,
> Aaron
>
>
> P.S. If John Hewson ends up responding to this, feel free to write me
> directly if it is more convenient.
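
To make "how you’re detecting red" concrete, here is a minimal sketch of a tolerance-based red check that could also serve as the breakpoint condition. RedCheck and isRed are hypothetical names, not part of PDFBox, and the sketch assumes you have already obtained RGB components in the range 0.0 to 1.0 from the graphics state; how you get them depends on the PDFBox version you build against, so that part is deliberately left out.

```java
// Hypothetical helper, not part of PDFBox: decides whether an RGB
// triple is close enough to pure red (1, 0, 0) to count as "red".
final class RedCheck {
    private RedCheck() {}

    /**
     * @param r         red component, expected in 0.0..1.0
     * @param g         green component, expected in 0.0..1.0
     * @param b         blue component, expected in 0.0..1.0
     * @param tolerance how far each component may deviate from pure red
     */
    static boolean isRed(float r, float g, float b, float tolerance) {
        return r >= 1.0f - tolerance && g <= tolerance && b <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(isRed(1.0f, 0.0f, 0.0f, 0.1f)); // pure red
        System.out.println(isRed(0.5f, 0.5f, 0.5f, 0.1f)); // mid grey
    }
}
```

Comparing with a tolerance rather than testing for an exact colour value means a slightly off-red, or a red that arrives through a different colour space conversion, is still caught. If the check never fires at a position you know is red, that points at the colour being read rather than at the comparison.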
