I don’t see any obvious problems with your class, but you might want to switch out
the following:
catch (IOException ioe)
{
    ioe.printStackTrace();
}
and replace it with:
catch (IOException e)
{
    throw new RuntimeException(e);
}
so that at least you’re not silently consuming exceptions - just in case.
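To illustrate the difference, here is a minimal stand-alone sketch that uses only the JDK. The class name RethrowSketch, the helper firstLine and the deliberately broken Reader are all made up for the example and have nothing to do with your stripper or with PDFBox; the point is only that the wrapped exception reaches the caller with the original IOException attached as its cause.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, not part of PDFBox or of the attached stripper.
class RethrowSketch {

    // Reads the first line of the source. Any IOException is rethrown as an
    // unchecked RuntimeException so the failure cannot go unnoticed.
    static String firstLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.readLine();
        } catch (IOException e) {
            // Keep the original exception as the cause for the stack trace.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Normal case: the first line is returned.
        System.out.println(firstLine(new StringReader("hello\nworld")));

        // Failure case: a Reader that always fails, to simulate an I/O error.
        Reader broken = new Reader() {
            @Override
            public int read(char[] buf, int off, int len) throws IOException {
                throw new IOException("simulated read failure");
            }

            @Override
            public void close() {
                // Nothing to release in this simulated reader.
            }
        };

        try {
            firstLine(broken);
        } catch (RuntimeException e) {
            // The original IOException is available through getCause().
            System.out.println("caught: " + e.getCause().getMessage());
        }
    }
}
```

With printStackTrace() the caller would carry on as if nothing had happened; with the rethrow, the failure stops the run at the point where it occurred.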
I presume you’re using the same test PDF file, etc.?
-- John
On 25 Jul 2014, at 13:48, Aaron Hartman <[email protected]> wrote:
> John,
> I’m on it (tracking it down). I don’t think I made any changes to anything
> related to what PDFBox was doing, but of course I could be wrong.
>
> My first instinct was to download the new 1.8.6 (was using 1.8.5) but I get
> the same result. I am currently looking at the other extended TextStripper
> classes for some insight - but given that this stripper was working
> previously I’m not sure what outside of this class could be affecting its
> result.
>
> I have attached my extended class in a text document. If there is anything
> glaring within there please let me know - I am going to start tracing the
> usage paths to that class.
>
>
> Thanks!
>
> -Aaron
> <Stripper.txt>
>
> On Jul 25, 2014, at 2:33 PM, John Hewson <[email protected]> wrote:
>
>> Hi Aaron
>>
>> You’re probably going to have to track down the change that caused your
>> code to stop functioning. Are you working against the 2.0 trunk? There have
>> been a number of changes recently which affect graphics state and text
>> extraction.
>>
>> If you are working against the trunk then try checking out the latest version
>> and setting a conditional breakpoint where you expect the red colour in
>> processTextPosition and see if it gets hit: if not then it could be a new bug
>> in PDFBox or some internal quirk of how you’re detecting red, in which case
>> you might want to share the relevant line(s) of code.
>>
>> Cheers
>>
>> -- John
>>
>> On 25 Jul 2014, at 13:15, -A <[email protected]> wrote:
>>
>>> Hi again, everyone-
>>>
>>> Finishing up this program I am working on and heading back to the testing
>>> phase - and suddenly my program is not detecting red text within PDFs. The
>>> old method was just to override the TextStripper class and implement a
>>> containsRed method that basically loops through every page and processes
>>> the stream. I overrode the processTextPosition method to check for Red
>>> stroking colors at the given position.
>>>
>>> This was working. I had to also use a plain TextStripper class as my
>>> extended version for some reason would error out getting all of the text
>>> from the file. Just wanted to give some background that in my PDF class
>>> that I created I am using two TextStrippers (thought they may be
>>> conflicting). One to get all of the text, the other to see if there is red
>>> within the text.
>>>
>>> I am trying to debug this but I have stepped through the entire file's text
>>> position to some actual red text - and it just shows up in the IDE as
>>> System Grey, I believe (or some variant of that).
>>>
>>> It is perfectly plausible that I changed something inadvertently - but by
>>> chance would any of you have any clue as to why it may not be seeing the
>>> red text now?
>>>
>>>
>>> Thank you for your guys' time!
>>>
>>> Sincerely,
>>> Aaron
>>>
>>>
>>> P.S. If John Hewson ends up responding to this feel free to write me
>>> directly if it is more convenient.
>>
>