Hi,
On 01.12.2010 22:40, martijn.list wrote:
The PDF document attached to bug report
https://issues.apache.org/jira/browse/PDFBOX-816 (TaxReturn-1.pdf)
throws a NumberFormatException.
#getEncodingFromFont uses a StringTokenizer to split a line into
separate tokens:
StringTokenizer st = new StringTokenizer(line);
The following line however results in a NumberFormatException because
0/NUL is read as one token.
dup 0/NUL put
The StringTokenizer only accepts the following chars as token delimiters:
" \t\n\r\f".
I think this is not correct because some delimiter chars seem to be
missing, such as (, ), <, >, [, ], {, }, /, and %.
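The behaviour described in the report can be reproduced outside of PDFBox
with a few lines of plain Java. This is only a minimal sketch: the class and
method names are made up for illustration and are not taken from the PDFBox
source, and it assumes the second token is parsed with Integer.parseInt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of PDFBox.
class TokenizeDemo {
    // Split a line with the default StringTokenizer delimiters (" \t\n\r\f").
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = tokens("dup 0/NUL put");
        System.out.println(t); // [dup, 0/NUL, put]
        try {
            // "0/NUL" is a single token, so parsing it as a number fails.
            Integer.parseInt(t.get(1));
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```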
Hmm, the problem is not the unsupported delimiter chars, it's the missing space
character in the input line.
dup 0/NUL put -> dup 0 /NUL put
Is this a bug?
Yes, it is. I filed an issue on JIRA [1] and fixed the problem by replacing each
"/" with " /" to ensure that there will be a delimiter at the right place.
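Roughly, the idea looks like the following sketch. It is not the actual
PDFBox code: the class and method names are hypothetical, and it only
illustrates the replacement of "/" with " /" before tokenizing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of PDFBox.
class SlashFixDemo {
    // Put a space in front of every "/" so that a name like /NUL always
    // starts a new token, then split on the default whitespace delimiters.
    static List<String> normalizedTokens(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line.replace("/", " /"));
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> t = normalizedTokens("dup 0/NUL put");
        System.out.println(t);                      // [dup, 0, /NUL, put]
        System.out.println(Integer.parseInt(t.get(1))); // 0
    }
}
```

A line that already contains the space ("dup 0 /NUL put") gives the same
tokens, because StringTokenizer treats consecutive delimiters as one.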
Thanks for reporting!
BR Andreas Lehmkühler
[1] https://issues.apache.org/jira/browse/PDFBOX-921