Hi,

On 15.05.2013 14:56, Maruan Sahyoun wrote:
Hi,

currently PDFBox has a number of workarounds "hidden" in the code for
real-world PDFs (e.g. PDFBOX-1172) which are not in line with the spec.
There are several options to deal with that:

e.g.
a) keep the workarounds in the core code
IMO we can't drop them. Whenever a parsing issue arises, people often
argue that all PDF readers but PDFBox are able to handle the PDF in
question. So people expect that a PDF reader works in any situation,
whether the PDF follows the spec or not. That's sad but that's life :-(

b) throw an exception and stop working
We should add some (special) logging, so that one can detect such glitches.

c) handle it through a pluggable extension
I'm not sure if there is one solution that fits every use case. Sometimes it's
just a question of the format used (e.g. PDFBOX-1172) and sometimes there are
bigger differences.

WDYT?

Maruan Sahyoun

BR
Andreas Lehmkühler
