Hello, I have found an issue in the latest version of PDFBox where parsing fails in `BaseParser` when `parseDirObject` parses a number and the text immediately following it starts with an 'e'.
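For illustration, here is a minimal standalone sketch of the failure mode. It is not PDFBox code: the class `NumberScanSketch`, the method `readNumberGreedy`, and the input `12endobj` are hypothetical, and `java.io.PushbackReader` stands in for the parser's source. It assumes the number reader accepts 'e' and 'E' as possible number characters.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

public class NumberScanSketch {

    // Hypothetical stand-in for a number reader that treats 'e'/'E' as
    // possible number characters (to allow scientific notation).
    public static String readNumberGreedy(PushbackReader in) throws IOException {
        StringBuilder buf = new StringBuilder();
        int c = in.read();
        while (c != -1 && (Character.isDigit(c) || c == '-' || c == '+'
                || c == '.' || c == 'e' || c == 'E')) {
            buf.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c); // give back the first character that cannot belong to a number
        }
        return buf.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: the number 12 directly followed by a keyword starting with 'e'.
        PushbackReader in = new PushbackReader(new StringReader("12endobj"));
        String token = readNumberGreedy(in);
        System.out.println("token = " + token); // the 'e' of the keyword was swallowed
        try {
            Double.parseDouble(token);
        } catch (NumberFormatException ex) {
            System.out.println("not a valid number: " + token);
        }
    }
}
```

In this sketch the token comes back as `12e`, which is not a valid number, and the source is left positioned after the 'e', so the following keyword is also damaged.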
The failure is caused by the parser's attempt to support numbers stored in scientific notation, which means a trailing 'e' or 'E' gets read as part of the number.

One approach that seems to resolve the problem is to check whether the last character of the number string that was read is an 'e' or 'E'. If it is, removing it from the string and unreading it from the source allows parsing to complete as expected.

```
private COSNumber parseCOSNumber() throws IOException
{
    ...
    // If the number string ends in a stray 'e' or 'E', drop it
    // and push it back to the source
    char lastc = buf.charAt(buf.length() - 1);
    if (lastc == 'e' || lastc == 'E')
    {
        buf.deleteCharAt(buf.length() - 1);
        seqSource.unread(lastc);
    }
    return COSNumber.get(buf.toString());
}
```

An example of this error can be seen in PDF.js issue3323:

https://github.com/mozilla/pdf.js/commit/26f5b1b2d37c7b74a073dee75d66fcc04fae10e8
https://github.com/mozilla/pdf.js/blob/4ba28de2608866dcb10d627d77dc19ff3d017c17/test/pdfs/issue3323.pdf

I can contribute the change if needed, but I will first need to go through the contribution guides and run further validation to confirm that it won't break any other workflows.

Thanks,
Cody Holmes