I do agree that it checks a lot more.
I think I somehow narrowed the problem to these lines:
PdfReader reader = new PdfReader(this.getResponseAsBytes(connection.getInputStream()));
It *sometimes* throws the NullPointerException; other times it works
just fine.
I assume that the problem is not iText at all, but the way the
InputStream is returned.
It seems iText tries to use the data before it is completely ready
(returned in chunks, maybe?).
I switched to Apache Http commons for getting the InputStream and it's
working fine so far.
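
A possible explanation, offered as an assumption rather than a confirmed
diagnosis: InputStream.read(byte[]) may return fewer bytes than the buffer
holds, while the loop in getResponseAsBytes always writes the full 1024-byte
buffer. Whenever a read comes back short, stale bytes from the previous
chunk would be copied into the PDF data, which would fit the "sometimes"
behaviour. Below is a minimal sketch of a copy loop that writes only the
bytes actually read. The class name StreamBytes and the trickle helper are
hypothetical; the helper only simulates a stream that delivers short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamBytes {

    /** Copies the whole stream, writing only the bytes each read() actually returned. */
    public static byte[] getResponseAsBytes(InputStream inputStream) throws IOException {
        byte[] buffer = new byte[1024];
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        int read;
        while ((read = inputStream.read(buffer)) != -1) {
            // Write 'read' bytes, not the whole buffer: the tail may hold stale data.
            outputStream.write(buffer, 0, read);
        }
        return outputStream.toByteArray();
    }

    /** Hypothetical helper: a stream that hands out at most 7 bytes per read call. */
    public static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 7));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[2500];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        byte[] copy = getResponseAsBytes(trickle(original));
        // The copy should have exactly the same length as the source data.
        System.out.println("original=" + original.length + " copy=" + copy.length);
    }
}
```

With a loop like this, the byte array handed to PdfReader should match what
the server sent, however the network happens to split the response.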
Nevertheless, here is the PDF:
http://cetatenie.just.ro/wp-content/uploads/Ordin-58C-din-10.01.2013.pdf
Cheers,
Eugene.
On 8/14/13 3:44 AM, Paulo Soares wrote:
Please post your PDF, I suspect it is broken. 5.4.3 checks a lot more
things than 2.1.7.
Paulo
On Wed, Aug 14, 2013 at 7:03 PM, eugene <eugen.ra...@gmail.com> wrote:
So I am trying to do a very simple read of a pdf file. As simple as this:
public String getSource(String pdfLink) throws IOException {
    log.debug("Getting the sources for pdfLink = {}", pdfLink);
    URL url = new URL(pdfLink);
    URLConnection connection = url.openConnection();
    //60 seconds connection timeout
    connection.setConnectTimeout(1000 * 60);
    PdfReader reader = new PdfReader(this.getResponseAsBytes(connection.getInputStream()));
    int numberOfPages = reader.getNumberOfPages();
    StringBuilder builder = new StringBuilder();
    for (int i = 1; i <= numberOfPages; ++i) {
        String result = PdfTextExtractor.getTextFromPage(reader, i);
        System.out.println("AAAAAAAAAA = " + result);
        builder.append(result);
    }
    return builder.toString();
}

/**
 * read chunks of 1024 bytes from input stream and put them into the output stream
 */
private byte[] getResponseAsBytes(InputStream inputStream) throws IOException {
    byte[] bytes = new byte[1024];
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    while (inputStream.read(bytes) != -1) outputStream.write(bytes);
    return outputStream.toByteArray();
}
As a result I get:
Exception in thread "main" java.lang.NullPointerException
at com.itextpdf.text.pdf.DocumentFont.doType1TT(DocumentFont.java:400)
at com.itextpdf.text.pdf.DocumentFont.init(DocumentFont.java:128)
at com.itextpdf.text.pdf.DocumentFont.<init>(DocumentFont.java:113)
at com.itextpdf.text.pdf.CMapAwareDocumentFont.<init>(CMapAwareDocumentFont.java:99)
at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.getFont(PdfContentStreamProcessor.java:157)
at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.access$4200(PdfContentStreamProcessor.java:79)
at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$SetTextFont.invoke(PdfContentStreamProcessor.java:612)
at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(PdfContentStreamProcessor.java:267)
at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:387)
at com.itextpdf.text.pdf.parser.PdfReaderContentParser.processContent(PdfReaderContentParser.java:79)
at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:73)
at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:88)
at com.eugen.romina.v2.pdf.source.generator.FromPDFLinkToString.getSource(FromPDFLinkToString.java:43)
at com.eugen.romina.v2.pdf.source.generator.FromPDFLinkToString.main(FromPDFLinkToString.java:63)
Funny thing: if I downgrade to 2.1.7, everything works fine.
This very much sounds like a bug to me.
Cheers,
Eugene.
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples:
http://itextpdf.com/themes/keywords.php