I do agree that it checks a lot more.

I think I somehow narrowed the problem to these lines:

         PdfReader reader = new PdfReader(this.getResponseAsBytes(connection.getInputStream()));

It *sometimes* throws the NullPointerException; other times it works just fine. I assume that the problem is not iText at all, but the way the InputStream is returned. It seems like iText wants to use it while it is not completely ready for use (returned in chunks, maybe?).

I switched to Apache Http commons for getting the InputStream and it's working fine so far.
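
One possible explanation for the intermittent failure, offered as a guess and not a confirmed diagnosis: the loop in getResponseAsBytes writes the whole 1024-byte buffer on every pass, even when read() returned fewer bytes, which a network stream often does. The leftover bytes from a previous pass would then end up inside the PDF data. A minimal sketch of a copy loop that writes only the bytes actually read follows; the names StreamBytes and readFully are hypothetical and not part of iText or the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only.
final class StreamBytes {

    /**
     * Copies the whole stream into a byte array, writing only the
     * bytes that each read() call actually returned.
     */
    static byte[] readFully(InputStream in) throws IOException {
        byte[] buffer = new byte[1024];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        // read() may fill only part of the buffer; n is the real count.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // 1500 bytes is deliberately not a multiple of the buffer size.
        byte[] data = new byte[1500];
        byte[] copy = readFully(new ByteArrayInputStream(data));
        System.out.println(data.length + " bytes in, " + copy.length + " bytes out");
    }
}
```

If this is the cause, passing the result of readFully(connection.getInputStream()) to the PdfReader constructor should behave the same on every run, whatever chunk sizes the connection delivers.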

Nevertheless, here is the PDF: http://cetatenie.just.ro/wp-content/uploads/Ordin-58C-din-10.01.2013.pdf

Cheers,
Eugene.


On 8/14/13 3:44 AM, Paulo Soares wrote:
Please post your PDF, I suspect it is broken. 5.4.3 checks a lot more
things than 2.1.7.

Paulo

On Wed, Aug 14, 2013 at 7:03 PM, eugene <eugen.ra...@gmail.com> wrote:
So I am trying to do a very simple read of a pdf file. As simple as this:

      public String getSource(String pdfLink) throws IOException {
          log.debug("Getting the sources for pdfLink = {}", pdfLink);
          URL url = new URL(pdfLink);
          URLConnection connection = url.openConnection();
          //60 seconds connection timeout
          connection.setConnectTimeout(1000 * 60);
          PdfReader reader = new PdfReader(this.getResponseAsBytes(connection.getInputStream()));

          int numberOfPages  = reader.getNumberOfPages();
          StringBuilder builder = new StringBuilder();
          for(int i=1;i<=numberOfPages;++i){
              String result = PdfTextExtractor.getTextFromPage(reader, i);
              System.out.println("AAAAAAAAAA = "+result);
              builder.append(result);
          }
          return builder.toString();
      }


      /**
       * read chunks of 1024 bytes from input stream and put them into the output stream
       */
      private byte[] getResponseAsBytes(InputStream inputStream) throws IOException {
          byte[] bytes = new byte[1024];
          ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
          while(inputStream.read(bytes)!=-1) outputStream.write(bytes);
          return outputStream.toByteArray();
      }

As a result I get:

Exception in thread "main" java.lang.NullPointerException
      at com.itextpdf.text.pdf.DocumentFont.doType1TT(DocumentFont.java:400)
      at com.itextpdf.text.pdf.DocumentFont.init(DocumentFont.java:128)
      at com.itextpdf.text.pdf.DocumentFont.<init>(DocumentFont.java:113)
      at com.itextpdf.text.pdf.CMapAwareDocumentFont.<init>(CMapAwareDocumentFont.java:99)
      at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.getFont(PdfContentStreamProcessor.java:157)
      at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.access$4200(PdfContentStreamProcessor.java:79)
      at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$SetTextFont.invoke(PdfContentStreamProcessor.java:612)
      at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(PdfContentStreamProcessor.java:267)
      at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:387)
      at com.itextpdf.text.pdf.parser.PdfReaderContentParser.processContent(PdfReaderContentParser.java:79)
      at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:73)
      at com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:88)
      at com.eugen.romina.v2.pdf.source.generator.FromPDFLinkToString.getSource(FromPDFLinkToString.java:43)
      at com.eugen.romina.v2.pdf.source.generator.FromPDFLinkToString.main(FromPDFLinkToString.java:63)


Funny thing: if I downgrade to 2.1.7, everything works fine.

This very much sounds like a bug to me.

Cheers,
Eugene.

------------------------------------------------------------------------------
Get 100% visibility into Java/.NET code with AppDynamics Lite!
It's a free troubleshooting tool designed for production.
Get down to code-level detail for bottlenecks, with <2% overhead.
Download for free and get started troubleshooting in minutes.
http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference 
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: 
http://itextpdf.com/themes/keywords.php