[iText-questions] PDF file on steroids

Sternbergh, Cornell Wed, 21 Jul 2004 11:34:24 -0700

Good Afternoon

This afternoon we had a problem.  We created a 38 Meg PDF which
shouldn't be more than 136K.


We have a Java process which reads data from a database and formats it
into a PDF.  In the past 9 days we've created 9,700 documents; the
largest is 136K and the smallest is 24K.  38M is way out of line.  We
use iText to generate the PDF.

As we've created more than 120,000 documents with this process without a
hitch, we don't expect so much a coding problem as some data-related
problem.

And I need to get a lead on where to start.

This came to our attention because a second process, which prints or
emails the PDF, got hung up on this file, and distribution of other
files got held up.  The file size was approximately 37.9M.
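
Since the oversized file stalled the print/email step, one cheap
safeguard would be a size check before a PDF is handed to distribution.
What follows is only a minimal sketch, assuming a ceiling of ten times
the largest normal document (136K).  The class name PdfSizeGuard, the
ceiling, and the idea of holding a file for review are illustrative
assumptions, not part of the existing process.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PdfSizeGuard {
    // Assumed ceiling: ten times the largest normal document (136K).
    static final long MAX_EXPECTED_BYTES = 10L * 136 * 1024;

    // True when the size is non-zero and no larger than the assumed ceiling.
    static boolean isPlausibleSize(long sizeInBytes) {
        return sizeInBytes > 0 && sizeInBytes <= MAX_EXPECTED_BYTES;
    }

    // Convenience wrapper that reads the size of a file on disk.
    static boolean isPlausible(Path pdf) throws IOException {
        return isPlausibleSize(Files.size(pdf));
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a generated PDF to check.
        for (String arg : args) {
            Path pdf = Paths.get(arg);
            String verdict = isPlausible(pdf) ? "ok" : "suspicious size, hold for review";
            System.out.println(pdf + ": " + verdict);
        }
    }
}
```

With a check like this, a 38M file would be set aside instead of
blocking the queue, and the other documents would still go out.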

We tried to open the PDF with Adobe Acrobat, which claims the file needs
repair and then says it can't be repaired.

We deleted the document from the queue and deleted the file, and the
process continued handling other files.  Someone requested that this
document be re-printed, which caused the process to recreate it.  This
time, it was approximately 38M.

First, we note that we've created two PDFs from the same data, but the
sizes of the resulting files are different, not to mention way too large.

Second, we observe, from looking at the data in the database, that this
document's data is almost identical to another, successfully generated
document, except that one field has more characters in it.  This string
would be retrieved from the database (DB2) and then put into a cell in a
table.  Could it be possible that we've blown a length limitation on
table cells?

Is there a convenient way for a human to understand the contents of a
PDF file?  We opened it in UltraEdit (a text/hex editor) and note that
there are no strings which correspond to the text of the document.  I do
observe ASCII strings such as (0Ah refers to hex 0A):
  /Subtype /Image0Ah
  /Type /XObject0Ah
  /Filter /FlateDecode0Ah
  /Width 2830Ah
  /Height 2980Ah
  /BitsPerComponent 80Ah
  /Length 55140Ah
  /ColorSpace
which I assume to be PDF commands or attributes.

How are the strings of characters which make up the actual text stored?
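
On that last question: in most PDFs, page text is not stored as plain
strings.  It sits in content streams, drawn with operators such as Tj,
and those streams are usually compressed with the same /FlateDecode
filter seen above, which would explain why a hex editor shows no
readable text.  A minimal Java sketch of the round trip, using only
java.util.zip rather than iText; the class name FlateDemo and the sample
content string are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FlateDemo {
    // Compress bytes in zlib format, which is what /FlateDecode expects.
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress the bytes found between "stream" and "endstream".
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid Flate data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid Flate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical content stream: select a font, move, show a string.
        String content = "BT /F1 12 Tf 72 720 Td (Hello, PDF) Tj ET";
        byte[] compressed = deflate(content.getBytes(StandardCharsets.ISO_8859_1));
        // Inflating the compressed bytes recovers the original operators.
        System.out.println(new String(inflate(compressed), StandardCharsets.ISO_8859_1));
    }
}
```

The same inflate step, applied to the bytes of a real content stream
copied out of the file, should reveal the text-showing operators, as
long as the stream uses no filter other than /FlateDecode.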

TIA
Cornell Sternbergh

