----------------------------------------
> Date: Fri, 11 Dec 2009 22:11:06 -0800
> From: 
> To: [email protected]
> Subject: [iText-questions] PDF "laggy" because of bleeding?
>
>
> Hey all,
>
> I'm currently trying to create a new PDF file from an existing PDF. That's
> working fine, but for some reason PDFs for W-4, W-9, etc. are causing my
> PDFs to be "choppy" or "laggy".

It would probably help to have an instrumented renderer so you can
step through the file and see what it is doing. That may not reflect the
time sinks in the Adobe viewer, but if you have two documents to compare
you can at least identify some suspects. As a side note, I'm constantly
impressed with the viewer that comes with Debian, and there is at least
one open source viewer that I am aware of, probably more if you look.
With Java in particular it isn't too hard to get some idea of which
methods are called most frequently, and if you dump information
carefully you may be able to determine what triggers the text suddenly
appearing.
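
As a rough sketch of that kind of call counting, assuming you can add a
one-line hook to the renderer methods you care about: CallCounter and
the operation names used in main below are hypothetical and are not part
of iText or of any viewer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: counts how often named operations are hit. */
final class CallCounter {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Record one call; put this at the top of each method you want to watch. */
    void hit(String name) {
        counts.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    /** Number of recorded calls for a name, or 0 if it was never hit. */
    long count(String name) {
        LongAdder adder = counts.get(name);
        return adder == null ? 0 : adder.sum();
    }

    /** Names ordered by descending call count, ties broken alphabetically. */
    List<String> hottest() {
        List<String> names = new ArrayList<>(counts.keySet());
        names.sort(Comparator.comparingLong((String n) -> count(n))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return names;
    }

    public static void main(String[] args) {
        CallCounter counter = new CallCounter();
        // Stand-ins for hooks inside a renderer (names are made up).
        for (int i = 0; i < 500; i++) {
            counter.hit("showText");
        }
        counter.hit("paintFormXObject");
        for (String name : counter.hottest()) {
            System.out.println(name + " " + counter.count(name));
        }
    }
}
```

Running the same counter over the original and the output file and
comparing the two dumps is one way to narrow down the suspects.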
 


>
> An example:
> Original: http://www.irs.gov/pub/irs-pdf/fw9.pdf
> Output: http://www.mediafire.com/?ejdozqtmcto
>
> In the output document in Adobe reader, you'll notice when scrolling through
> the pages the pages start off blank and then the content just "appears". The

Computer languages generally ignore time (with a few notable exceptions
and afterthoughts), and I would suspect that anything intended to create
printed pages is more concerned with optimizing print speed than with
how the rendering looks to someone watching it. I would just ask the PDF
people what the general issues and features are here (a link to the
relevant section of the spec will suffice).


> only thing I can think of is that there is hidden text in both PDFs
> that is "bleeding" outside the page. The thing is, the original document
> has this also, so I don't see why it's not laggy and the new one is. The
> code I'm using is the following:
>
> PdfReader reader = new PdfReader( "fw9_original.pdf" );
>
> Document document = new Document( reader.getPageSizeWithRotation( 1 ) );
>
> PdfWriter writer = null;
> try {
> writer = PdfWriter.getInstance( document, new FileOutputStream(
> "fw9_output.pdf" ) );
>
> document.open();
> PdfContentByte content = writer.getDirectContent();
>
> int pages = reader.getNumberOfPages();
> for ( int i = 1; i <= pages; ++i ) {
>
> Rectangle margins = reader.getBoxSize( i, "media" );
>
> Rectangle sizes = reader.getPageSizeWithRotation( i );
>
> Image page = Image.getInstance( writer.getImportedPage( reader, i ) );
> page.setAbsolutePosition( margins.getLeft(), margins.getBottom() );
> page.setRotationDegrees( -sizes.getRotation() );
>
> content.addImage( page );
> document.newPage();
> }
>
> } catch ( Exception e ) {
> System.out.println( e.toString() );
> } finally {
> if ( null != document ) {
> try { document.close(); } catch ( Exception e ) { }
> }
>
> if ( null != writer ) {
> try { writer.close(); } catch ( Exception e ) { }
> }
> }
>
> Any ideas what this could be? And if it is the hidden text, can anyone
> point me in the right direction for fixing it? All I'm trying to do is
> create a new copy of the PDF with just the content, everything else
> "stripped" out of

In most cases the content is just text. Certainly you would imagine tax
instructions to be more concerned with information than with formatting
and fonts. But extracting the information becomes an intractable problem
if the document creator was more concerned with appearance. Indeed,
nothing prevents the document from being one huge TIFF image. I had some
luck with pdftotext modified to emit (x,y) locations for the text, but
it is still difficult with the IRS forms.
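
To illustrate why positions help, here is a minimal sketch that rebuilds
reading-order lines from positioned text. It assumes each fragment
arrives as an (x, y, text) triple in PDF coordinates, where y grows
upward; the Fragment class, the toLines method and the tolerance
parameter are hypothetical and do not describe the modified pdftotext
output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TextLines {
    /** One positioned piece of text (hypothetical format). */
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /**
     * Groups fragments into lines: a fragment joins the current line when
     * its y is within tol of that line's first fragment. Lines come out
     * top of page first, fragments left to right.
     */
    static List<String> toLines(List<Fragment> fragments, double tol) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        // y grows upward in PDF space, so descending y reads top to bottom.
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.y).reversed());

        List<List<Fragment>> rows = new ArrayList<>();
        for (Fragment f : sorted) {
            List<Fragment> last = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            if (last != null && Math.abs(last.get(0).y - f.y) <= tol) {
                last.add(f);
            } else {
                List<Fragment> row = new ArrayList<>();
                row.add(f);
                rows.add(row);
            }
        }

        List<String> lines = new ArrayList<>();
        for (List<Fragment> row : rows) {
            row.sort(Comparator.comparingDouble((Fragment f) -> f.x));
            StringBuilder sb = new StringBuilder();
            for (Fragment f : row) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(f.text);
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Forms like the IRS ones tend to defeat a simple grouping like this,
since labels and boxes are scattered across the page rather than laid
out in rows, which is part of why they remain difficult.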


> it. Maybe it would be easier to just remove stuff from the original PDF and
> save that.

If anyone from the IRS, or from other groups who believe they are
publishing information rather than a work of art, is reading: please
try to design your documents so that computer-readable information is
available. This is a huge problem with the way PDFs are often used.


> --
> View this message in context: 
> http://old.nabble.com/PDF-%22laggy%22-because-of-bleeding--tp26755295p26755295.html
> Sent from the iText - General mailing list archive at Nabble.com.
>
>
> ------------------------------------------------------------------------------
> Return on Information:
> Google Enterprise Search pays you back
> Get the facts.
> http://p.sf.net/sfu/google-dev2dev
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> Buy the iText book: http://www.1t3xt.com/docs/book.php
> Check the site with examples before you ask questions: 
> http://www.1t3xt.info/examples/
> You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
                                          
_________________________________________________________________
Hotmail: Trusted email with powerful SPAM protection.
http://clk.atdmt.com/GBL/go/177141665/direct/01/