Re: [iText-questions] Performance when flattening form fields

Mike Marchywka Mon, 26 Apr 2010 03:28:49 -0700









----------------------------------------
> Date: Sun, 25 Apr 2010 22:14:02 -0700
> From: forum_...@trumpetinc.com
> To: itext-questions@lists.sourceforge.net
> Subject: Re: [iText-questions] Performance when flattening form fields
>
>
> After more digging, I'm wondering if the place to do this wouldn't be in the
> PdfCopy.PageStamp class? It seems like PageStamp.alterContents() could do
> the same flattening operation that PdfStamper does.
>
> The ideal would be to factor out the behavior so the code isn't duplicated
> in both PdfCopy and PdfStamper...

I guess I have the larger question of what exactly parsing is.
That is, it seems that generally you use iText to 1) read in something, often
an existing PDF, 2) do some stuff, then 3) write out a PDF. Presumably
as you go through step 2, you are assembling or compiling a bunch
of structures that allow you to do step 3 but are better suited
to manipulating and editing the nascent PDF.
If I understand your earlier comments, you apparently don't actually
have a generic PDF parser for step 1 whose output works with every
sequence of operations you could put into step 2. Now, of course, more
generally the above approach doesn't scale, as you would always hope to
stream to some extent: read what you need, write what you can, etc.
However, that could probably be hidden somewhat in the implementation
of the classes for each step.
So, instead of things like PdfCoolFeature.doSomething(byte[] pdffile)
you have PdfCoolFeature.doSomething(ParsedPdfOperand pdflikething)
where the second signature takes a parameter that is generally
optimized for a broad class of common operations.
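
To make the two signatures concrete, here is a rough sketch in plain Java.
Nothing in it is iText API: ParsedPdfOperand and PdfCoolFeature are the
made-up names from above, "parsing" is faked by splitting text into lines,
and parseCount is a hypothetical counter added only to show how often the
raw bytes get parsed under each style.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical parsed form of a document; NOT an iText class. */
final class ParsedPdfOperand {
    static int parseCount = 0; // how many times raw bytes were parsed

    final List<String> pages;

    private ParsedPdfOperand(List<String> pages) {
        this.pages = pages;
    }

    /** Stand-in for real PDF parsing: one "page" per line of input. */
    static ParsedPdfOperand parse(byte[] raw) {
        parseCount++;
        String text = new String(raw, StandardCharsets.UTF_8);
        return new ParsedPdfOperand(new ArrayList<>(Arrays.asList(text.split("\n"))));
    }

    /** Stand-in for writing the document back out as bytes. */
    byte[] toBytes() {
        return String.join("\n", pages).getBytes(StandardCharsets.UTF_8);
    }
}

/** Hypothetical feature class offering both signature styles. */
final class PdfCoolFeature {
    /** byte[] style: every call has to parse and then re-serialize. */
    static byte[] doSomething(byte[] pdfFile) {
        ParsedPdfOperand parsed = ParsedPdfOperand.parse(pdfFile);
        doSomething(parsed);
        return parsed.toBytes();
    }

    /** Parsed-operand style: edits the already-parsed structure in place. */
    static void doSomething(ParsedPdfOperand pdfLikeThing) {
        pdfLikeThing.pages.replaceAll(p -> p + " [stamped]");
    }
}

public class SignatureSketch {
    public static void main(String[] args) {
        byte[] raw = "page1\npage2".getBytes(StandardCharsets.UTF_8);

        // Chaining two byte[] calls parses the document twice.
        ParsedPdfOperand.parseCount = 0;
        byte[] viaBytes = PdfCoolFeature.doSomething(PdfCoolFeature.doSomething(raw));
        System.out.println("byte[] style parses: " + ParsedPdfOperand.parseCount);

        // Chaining on the parsed operand parses it once.
        ParsedPdfOperand.parseCount = 0;
        ParsedPdfOperand parsed = ParsedPdfOperand.parse(raw);
        PdfCoolFeature.doSomething(parsed);
        PdfCoolFeature.doSomething(parsed);
        System.out.println("operand style parses: " + ParsedPdfOperand.parseCount);

        // Both routes end up with the same bytes.
        System.out.println("same result: " + Arrays.equals(viaBytes, parsed.toBytes()));
    }
}
```

The point is only that the byte[] signature forces a parse and a write per
call, which is the same kind of round trip as going through a temporary
PdfReader, while the parsed-operand signature lets several operations
share one parse.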
 
>
> Does anyone see any technical issues with this as a strategy?
>
> - K
>
>
> 'Kevin Day' wrote:
>>
>>
>> I've been doing some digging into the performance question that Giovanni
>> Azua has posted about.
>>  
>> Some of his findings (using StringBuilder, etc...) are solid improvements
>> to overall iText performance - however, the crux of the performance
>> difference he is seeing between iText and the competing solution is not
>> low level.  It's a high level issue.
>>  
>> Here's what's going on:
>>  
>> His specific use case involves stamping headers and footers onto
>> pages.  The footer contains AcroFields that must be flattened prior
>> to stamping.
>>  
>> The performance hit is coming from the fact that, in order to flatten and
>> apply the footer, he is having to:
>>  
>> 1.  Construct a PDF using PdfStamper
>> 2.  Write output to a byte array output stream
>> 3.  Re-parse the BAOS into a PdfReader
>> 4.  Import the page from the reader for use as a stamp
>>  
>> While this is functional, it is certainly not performant.
>>  
>> A much, much faster technique would be to do the flattening to the
>> *reader*, then just import the page to the output writer.  This
>> avoids the awkward creation of the temporary PdfReader.
>>  
>>  
>> So, the performance delta is not caused so much by iText's low level
>> implementation (although the performance improvements that Giovanni has
>> suggested will help to make iText even faster than it already is) - the
>> delta is really caused by an awkward operation forced on the user by the
>> framework.
>>  
>>  
>> So, are there any fundamental reasons to not do flattening, etc... to the
>> PdfReader?  My first look at the code indicates that it may be
>> possible to factor this out of PdfStamper (basically, instead of adjusting
>> the AcroFields dictionary and content streams in the
>> PdfStamper/PdfCopy/etc... output, we'd make those adjustments to the
>> PdfReader).
>>  
>> I'm thinking of something along the lines of:
>>  
>> PdfFormFlattener(PdfReader).flatten(pageNumber)
>>  
>> Maybe with supplemental methods for flattenNamedFields(pageNumber),
>> flattenFieldsOfType(pageNumber)
>>  
>> Thoughts?
>>  
>> - K
>>  
>>
>>
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> iText-questions mailing list
>> iText-questions@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/itext-questions
>>
>> Buy the iText book: http://www.itextpdf.com/book/
>> Check the site with examples before you ask questions:
>> http://www.1t3xt.info/examples/
>> You can also search the keywords list:
>> http://1t3xt.info/tutorials/keywords/
>>
>
> --
> View this message in context: 
> http://old.nabble.com/Performance-when-flattening-form-fields-tp28357673p28360908.html
> Sent from the iText - General mailing list archive at Nabble.com.
>
>