Hi Dom, Zyx

I’ve been looking at PoDoFo memory usage on large documents.

The PDF spec (PDF32000_2008.pdf) is 8.7 MB on disk, but uses around 200 MB of 
RAM when loaded into a PdfMemDocument:
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf

Memory usage includes:

850,000 PdfNames using about 70 MB, most of which are PdfDictionary keys
125,000 PdfObjects using about 10 MB

Many of the PdfNames are duplicate dictionary keys that appear in most or all 
objects (e.g. “Kids”, “Length”, “Parent”).

Eliminating the duplication should save a lot of memory:


-          Create a single document name table, something like:
std::map<std::string, PdfName> m_nameTable;


-          Change TKeyMap from
// stores a PdfName in every object key: 36 bytes for sizeof(PdfName)
// + 24 bytes HeapAlloc overhead + PdfName::m_Data.length()
typedef std::map<PdfName,PdfObject*>      TKeyMap;
to
// stores a reference (4 or 8 byte pointer) in every object key
typedef std::map<PdfName&,PdfObject*>      TKeyMap;



-          When a key is added to a PdfDictionary, add it to the document 
name table if it is not already there, then store a PdfName& reference to that 
table entry in TKeyMap.
This should reduce memory usage for PdfName from 70 MB to about 4 MB in 
PDF32000_2008.pdf.
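
A minimal sketch of the scheme above, to make the idea concrete. It assumes stand-in Name and Object types rather than the real PdfName and PdfObject, and NameTable, Intern, NameLess, Dictionary, AddKey and GetKey are hypothetical names, not PoDoFo API. Since std::map cannot take a reference type as its key, the sketch stores a const pointer to the name table entry instead of PdfName&, and orders keys by the name's contents so lookups by string still work.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Stand-in for PdfName (assumption: the real class has more members).
struct Name {
    std::string data;
    explicit Name(const std::string& s) : data(s) {}
};

// Stand-in for PdfObject.
struct Object { int value; };

// Hypothetical per-document name table: one shared Name per distinct string.
class NameTable {
public:
    // Returns a pointer to the single shared entry, creating it if needed.
    // std::map never relocates its elements on insert, so the pointer stays
    // valid for as long as the table exists and the entry is not erased.
    const Name* Intern(const std::string& s) {
        std::map<std::string, Name>::iterator it = m_names.find(s);
        if (it == m_names.end())
            it = m_names.insert(std::make_pair(s, Name(s))).first;
        return &it->second;
    }
    std::size_t Size() const { return m_names.size(); }
private:
    std::map<std::string, Name> m_names;
};

// Compare keys by contents, so key order matches a map keyed by value.
struct NameLess {
    bool operator()(const Name* a, const Name* b) const {
        return a->data < b->data;
    }
};

// Each dictionary entry now holds one pointer instead of a full Name.
typedef std::map<const Name*, Object*, NameLess> TKeyMap;

class Dictionary {
public:
    explicit Dictionary(NameTable& table) : m_table(table) {}

    // Interns the key in the document table, then stores only the pointer.
    void AddKey(const std::string& key, Object* value) {
        assert(value != NULL);
        m_keys[m_table.Intern(key)] = value;
    }

    // Looks up with a temporary Name so that reads never grow the table.
    Object* GetKey(const std::string& key) const {
        const Name probe(key);
        TKeyMap::const_iterator it = m_keys.find(&probe);
        if (it == m_keys.end())
            return NULL;
        return it->second;
    }
private:
    NameTable& m_table;
    TKeyMap m_keys;
};
```

With this layout the per-key cost is one pointer: 850,000 keys x 4 bytes is about 3.4 MB on a 32-bit build and about 6.8 MB with 8-byte pointers, plus the name table itself. The sketch keeps the string twice per table entry (map key and Name::data), which only matters once per distinct name. The name table has to outlive every dictionary that points into it.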

Is this worth doing? Can you think of any problems this might cause?

Best Regards
Mark

Mark Rogers - mark.rog...@powermapper.com
PowerMapper Software Ltd - www.powermapper.com
Registered in Scotland No 362274 Quartermile 2 Edinburgh EH3 9GL

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users
