Hi,

I'm one of the developers of the GDAL library (http://gdal.org), which reads
various raster & vector formats, mostly geospatial, including PDF and its
georeferencing extensions (expressed either with the Adobe Supplement to
ISO 32000 or with the Open Geospatial Consortium Best Practice:
https://portal.opengeospatial.org/files/?artifact_id=40537 ).

Currently we use the Poppler internal C++ API and regularly must adjust for
changes in it. Recently we had to make adjustments to accommodate the
Poppler 0.58 changes. Supporting multiple Poppler versions is beginning to
make our code ugly. So packagers from Linux distributions and I are wondering
whether there would be a way to access a more stable C++ API.
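
To illustrate the kind of clutter I mean, here is a minimal sketch of the
conditional compilation that supporting two API generations tends to require.
Everything in it is hypothetical: MockDict, lookupInt() and the
HYPOTHETICAL_NEW_API macro are stand-ins invented for this example and do not
reproduce actual Poppler signatures.

```cpp
#include <map>
#include <string>

// Hypothetical switch; a real build would derive it from the detected
// library version at configure time.
#define HYPOTHETICAL_NEW_API 1

// Mock dictionary standing in for a PDF library dictionary type.
class MockDict {
public:
    std::map<std::string, int> entries;

#if HYPOTHETICAL_NEW_API
    // Newer style: the result is returned by value (0 if the key is absent).
    int lookup(const std::string &key) const {
        auto it = entries.find(key);
        return it == entries.end() ? 0 : it->second;
    }
#else
    // Older style: the result is written through an out-parameter.
    int *lookup(const std::string &key, int *out) const {
        auto it = entries.find(key);
        *out = (it == entries.end()) ? 0 : it->second;
        return out;
    }
#endif
};

// A single wrapper keeps the #if out of every call site, but each API
// change still adds another branch like this one.
int lookupInt(const MockDict &dict, const std::string &key) {
#if HYPOTHETICAL_NEW_API
    return dict.lookup(key);
#else
    int value = 0;
    dict.lookup(key, &value);
    return value;
#endif
}
```

Callers then only use lookupInt(dict, "SomeKey"), whichever branch was
compiled in.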

Besides rendering as an image, we need really low-level access to PDF objects,
to be able to parse georeferencing objects, retrieve layers, turn OCGs on/off,
or even access streams to decode drawing instructions so as to build vector
objects.

I've tried to summarize below our current use of the Poppler C++ API. I
probably missed a few calls, but you should get the overall picture:
- Object class: getType(), getTypeName(), getBool(), getInt(), getReal(),
  getString(), getName(), getStream(), getArray()
- Dict class: lookupNF(), lookup(), getLength(), getKey()
- Array class: getLength(), getNF(), get()
- Stream class: getDict(), reset(), getChar(), fillGooString()
- Catalog class: getPage(), getPageRef(), readMetadata()
- GooString: getCString(), getLength()
- Ref class: access to num and gen
- PDFDoc class: isOk(), displayPageSlice(), getCatalog(),
  getOptContentConfig(), getNumPages(), getDocInfo(), getErrorCode(), str
  private member (accessed through an ugly "#define private public" before
  including Poppler! We need to access it to be able to delete it with our
  heap, since we allocated the stream object provided to the PDFDoc()
  constructor. This is to avoid potential cross-heap issues on Windows.)
- Page class: isOk(), pageObj private member (accessed through an ugly
  "#define private public" before including Poppler!), getMediaBox()
- OCGs class: isOk(), getOCGs()
- GooList class: getLength(), get()
- OptionalContentGroup class: setState()
- SplashBitmap class: getBitmap(), getWidth(), getHeight(), getDataPtr(),
  getAlphaPtr(), getAlphaRowSize(), getRowSize()
- SplashOutputDev class: we subclass this class and override all/most virtual
  methods to be able to turn on/off rendering of various elements, as we offer
  options to selectively render vector, raster and/or text elements (so
  basically just a conditional test to decide whether to return as a no-op or
  call the base implementation)
- BaseStream class: we subclass this class to use GDAL's own I/O abstraction
  layer (which, beyond regular files, can read files inside .zip archives,
  in-memory files, files available through HTTP, etc.). So we implement
  copy(), makeSubStream(), getPos(), getStart(), setPos(), moveStart(),
  getKind(), getFileName(), getChar(), lookChar(), reset(),
  unfilteredReset(), close(), hasGetChars(), getChars()
- GlobalParams class: setPrintCommands()
- setErrorCallback() function
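
As an illustration of the SplashOutputDev item above, here is a minimal,
self-contained sketch of the "conditional no-op or call the base
implementation" pattern. BaseOutputDev is a hypothetical stand-in for the
real rendering device, and the method names, signatures and flags are
assumptions made for the example; the real class has many more virtual
methods with different parameters.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a rendering device such as SplashOutputDev.
class BaseOutputDev {
public:
    virtual ~BaseOutputDev() = default;
    virtual void drawText(const std::string &s) { log.push_back("text:" + s); }
    virtual void drawPath(const std::string &s) { log.push_back("vector:" + s); }
    virtual void drawImage(const std::string &s) { log.push_back("raster:" + s); }

    std::vector<std::string> log;  // records what was actually rendered
};

// Subclass that forwards to the base implementation only for enabled kinds.
class SelectiveOutputDev : public BaseOutputDev {
public:
    bool renderText = true;
    bool renderVector = true;
    bool renderRaster = true;

    void drawText(const std::string &s) override {
        if (!renderText) return;  // no-op when text rendering is disabled
        BaseOutputDev::drawText(s);
    }
    void drawPath(const std::string &s) override {
        if (!renderVector) return;  // no-op when vector rendering is disabled
        BaseOutputDev::drawPath(s);
    }
    void drawImage(const std::string &s) override {
        if (!renderRaster) return;  // no-op when raster rendering is disabled
        BaseOutputDev::drawImage(s);
    }
};

// Usage: keep only raster elements and return what got rendered.
std::vector<std::string> renderRasterOnly() {
    SelectiveOutputDev dev;
    dev.renderText = false;
    dev.renderVector = false;
    dev.drawText("label");
    dev.drawPath("road");
    dev.drawImage("ortho");
    return dev.log;
}
```

The point is that this only works as long as the set of virtual methods and
their signatures stay the same between releases, which is why changes in this
class affect us directly.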

If you want to glance at the code, the most relevant files are:
https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfobject.cpp
https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfio.cpp
https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfdataset.cpp

I'm not sure whether it would be feasible for Poppler to provide a more stable
API for our use. At the very least, this makes you aware of external users of
this API.

Best regards,

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com