Dear Experts, I need to extract the geolocation metadata from a GeoPDF file.
If you're not familiar with this format, it was developed by a company called TerraGo Technologies and was adopted as a "best practice" by the Open Geospatial Consortium. There is a document describing it at http://www.opengeospatial.org/standards/bp (look for "GeoPDF"; a click-through but free-looking license is required).

Basically, it provides a method to associate positions in the document with latitude-longitude positions on the ground. The method used is to define "map frames" that are added to the parent PDF page object. As I understand it, there is a new key 'LGIDict' in the page object whose value is an array of dictionaries, one per map frame. Each dictionary contains a set of entries (matrices, bounding boxes, etc.) that define the geolocation for that frame.

My hope is that it would be possible to write a small program that opens the PDF file and iterates through the pages, dumping this data as it is found. I have had a look at the pdfinfo program, which seems to do something similar for other metadata.

Would anyone be able to help me with this? I am a competent C++ coder but have never had to understand much about how PDF works. My questions are:

- Is starting with pdfinfo sane?
- How do I access the page objects?
- Is there a way to iterate through the entries in a dictionary?
- Or is there already some other tool that will do this?

Here's an example of a GeoPDF file: ftp://ftp2.cits.rncan.gc.ca/pub/cantopo/50k_pdf/092/g/cantopo_092g06_pdf.zip

Many thanks for any advice.

Regards, Phil.
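
For illustration only, here is a minimal C++ sketch of the "dump the LGIDict data" idea described above. It does not use Poppler or any PDF library. It simply scans the raw bytes of the file for the 'LGIDict' key and copies out the bracket-balanced array or dictionary that follows it. The function names (read_file, balanced_value, find_lgidict_values) are hypothetical, and the approach rests on a loud assumption: the page objects must be stored uncompressed in the file. It also ignores PDF comments. Treat it as a rough first probe, not a real parser.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory as raw bytes.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// PDF whitespace characters: NUL, tab, LF, FF, CR, space.
bool is_pdf_space(char c) {
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// `pos` must index a '[' or the first '<' of "<<". Returns the text of the
// balanced array/dictionary starting there, or "" if it never closes.
std::string balanced_value(const std::string& s, std::size_t pos) {
    int depth = 0;
    std::size_t i = pos;
    while (i < s.size()) {
        char c = s[i];
        if (c == '(') {                       // literal string: may hold brackets
            int paren = 1;
            ++i;
            while (i < s.size() && paren > 0) {
                if (s[i] == '\\') ++i;        // skip the escaped character
                else if (s[i] == '(') ++paren;
                else if (s[i] == ')') --paren;
                ++i;
            }
            continue;
        }
        if (c == '[') { ++depth; ++i; }
        else if (c == ']') { --depth; ++i; }
        else if (c == '<' && i + 1 < s.size() && s[i + 1] == '<') { ++depth; i += 2; }
        else if (c == '>' && i + 1 < s.size() && s[i + 1] == '>') { --depth; i += 2; }
        else if (c == '<') {                  // hex string: skip to closing '>'
            while (i < s.size() && s[i] != '>') ++i;
            ++i;
        }
        else { ++i; }
        if (depth == 0) return s.substr(pos, i - pos);
    }
    return "";
}

// Return the value following each "/LGIDict" key found in the raw bytes.
// An inline array or dictionary is returned whole; anything else (for
// example an indirect reference such as "12 0 R") is returned as raw text
// up to the next '/' or '>'. The name can also occur as a value rather
// than a key; nothing useful follows it then, so it yields no result.
std::vector<std::string> find_lgidict_values(const std::string& pdf) {
    const std::string key = "/LGIDict";
    std::vector<std::string> out;
    std::size_t pos = 0;
    while ((pos = pdf.find(key, pos)) != std::string::npos) {
        std::size_t v = pos + key.size();
        pos = v;                              // always advance the search
        if (v >= pdf.size()) break;
        if (!is_pdf_space(pdf[v]) && pdf[v] != '[' && pdf[v] != '<') continue;
        while (v < pdf.size() && is_pdf_space(pdf[v])) ++v;
        if (v >= pdf.size()) break;
        std::string val;
        if (pdf[v] == '[' || pdf.compare(v, 2, "<<") == 0) {
            val = balanced_value(pdf, v);
        } else {
            std::size_t end = pdf.find_first_of("/>", v);
            val = pdf.substr(v, end == std::string::npos ? end : end - v);
            while (!val.empty() && is_pdf_space(val.back())) val.pop_back();
        }
        if (!val.empty()) out.push_back(val);
    }
    return out;
}
```

A caller would pass the result of read_file(path) to find_lgidict_values and print each returned string. If the list comes back empty for a file known to be a GeoPDF, the likely cause is that its page objects sit inside compressed object streams. In that case, or when only an indirect reference is returned, a real PDF library is needed to resolve the objects.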
