Re: [iText-questions] How can get the spatial extent of PdfLayer by iText

Mark Storer Mon, 27 Sep 2010 10:23:05 -0700

From: 1T3XT info [mailto:[email protected]]  
> On 26/09/2010 4:16, zxgfox110 wrote:
> 
> That's NOT XML.
> Those are key-value pairs in a dictionary.
> 
> What is between << and >> is a dictionary object.
> <</Key/Value>>
> A word preceded by / is a name object.
> A construct with [ and ] is an array object.
> Whatever is between ( and ) is a PDF String.
> 
> > So, we can get the Projection. You said we can read GeoPDF with the
> > lowest-level iText APIs; could you describe them in detail?


The JavaDoc is pretty good.  You'll want to look at PdfDictionary,
PdfArray, PdfNumber, PdfString, PdfName, and maybe PdfBoolean.

http://api.itextpdf.com/

To get at the dictionary in question, you'll have to follow indirect
references back to something PdfReader will hand you directly... A page
or the "root" (AKA "catalog") object.

AKA: Also Known As.

So... That looks a lot like a page object... Though it's fragments, not
entire objects.  The "<</Contents..." looks like a page definition (the
/CropBox is a Strong Hint).

Let's pretty this up a bit:
<<
  /Contents 11 0 R
  /CropBox [0.0 0.0 122.88 122.88]
  /Group 38 0 R
  /Group1 39 0 R
  /LGIDict[
    <<
      /CTM[(41.6666666667)(0)(0)(41.6666666667)(192752)(3768671)]
      /Description(UTM Zone 17, Northern Hemisphere)
      /Display <<
        /CentralMeridian(-81)
        /Datum(NAS)
        /Description(UTM Zone 17, Northern Hemisphere)
        /FalseEasting(500000)
        /FalseNorthing(0)
        /OriginLatitude(0)
        /ProjectionType(TC)
        /ScaleFactor(0.9996)
        /Type/Projection
      >>
      /LGIT:Info 50 0 R
      /Neatline[(0)(0)(0)(122.88)(122.88)(122.88)(122.88)(0)]
      /Projection <<
        /CentralMeridian(-81)
        /Datum(NAS)
        /Description(UTM Zone 17, Northern Hemisphere)
        /FalseEasting(500000)
        /FalseNorthing(0)
        /OriginLati...

This is clearly a fragment... We have several collections that are never
closed: three dictionaries (the page, the LGIDict entry, and /Projection)
and an array.  /Display is the only one that gets its ">>".
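
If you'd rather not count brackets by eye, here is a rough sketch (plain
Java, no iText) that reports which "<<" and "[" are still open at the end
of a fragment.  The class and method names are made up for illustration,
and it is deliberately naive: it skips ( ) strings but knows nothing
about comments, hex strings, or streams.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PdfFragmentBalance {
    /** Returns the openers ("<<" or "[") still open at the end, outermost first. */
    static Deque<String> unclosed(String fragment) {
        Deque<String> open = new ArrayDeque<>();
        int i = 0;
        while (i < fragment.length()) {
            char c = fragment.charAt(i);
            boolean hasNext = i + 1 < fragment.length();
            if (c == '(') {
                i = skipString(fragment, i);      // brackets inside strings don't count
            } else if (c == '<' && hasNext && fragment.charAt(i + 1) == '<') {
                open.addLast("<<");
                i += 2;
            } else if (c == '>' && hasNext && fragment.charAt(i + 1) == '>') {
                if (!open.isEmpty()) open.removeLast();
                i += 2;
            } else if (c == '[') {
                open.addLast("[");
                i++;
            } else if (c == ']') {
                if (!open.isEmpty()) open.removeLast();
                i++;
            } else {
                i++;
            }
        }
        return open;
    }

    /** Skips the literal string whose '(' is at start; returns the index just past it. */
    static int skipString(String s, int start) {
        int depth = 0;
        int i = start;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') { i += 2; continue; }  // escaped character
            if (c == '(') depth++;
            if (c == ')' && --depth == 0) return i + 1;
            i++;
        }
        return s.length();                        // unterminated string
    }

    public static void main(String[] args) {
        // One dictionary and one array are left open here.
        System.out.println(unclosed("<< /A [ (x [ y) << /B 1 >> "));
    }
}
```

Paste the fragment in as the argument and whatever is left in the deque
is what the poster cut off.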

So in this case, you'd do something like:

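// iText 5 package names; older releases use com.lowagie.text.pdf instead.
import com.itextpdf.text.pdf.PdfArray;
import com.itextpdf.text.pdf.PdfDictionary;
import com.itextpdf.text.pdf.PdfName;

// page number, not page index.
PdfDictionary pageDict = myPdfReader.getPageN(1);

// not a very good name for an array, is it?
PdfArray lgiArray = pageDict.getAsArray(new PdfName("LGIDict"));

if (lgiArray != null) {
  PdfDictionary lgiDict = lgiArray.getAsDict(0);
  // Constant first, so a missing /Type (null) can't throw.
  // NOTE: in the fragment above, /Type/Projection is only visible on the
  // nested /Display dictionary; adjust this check to whatever /Type the
  // entry itself carries.
  if (lgiDict != null
      && new PdfName("Projection").equals(lgiDict.getAsName(PdfName.TYPE))) {
    PdfArray lgiCTM = lgiDict.getAsArray(new PdfName("CTM"));
    doStuffWithCTM(lgiCTM);
    ...
  }
}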
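
As for what doStuffWithCTM might look like, here is a guess in plain
Java.  It ASSUMES (the fragment doesn't say, so check the GeoPDF docs)
that the six CTM values are an ordinary PDF-style matrix [a b c d e f]
that maps the /Neatline page coordinates to projected coordinates.  Note
that both arrays hold PDF strings, not numbers, so they have to be
parsed.  The class and method names are hypothetical.

```java
import java.util.Arrays;

class CtmExtentSketch {
    /** Parses string-encoded numbers such as "(41.6666666667)" minus the parentheses. */
    static double[] parseAll(String[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Double.parseDouble(values[i].trim());
        }
        return out;
    }

    /** PDF-style matrix [a b c d e f]: x' = a*x + c*y + e, y' = b*x + d*y + f. */
    static double[] apply(double[] m, double x, double y) {
        return new double[] { m[0] * x + m[2] * y + m[4], m[1] * x + m[3] * y + m[5] };
    }

    /** Returns {minX, minY, maxX, maxY} of the transformed neatline corners. */
    static double[] extent(String[] ctm, String[] neatline) {
        double[] m = parseAll(ctm);
        double[] pts = parseAll(neatline);
        if (m.length != 6 || pts.length < 2 || pts.length % 2 != 0) {
            throw new IllegalArgumentException("need 6 CTM values and x/y pairs");
        }
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < pts.length; i += 2) {
            double[] p = apply(m, pts[i], pts[i + 1]);
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        return new double[] { minX, minY, maxX, maxY };
    }

    public static void main(String[] args) {
        // Values copied from the fragment in this thread.
        String[] ctm = { "41.6666666667", "0", "0", "41.6666666667", "192752", "3768671" };
        String[] neatline = { "0", "0", "0", "122.88", "122.88", "122.88", "122.88", "0" };
        System.out.println(Arrays.toString(extent(ctm, neatline)));
    }
}
```

If that assumption holds, the neatline here comes out roughly 5120 units
on a side (122.88 x 41.6666666667), starting at 192752, 3768671.  With
iText you would fill the String arrays from getAsString(i) on each
PdfArray.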


Many name objects are defined as public static final in PdfName...
They're used most often as keys in dictionaries, but show up in other
places as well.  Look there first so you don't end up creating
unnecessary temporaries.

The getAs* methods in PdfArray and PdfDictionary are convenience methods
I added.  They will automatically look up indirect references, and
return null if the key/index is missing, or is of a different type.
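
To make that behaviour concrete, here is a toy model in plain Java.  It
is NOT iText's code, just a sketch of the idea: follow references until
you hit a direct object, then hand it back only if it has the type the
caller asked for.  Every name in it (GetAsSketch, Ref, resolve, getAs)
is made up.

```java
import java.util.HashMap;
import java.util.Map;

class GetAsSketch {
    /** Stand-in for an indirect reference such as "11 0 R" (generation ignored). */
    static final class Ref {
        final int number;
        Ref(int number) { this.number = number; }
    }

    /** Object table: object number -> object, like a PDF's cross-reference table. */
    final Map<Integer, Object> objects = new HashMap<>();

    /** Follows references until a direct object (or nothing) is found. */
    Object resolve(Object value) {
        int hops = 0;
        while (value instanceof Ref && hops++ < 32) {   // cap guards against reference loops
            value = objects.get(((Ref) value).number);
        }
        return value instanceof Ref ? null : value;
    }

    /** Returns the resolved value if it is a T; null if missing or of another type. */
    <T> T getAs(Map<String, Object> dict, String key, Class<T> type) {
        Object direct = resolve(dict.get(key));
        return type.isInstance(direct) ? type.cast(direct) : null;
    }

    public static void main(String[] args) {
        GetAsSketch pdf = new GetAsSketch();
        pdf.objects.put(11, "page content");
        Map<String, Object> page = new HashMap<>();
        page.put("Contents", new Ref(11));
        page.put("Rotate", 90);
        System.out.println(pdf.getAs(page, "Contents", String.class)); // resolved through the Ref
        System.out.println(pdf.getAs(page, "Rotate", String.class));   // wrong type
        System.out.println(pdf.getAs(page, "Missing", String.class));  // no such key
    }
}
```

The upshot for calling code is one null check per lookup instead of an
instanceof, a cast, and a manual dereference.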

Indirect references?  Yep, there are several in the PDF fragment you
posted.  For example: "/Contents 11 0 R".  That means that the value of
/Contents is object number 11, generation 0.  Generations are almost
universally zero... Don't worry about them.  In fact, I don't recall
ever seeing anything that wasn't generation 0 in all my years as a PDF
programmer (I started in January of 97, feeling old...).  
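
If it helps to see the shape of a reference token, this little sketch
(plain Java, hypothetical names, nothing to do with how iText parses)
pulls the object number and generation out of one:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IndirectRefSketch {
    // <object number> <generation> R
    private static final Pattern REF = Pattern.compile("(\\d+)\\s+(\\d+)\\s+R");

    /** Returns {objectNumber, generation}, or null if the token is not a reference. */
    static int[] parse(String token) {
        Matcher m = REF.matcher(token.trim());
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] ref = parse("11 0 R");
        System.out.println("object " + ref[0] + ", generation " + ref[1]);
    }
}
```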

Some objects are required to be indirect references (page dictionaries,
all PdfStreams, font dictionaries, etc etc).  Some objects are required
to be direct in the PDF Specification.  Many are left unspecified and
are legal either way.

Looking at the object structure of a PDF in a text editor is a great way
to figure out how things are laid out.  Another great way is with a PDF
object browser.  iText RUPS, for example.  There's also a commercial
Acrobat plugin called "PDF Can Opener" that has lots of bells and
whistles.  They create tree views of the object structure, and will look
up indirect references for you.  PDF Can Opener will let you look up
various object types in both directions... PDF dict to the page, or the
page to its dictionary.  Pretty slick.

PS: The PDF Specification is an ISO spec that is available free on
Adobe's site:
...
And they moved it.  How nice of them.

Leonard?  Do you have a new link, or is it Gone?

--Mark Storer
  Senior Software Engineer
  Cardiff.com
 
import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;
 

