Thank you, Adam, for the explanation, and thanks, Kevin, for the code. As Adam says, it's a high level of precision ... but Kevin did nice work here.

Best regards,
Hesham

---------------------------------------------
Included message :

> One thing I haven't been able to get is the actual name of the named
> destinations. You would think there would be something like getName()
> setName() methods. I'll keep poking around.
>
>
> On Fri, Jun 18, 2010 at 1:23 PM, Kevin Brown <[email protected]> wrote:
>
>> This ain't pretty. It's just a proof of concept written by a Java hacking
>> fool. I'm going to refactor if I get approval. :)
>>
>> Basically you need to start with this (document is a PDDocument):
>>
>>
>> PDDocumentCatalog log = document.getDocumentCatalog();
>>
>> PDDocumentNameDictionary dic = log.getNames();
>> PDDestinationNameTreeNode dests = dic.getDests();
>>
>> Okay, so the destinations will be available via the getNames() method of
>> dests, if they exist at that level. If they do not, you need to bore down by
>> using the getKids() method, and check those items for names. If those don't
>> work, bore down, etc. Here I print out all the named destinations, and
>> change the zoom property, for a particular type of PDF; you may need to
>> land at a different level of PDDestinationNameTreeNode.
>>
>> The PDPageXYZDestination class is the thing to keep an eye on.
>>
>> private static boolean hasDests(PDDestinationNameTreeNode dests)
>>         throws IOException {
>>     if (dests.getNames() == null) {
>>         // No names at this level, so look one level down in the kids.
>>         List kids = dests.getKids();
>>         List<PDNameTreeNode> destList = kids;
>>         System.out.println("No. of kids: " + destList.size());
>>         boolean val = false;
>>         for (PDNameTreeNode dest2 : destList) {
>>             System.out.println("in for");
>>             Map<?, ?> destList2 = dest2.getNames();
>>             // System.out.println("getnames... " + dest2.getNames());
>>
>>             if (destList2 != null) {
>>                 System.out.println("we have some dests now...");
>>
>>                 Collection<PDPageXYZDestination> values =
>>                         (Collection<PDPageXYZDestination>) destList2.values();
>>                 Collection<PDPageXYZDestination> c = values;
>>                 System.out.println(c.size());
>>                 for (PDPageXYZDestination destNode : c) {
>>                     System.out.println(destNode.toString());
>>
>>                     System.out.println("tostring: " + destNode.toString());
>>                     System.out.println("getleft: " + destNode.getLeft());
>>                     System.out.println("getpagenumber: " + destNode.getPageNumber());
>>                     System.out.println("gettop: " + destNode.getTop());
>>                     System.out.println("getzoom: " + destNode.getZoom());
>>                     System.out.println("getpage: " + destNode.getPage());
>>                     System.out.println("getcosarray: " + destNode.getCOSArray());
>>                     System.out.println("find page number: " + destNode.findPageNumber());
>>                     System.out.println("getcosobject: " + destNode.getCOSObject());
>>
>>                     System.out.println("********************************************");
>>                     // Change the zoom property of this destination.
>>                     destNode.setZoom(-1);
>>                     val = true;
>>                 }
>>
>>                 System.out.println("next in for...");
>>             } else {
>>                 System.out.println("no names!");
>>             }
>>         }
>>         System.out.println("returning ... " + val);
>>         return val;
>>     } else {
>>         System.out.println("returning false...");
>>         return false;
>>     }
>> }
>>
>>
>> On Fri, Jun 18, 2010 at 12:09 PM, <[email protected]> wrote:
>>
>>> Hesham,
>>> A named destination is a specific type of node in the document
>>> outline (aka a specific type of bookmark). A "normal" bookmark will point
>>> to a page (via the page's object ID and revision) while a named
>>> destination will point to some name. Then that needs to be resolved
>>> (somehow?) to the specific object that it points to. The object could be
>>> a page, an image, a paragraph, word, or anything else you'll find in a
>>> PDF. The main difference is that you can jump to a specific point in a page
>>> (e.g. page 3 halfway down where a specific paragraph begins) instead of
>>> just pointing to the page. I've never needed this level of precision and
>>> references to pages are simpler and more common (and slightly more
>>> efficient since it points directly to a page instead of pointing to a
>>> pointer to an object), so I have not yet used named destinations. I hope
>>> this helps explain the differences and why one would be chosen over the
>>> other.
>>>
>>> Kevin,
>>> I too am interested in how you did this, as I expect I'll have to cross
>>> this bridge at some point.
>>>
>>> ----
>>> Thanks,
>>> Adam
>>>
>>>
>>> From: "Hesham G." <[email protected]>
>>> To: <[email protected]>
>>> Date: 06/17/2010 20:14
>>> Subject: Re: getting page numbers of named destinations
>>>
>>>
>>> Kevin,
>>>
>>> I have been following this thread, and I still don't understand
>>> the difference between "Named destinations" and "bookmarks".
>>> I also hope you could share with us the code you used to get them.
>>>
>>>
>>> Best regards,
>>> Hesham
>>>
>>>
>>> ---------------------------------------------
>>> Included message :
>>>
>>> > Thank you SO MUCH, Adam. With your advice, and some tinkering with
>>> > PDFBox, I was able to get at and manipulate the named destinations.
>>> >
>>> > On Tue, Jun 1, 2010 at 3:23 PM, <[email protected]> wrote:
>>> >
>>> >> Kevin,
>>> >>
>>> >> Section 7.7.2 of the PDF Spec (I'm referencing version 1.7) goes over the
>>> >> Document catalog and table 20 points you to section 12.3.2.3 for "Named
>>> >> Destinations". Section 12.3.2.3 explains that "the correspondence
>>> >> between name objects and destinations shall be defined by the Dests entry
>>> >> in the document catalogue (see 7.7.2, "Document Catalog")." So, per the
>>> >> spec, the answer lies in the document catalog.
>>> >>
>>> >> Table 28 defines what entries are allowed in the catalog dictionary. The
>>> >> "Document Outline" (i.e. key: "Outlines"), which are also known as
>>> >> bookmarks, are what you are looping through in your code. So that's why
>>> >> you're getting bookmarks and not named destinations. You don't want
>>> >> document.getDocumentCatalog().getDocumentOutline() but you do want
>>> >> document.getDocumentCatalog().
>>> >>
>>> >> Like I said before, I haven't actually dealt with named destinations, nor
>>> >> have I even seen a document which uses them, so I don't know how the
>>> >> "Dests" key works. However, it should be pretty easy to figure out if you
>>> >> take a look at a PDF in vi, Notepad++, or any other quality editor. Once
>>> >> you know what you're looking for, it's just a matter of looking at things
>>> >> in PDDocumentCatalog to find it.
>>> >>
>>> >> ----
>>> >> Thanks,
>>> >> Adam
>>> >>
>>> >>
>>> >> From: Kevin Brown <[email protected]>
>>> >> To: [email protected]
>>> >> Date: 05/25/2010 11:11
>>> >> Subject: Re: getting page numbers of named destinations
>>> >>
>>> >>
>>> >> Thanks much! I'm trying this but it seems to pull in bookmarks, not named
>>> >> destinations. Am I missing something?
>>> >>
>>> >>
>>> >> On Sat, May 22, 2010 at 11:15 AM, Andreas Lehmkuehler
>>> >> <[email protected]> wrote:
>>> >>
>>> >> > Hi
>>> >> >
>>> >> > Kevin Brown wrote:
>>> >> >
>>> >> >> I can't seem to get this done with pdfbox. There doesn't seem to be a
>>> >> >> way to get the page number from the context of the named destination.
>>> >> >> Am I wrong? Anyone got any sample code for working with named
>>> >> >> destinations?
>>> >> >>
>>> >> > If you have a look at the mentioned example you find some code like
>>> >> > this:
>>> >> >
>>> >> > PDDocumentOutline bookmarks =
>>> >> >         document.getDocumentCatalog().getDocumentOutline();
>>> >> > PDOutlineItem item = bookmarks.getFirstChild().getNextSibling();
>>> >> >
>>> >> > And the PDOutlineItem class provides a method to get the corresponding
>>> >> > page:
>>> >> >
>>> >> > /**
>>> >> >  * This method will attempt to find the page in this PDF document that
>>> >> >  * this outline points to.
>>> >> >  * If the outline does not point to anything then this method will
>>> >> >  * return null. If the outline
>>> >> >  * is an action that is not a GoTo action then this methods will throw
>>> >> >  * the OutlineNotLocationException
>>> >> >  *
>>> >> >  * @param doc The document to get the page from.
>>> >> >  *
>>> >> >  * @return The page that this outline will go to when activated or null
>>> >> >  * if it does not point to anything.
>>> >> >  * @throws IOException If there is an error when trying to find the
>>> >> >  * page.
>>> >> >  */
>>> >> > public PDPage findDestinationPage( PDDocument doc ) throws IOException
>>> >> >
>>> >> > I didn't test it, but theoretically it looks like the piece of code you
>>> >> > are looking for.
>>> >> >
>>> >> > BR
>>> >> > Andreas Lehmkühler
>>> >> >
>>> >> >
>>> >> >> On Wed, May 19, 2010 at 12:45 PM, Kevin Brown <[email protected]> wrote:
>>> >> >>
>>> >> >>> Thanks. I do need to do that!
>>> >> >>>
>>> >> >>> At the moment I'm trying to see if the
>>> >> >>> GotoSecondBookmarkOnOpen.java sample has any clues... if PDOutlineItem
>>> >> >>> could refer to a named destination then I may be in business!
>>> >> >>>
>>> >> >>>
>>> >> >>> On Wed, May 19, 2010 at 11:57 AM, <[email protected]> wrote:
>>> >> >>>
>>> >> >>>> I haven't dealt with named destinations, but if you get the object
>>> >> >>>> ID of the page, you can look up the page number with
>>> >> >>>> doc.getPageMap(). If you haven't already, I'd suggest tracing
>>> >> >>>> through a PDF with a hex editor (or any good quality text editor
>>> >> >>>> will work fine) to find out how everything is connected.
>>> >> >>>>
>>> >> >>>> ----
>>> >> >>>> Thanks,
>>> >> >>>> Adam
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> From: Kevin Brown <[email protected]>
>>> >> >>>> To: [email protected]
>>> >> >>>> Date: 05/19/2010 08:42
>>> >> >>>> Subject: getting page numbers of named destinations
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> Is it possible to, for a PDF, get the named destinations in it, and
>>> >> >>>> find out what page each is on? It doesn't look like it from my
>>> >> >>>> perusal of the documentation, but I'm not sure. Seems like you can
>>> >> >>>> get the destination names but that's about it.
>>> >> >>>>
>>> >> >>>>
>>> >> >>>> ? Click here to submit conditions
>>> >> >>>>
>>> >> >>>> This email and any content within or attached hereto from Sun West
>>> >> >>>> Mortgage Company, Inc. is confidential and/or legally privileged.
>>> >> >>>> The information is intended only for the use of the individual or
>>> >> >>>> entity named on this email. If you are not the intended recipient,
>>> >> >>>> you are hereby notified that any disclosure, copying, distribution
>>> >> >>>> or the taking of any action in reliance on the contents of this
>>> >> >>>> email information is strictly prohibited, and that the documents
>>> >> >>>> should be returned to this office immediately by email. Receipt by
>>> >> >>>> anyone other than the intended recipient is not a waiver of any
>>> >> >>>> privilege. Please do not include your social security number,
>>> >> >>>> account number, or any other personal or financial information in
>>> >> >>>> the content of the email. Should you have any questions, please
>>> >> >>>> call (800) 453 7884.
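
For anyone following along, the "bore down with getKids()" idea Kevin describes can be written as a recursion that works at any depth, not just one level down. The sketch below is only an illustration: NameTreeSketch and Node are hypothetical stand-in types, not PDFBox classes, and the page index stored as the value is an assumption made to keep the example small. It also relies on the fact that a PDF name tree stores key/value pairs, so if the map returned by getNames() is keyed the same way, the destination names Kevin is looking for may simply be the map keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a PDF name tree; this is NOT the PDFBox API.
class NameTreeSketch {

    /** A node carries names (leaf), kids (intermediate), or possibly neither. */
    static final class Node {
        final Map<String, Integer> names; // destination name -> page index (assumed); may be null
        final List<Node> kids;            // child nodes; may be null

        Node(Map<String, Integer> names, List<Node> kids) {
            this.names = names;
            this.kids = kids;
        }
    }

    /** Collects every name -> page entry, descending through kids at any depth. */
    static Map<String, Integer> collect(Node node) {
        Map<String, Integer> out = new LinkedHashMap<>();
        collectInto(node, out);
        return out;
    }

    private static void collectInto(Node node, Map<String, Integer> out) {
        if (node == null) {
            return;
        }
        if (node.names != null) {
            out.putAll(node.names); // names found at this level
        }
        if (node.kids != null) {
            for (Node kid : node.kids) {
                collectInto(kid, out); // bore down one more level
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("chapter1", 0);
        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("chapter2", 4);

        List<Node> leaves = new ArrayList<>();
        leaves.add(new Node(first, null));
        leaves.add(new Node(second, null));

        // The names sit two levels below the root here.
        List<Node> top = new ArrayList<>();
        top.add(new Node(null, leaves));
        Node root = new Node(null, top);

        for (Map.Entry<String, Integer> e : collect(root).entrySet()) {
            System.out.println(e.getKey() + " -> page index " + e.getValue());
        }
    }
}
```

With real PDFBox objects the same shape should apply, with getNames() and getKids() in place of the two fields, but I have not run that against a PDF.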

