I wrote this code for you to do this task in a different way:

        PDDocument doc = PDDocument.load("c:/temp/2013_DA_Schmitz.pdf");
        PDPage p = doc.getDocumentCatalog().getDocumentOutline().getFirstChild().findDestinationPage(doc);
        List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
        for (int i=0; i<pages.size(); i++)
            if (pages.get(i).equals(p))
                System.out.println(i);
        doc.close();

This looks up the page number of the first bookmark in the file. For your
sample it prints 14 (remember it's 0-based).
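
Since what you really need is the page count per chapter, here is a rough sketch of how the start pages could be turned into counts. It assumes you have already collected the 0-based start page of every chapter bookmark in ascending order (for example with the loop above) and that a chapter runs until the next one starts. The class and method names are made up for illustration and are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class ChapterPageCounts {

    /**
     * @param startPages 0-based start page of each chapter, in ascending order
     * @param totalPages number of pages in the document
     * @return the number of pages of each chapter, in the same order
     */
    static List<Integer> pageCounts(List<Integer> startPages, int totalPages) {
        List<Integer> counts = new ArrayList<Integer>();
        for (int i = 0; i < startPages.size(); i++) {
            int start = startPages.get(i);
            // A chapter ends where the next one starts; the last one runs to the end.
            int end = (i + 1 < startPages.size()) ? startPages.get(i + 1) : totalPages;
            counts.add(end - start);
        }
        return counts;
    }
}
```

For start pages 14, 20 and 35 in a 40-page document this gives 6, 15 and 5. Note that if a chapter starts in the middle of a page, that page is counted only for the later chapter.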

Gilad


On Sat, Nov 9, 2013 at 12:51 PM, Sera <[email protected]> wrote:

> My main goal is to extract the chapter names, the page count of each chapter,
> and a way to see whether something new was written in a chapter.
> Further, I want to extract the bullet points inside the PDF, but that's not
> so relevant. I've already got the chapter names out of PDFBox, so that works.
> For the "see if something's new" part, I wanted to count the characters.
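
The character-count idea could look roughly like the sketch below. It assumes you have stored, for an old and a new version of the PDF, a map from chapter name to the number of characters extracted for that chapter. How you fill those maps is up to you, and the class and method names are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class ChapterChanges {

    /** Returns the chapters that are new or whose character count differs. */
    static List<String> changedChapters(Map<String, Integer> oldCounts,
                                        Map<String, Integer> newCounts) {
        List<String> changed = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : newCounts.entrySet()) {
            Integer before = oldCounts.get(e.getKey());
            // A missing old entry means the chapter did not exist before.
            if (before == null || !before.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

Keep in mind that this only notices edits that change the length of a chapter. An edit that replaces text with other text of the same length would go undetected.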
>
>
> On 09.11.2013, 10:53, Gilad Denneboom <[email protected]> wrote:
>
>
>> Is writing the code a part of your thesis, or extracting the "chapters"? If
>> the latter, have you considered doing it with JavaScript in Acrobat instead
>> of using Java?
>>
>>
>> On Sat, Nov 9, 2013 at 10:00 AM, Sera <[email protected]> wrote:
>>
>>  https://www2.swc.rwth-aachen.de/docs/2013_DA_Schmitz.pdf
>>>
>>> This would be a sample. It was made with LaTeX and consists of more than
>>> one .tex file.
>>>
>>> Hope it can help. It's for my bachelor thesis and otherwise I'm lost :(
>>>
>>> BR
>>> Sera
>>>
>>>
>>> On 09.11.2013, 09:32, Maruan Sahyoun <[email protected]> wrote:
>>>
>>>
>>>> Hi Sera,
>>>>
>>>> if the bookmarks do not relate to pages, they cannot be taken as a hint
>>>> for splitting.
>>>>
>>>> Is it possible to upload a sample PDF to a public location so we can take
>>>> a look at it? It might give us another idea for how to handle your
>>>> requirement.
>>>>
>>>> BR
>>>> Maruan
>>>>
>>>> On 08.11.2013 at 17:24, Sera <[email protected]> wrote:
>>>>
>>>>> Well then. I've got another idea.
>>>>>
>>>>> Actually, I don't need the exact page number, but the page count of each
>>>>> chapter.
>>>>> Is it still possible to divide the PDF by its bookmarks, or wouldn't
>>>>> that work either?
>>>>> Once I've divided them, I can just call doc.getNumberOfPages(). That
>>>>> works here.
>>>>>
>>>>> On 08.11.2013, 17:19, Gilad Denneboom <[email protected]> wrote:
>>>>>
>>>>>> Yes, that could very well be the cause...
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 8, 2013 at 4:51 PM, Sera <[email protected]> wrote:
>>>>>>
>>>>>>> Could it be a problem with LaTeX?
>>>>>>> I'm using it to generate the PDF.
>>>>>>>
>>>>>>> On 08.11.2013, 16:40, Sera <[email protected]> wrote:
>>>>>>>
>>>>>>>> First, thanks for the code!
>>>>>>>>
>>>>>>>> Unfortunately, I still get a NullPointerException.
>>>>>>>> dests.getNames() is null.
>>>>>>>>
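
One possible explanation, offered as a guess rather than a diagnosis of this particular file: in a PDF name tree only the leaf nodes carry the name/value pairs, while a node that has children carries a Kids array instead. If the root of the destination tree has kids, asking the root alone for its names yields null, and the lookup has to descend into the children. The sketch below shows that recursion on a stand-in class. NameNode and its fields are hypothetical and only model the structure; in PDFBox the corresponding accessors on the tree node should be getNames() and getKids(), but please check that against the version you are using.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a PDF name tree node, not a PDFBox class.
final class NameNode {
    final Map<String, Object> names; // null when this node only has kids
    final List<NameNode> kids;       // null when this node is a leaf

    private NameNode(Map<String, Object> names, List<NameNode> kids) {
        this.names = names;
        this.kids = kids;
    }

    static NameNode leaf(Map<String, Object> names) {
        return new NameNode(names, null);
    }

    static NameNode branch(List<NameNode> kids) {
        return new NameNode(null, kids);
    }

    /** Looks for the name in this node first, then recursively in its kids. */
    static Object find(NameNode node, String name) {
        if (node == null) {
            return null;
        }
        if (node.names != null && node.names.containsKey(name)) {
            return node.names.get(name);
        }
        if (node.kids != null) {
            for (NameNode kid : node.kids) {
                Object value = find(kid, name);
                if (value != null) {
                    return value;
                }
            }
        }
        return null;
    }
}
```

The value found this way would then be handed to something like getPageDestPageNumber() from the quoted code to get the page index.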
>>>>>>>> On 04.11.2013, 13:38, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I wrote the following code to do it:
>>>>>>>>
>>>>>>>>
>>>>>>>>>    public static int getPageNumberFromNamedDestination(PDDocument doc, String name) throws IOException {
>>>>>>>>>        PDDestinationNameTreeNode dests = doc.getDocumentCatalog().getNames().getDests();
>>>>>>>>>        if (dests==null || dests.getNames()==null)
>>>>>>>>>            return -1;
>>>>>>>>>        Object d = dests.getNames().get(name);
>>>>>>>>>        if (d==null)
>>>>>>>>>            return -1;
>>>>>>>>>        return getPageDestPageNumber(d);
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>>    public static int getPageDestPageNumber(Object dest) {
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitDestination) {
>>>>>>>>>            PDPageFitDestination pageFitDestination = (PDPageFitDestination) dest;
>>>>>>>>>            return pageFitDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageXYZDestination) {
>>>>>>>>>            PDPageXYZDestination pageXYZDestination = (PDPageXYZDestination) dest;
>>>>>>>>>            return pageXYZDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitWidthDestination) {
>>>>>>>>>            PDPageFitWidthDestination fitWidthDestination = (PDPageFitWidthDestination) dest;
>>>>>>>>>            return fitWidthDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitHeightDestination) {
>>>>>>>>>            PDPageFitHeightDestination fitHeightDestination = (PDPageFitHeightDestination) dest;
>>>>>>>>>            return fitHeightDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        if (dest instanceof PDPageFitRectangleDestination) {
>>>>>>>>>            PDPageFitRectangleDestination pageFitRectangleDestination = (PDPageFitRectangleDestination) dest;
>>>>>>>>>            return pageFitRectangleDestination.findPageNumber();
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        return -1;
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Nov 3, 2013 at 1:39 PM, Sera <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I've debugged it and it throws an exception.
>>>>>>>>>>
>>>>>>>>>> PDDestinationNameTreeNode node = (PDDestinationNameTreeNode)
>>>>>>>>>> document.getDocumentCatalog().getStructureTreeRoot().getIDTree();
>>>>>>>>>>
>>>>>>>>>> Any idea what the correct way is?
>>>>>>>>>>
>>>>>>>>>> On 01.11.2013, 23:47, Sera <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Is this the right way to get to the tree node?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 31.10.2013, 11:28, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If the destination is a PDNamedDestination object, you have to cast
>>>>>>>>>>>> it to that class...
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Oct 31, 2013 at 11:24 AM, Sera <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Do I have to cast the action to a type other than PDActionGoTo? I don't
>>>>>>>>>>>>> see a function getNamedDestination() in the suggestions for my objects.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 31.10.2013, 10:45, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah, so your bookmarks are not pointing to page locations directly, but
>>>>>>>>>>>>>> to Named Destinations. This makes things more complex. You can use
>>>>>>>>>>>>>> getNamedDestination() to get the name of the ND the bookmark is pointing
>>>>>>>>>>>>>> to. Of course, you then still need to write a function that looks up that
>>>>>>>>>>>>>> specific ND in the tree (a PDDestinationNameTreeNode object) and then
>>>>>>>>>>>>>> figures out from its value which page it points to.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:35 AM, Sera <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When I call toString() on it, I get:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDNamedDestination@505484dc
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> where the part after the @ is always different. I think it's the
>>>>>>>>>>>>>>> hashed destination?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 31.10.2013, 10:20, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do you mean by "hashcode", exactly?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Oct 31, 2013 at 10:16 AM, Sera <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> OK, now I've got the destination as a hashcode. How do I get the
>>>>>>>>>>>>>>>>> page number from this?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 30.10.2013, 20:10, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Like I said, you need to determine (using instanceof, for example) which
>>>>>>>>>>>>>>>>>> actual class it is, one of the subclasses of PDAction, like PDActionGoTo
>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 7:51 PM, Sera <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> current.getAction() is just a PDAction. From there I don't have access
>>>>>>>>>>>>>>>>>>> to getDestination().
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 30.10.2013, 16:27, Gilad Denneboom <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You should get the Action of the bookmark, and then check which type of
>>>>>>>>>>>>>>>>>>>> action it is (probably PDActionGoTo), and from the Action you'll have
>>>>>>>>>>>>>>>>>>>> access to the Destination.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Oct 30, 2013 at 4:00 PM, Sera <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I need to extract the page number out of the bookmarks and tried it
>>>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> PDOutlineItem current = bookmark.getFirstChild();
>>>>>>>>>>>>>>>>>>>>> PDDestination destination = null;
>>>>>>>>>>>>>>>>>>>>> destination = current.getDestination();
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> But the destination stays null. Any ideas on how to fix this?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> Sera
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Created with Opera's e-mail module: http://www.opera.com/mail/
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>