[ 
https://issues.apache.org/jira/browse/PDFBOX-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318309#comment-14318309
 ] 

Andrea Vacondio commented on PDFBOX-2677:
-----------------------------------------

First sorry for the reformat, I configured Eclipse to automatically reformat 
with the provided formatting whenever I save... I assumed all the code was 
already formatted and didn't pay attention so the patch I posted now has a lot 
of noise in it.
I have a lot of unit tests but they weren't included, I forgot to add them.

I think we can start with PDFBOX-156. Currently PDOutlineNode is a 
COSObjectable and has a COSDictionary so it basically wraps a COSDictionary 
adding some logic so PDOutlineNode is actually a PDDictionaryWrapper correct?
PDDictionaryWrapper already implements equals and hashCode where two 
PDDictionaryWrapper are equals if they wrap the same dictionary which I think 
makes sense for the outline nodes... it also makes even more sense if you think 
that getFirstChild and getLastChild return a new PDOutlineItem instance every 
time you call them, leading to the weird case described in PDFBOX-156 (because 
the rely on Object.equals and a new instance is created every time)
{code}
PDOutlineItem a = outline.getFirstChild();
PDOutlineItem b = outline.getFirstChild();
assertTrue(a.equals(b)); //this fails 
{code}

About changes to the PDDictionaryWrapper no, they are not necessary but I 
thought they made sense since they are harmless. PDDictionaryWrapper wraps a 
COSDictionary and the getCOSObject will always return a COSDictionary which is 
a covariant return type for COSBase so this:
{code}
    @Override
    public COSBase getCOSObject()
    {
        return this.dictionary;
    }

    protected COSDictionary getCOSDictionary()
    {
        return this.dictionary;
    }
{code}
Can become this:
{code}
    @Override
    public COSDictionary getCOSObject()
    {
        return this.dictionary;
    }
{code}
I thought it might be some leftover from the jdk 1.4 era where covariant 
weren't there but it seems this class was introduced when PDFBox was already 
jdk 1.5... anyway, if you get rid of the getCOSDictionary which (if I'm not 
overlooking anything) is useless, you need to adjust all the places where it 
was called and replace it with getCOSObject which for the PDDictionaryWrapper 
now returns COSDictionary. I think this is the modification that got a little 
intrusive and touched things around. 
It's not necessary for the issue itself, it just cought my eye and I thought 
now is the time to make this kind of changes, I think it's safe but it does 
break the API so it's probably now or the next major... or never :)

I'm attaching a new patch where unit tests are included and PDDictionaryWrapper 
is not modified, this should make it way more focused on the outline package, 
there's no reformatting noise and unit tests should help to clarify the patch 
behaviour. 
PDDictionaryWrapper can be modified later if you think my argument made sense.

> Negative Outlines COUNT and various issues
> ------------------------------------------
>
>                 Key: PDFBOX-2677
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2677
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.0
>         Environment: 2.0.0-SNAPSHOT r1658566
>            Reporter: Andrea Vacondio
>              Labels: outline
>             Fix For: 2.0.0
>
>         Attachments: outline_closed_negative_outlines_count_wrong_last.pdf, 
> outline_closed_patched.pdf, outline_patch.diff, outline_second_patch.diff, 
> outline_wrong_first_has_prev.pdf
>
>
> Hi,
> I started playing with the outline package few days ago mostly writing unit 
> tests but I found the current implementation has a number of issues and 
> started fixing them. I ended up with a quite substantial patch I'm going to 
> attach.
> So far I only addressed the addition of nodes.
> Current implementation suffers of some issues:
> * insertSiblingAfter doesn't set the LAST on its parent in case it's the last 
> node
> * Outlines COUNT can be negative (and often is) despite the spec table 152 
> states "The value cannot be negative."
> * invalid outline hierarchy with wrong COUNT can be created with no warning 
> with code like:
>       {code:java}
>       PDDocumentOutline outline = new PDDocumentOutline();
>         PDOutlineItem root = new PDOutlineItem();
>         root.setTitle("Root");
>         PDOutlineItem child1 = new PDOutlineItem();
>         child1.setTitle("child1");
>         PDOutlineItem child11 = new PDOutlineItem();
>         child11.setTitle("child11");
>         PDOutlineItem child12 = new PDOutlineItem();
>         child12.setTitle("child12");
>         PDOutlineItem child13 = new PDOutlineItem();
>         child13.setTitle("child13");
>         child11.insertSiblingAfter(child12);
>         child12.insertSiblingAfter(child13);
>         child1.appendChild(child12);
>         root.appendChild(child1);
>         outline.appendChild(root);
>       {code}
> I'm going to attach a couple of generated pdf showing the mentioned issues.
> The patch addresses:
> * PDFBOX-156
> * PDFBOX-996
> * PDFBOX-1209
> * Outlines COUNT is always positive  (table 152 "Total number of visible 
> outline items at all levels of the outline")
> * Outline items COUNT is correctly calculated
> * NEXT, PREV, LAST and FIRST are are consistently set
> * unit tested
> I added a restriction to methods adding nodes (children or siblings) that 
> throws an exception in case the node is already part of a list (i.e it has a 
> prev or a next) so basically with the patch you cannot create a list of nodes 
> and then add one of them as last child of another node but you'll have to 
> first add the first child and then add siblings to the child. 
> Those scenarios are currently silently accepted creating invalid outlines 
> (see the attached outline_wrong_first_has_prev.pdf) and would require quite 
> some work to handle all of them so, I'm not sure it's the best solution but 
> at least the user is guided with a message instead of creating an outline 
> where the FIRST item has a PREV.
> It breaks the current API and it might be reworked to be less intrusive but 
> since you are on 2.0.0-SNAPSHOT I thought it was ok.
> Please let me know if there's anything I can help with.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to