[ 
https://issues.apache.org/jira/browse/PDFBOX-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-2894:
--------------------------------
    Description: 
This ties in with my COSStream simplification in PDFBOX-2893.

COSStreamArray is a troublesome abstraction: it's not a real COS object and 
it's the only COS object which can be generated _after_ parsing. Look at the 
implementation of COSStreamArray: most methods throw an exception because it's 
_not_ a COSStream - it violates the contract of the very thing it claims to be. 
Even PDPageContentStream has to use instanceof to "peer through" the 
abstraction of COSStreamArray.

There's no reason to have this class, other than to duct-tape over flaws in 1.8's 
APIs, namely that PDPage#getStream() returns a PDStream and PDFStreamParser 
expects a PDStream, yet both of these may be arrays of streams.

We can fix this in 2.0 by getting rid of the erroneous PDPage#getStream() and 
by exposing the array of streams, rather than attempting to hide them. -This 
will also fix existing errors throughout the codebase which are associated with 
mistaking COSStreamArray for a COSStream.- We can still provide an InputStream 
API which abstracts over the array of streams, because there's nothing wrong 
with that - so users can have the same simple and convenient experience.
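
As a rough sketch of that last point: an InputStream over an array of content 
streams can be built from the JDK's SequenceInputStream alone. This is not 
PDFBox code. ContentStreamsSketch, concat and readAll are hypothetical names, 
and the parts are assumed to be already-decoded streams. The newline written 
after each part is a choice made for this sketch, so that tokens on either 
side of a stream boundary cannot run together.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class ContentStreamsSketch {

    // Presents several (already decoded) content streams as one InputStream,
    // in order, with a newline after each part (an assumption of this sketch).
    static InputStream concat(List<InputStream> parts) {
        List<InputStream> sequence = new ArrayList<>();
        for (InputStream part : parts) {
            sequence.add(part);
            sequence.add(new ByteArrayInputStream(new byte[] { '\n' }));
        }
        return new SequenceInputStream(Collections.enumeration(sequence));
    }

    // Drains a stream into a String; only here to make the sketch easy to try.
    static String readAll(InputStream in) {
        try (InputStream source = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would hand concat the page's streams in array order and read the 
result like any single stream. Nothing is buffered up front, each part is 
only read when the previous one is exhausted.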

An added benefit of doing this is that it will allow us to remove 
SequenceRandomAccessRead, a highly complex memory-holding class.

  was:
This ties in with my COSStream simplification in PDFBOX-2893.

COSStreamArray is a troublesome abstraction: it's not a real COS object and 
it's the only COS object which can be generated _after_ parsing. Look at the 
implementation of COSStreamArray: most methods throw an exception because it's 
_not_ a COSStream - it violates the contract of the very thing it claims to be. 
Even PDPageContentStream has to use instanceof to "peer through" the 
abstraction of COSStreamArray.

There's no reason to have this class, other than to duct-tape over flaws in 1.8's 
APIs, namely that PDPage#getStream() returns a PDStream and PDFStreamParser 
expects a PDStream, yet both of these may be arrays of streams.

We can fix this in 2.0 by getting rid of the erroneous PDPage#getStream() and 
by exposing the array of streams, rather than attempting to hide them. This 
will also fix existing errors throughout the codebase which are associated with 
mistaking COSStreamArray for a COSStream. We can still provide an InputStream 
API which abstracts over the array of streams, because there's nothing wrong 
with that - so users can have the same simple and convenient experience.

An added benefit of doing this is that it will allow us to remove 
SequenceRandomAccessRead, a highly complex memory-holding class.


> Remove COSStreamArray / SequenceRandomAccessRead
> ------------------------------------------------
>
>                 Key: PDFBOX-2894
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2894
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>            Assignee: John Hewson
>             Fix For: 2.0.0
>
>


