John Hewson created PDFBOX-2205:
-----------------------------------
Summary: Operator Refactoring
Key: PDFBOX-2205
URL: https://issues.apache.org/jira/browse/PDFBOX-2205
Project: PDFBox
Issue Type: Improvement
Components: PDModel, Rendering
Affects Versions: 2.0.0
Reporter: John Hewson
Fix For: 2.0.0
I'm in the process of porting a fairly complex program which uses the 1.8 API
over to 2.0, as a way of finding out where the rough edges in 2.0 are. The app
which I'm porting hooks into many of the graphics operators and subclasses
PageDrawer to get access to the PDF's graphics state.
It turns out that this doesn't work very well, especially in 2.0 where more of
the PageDrawer's state is private and we have the additional complexity of
transparency groups.
The main issue is that the graphics operators are coupled to PageDrawer, but
I'm not interested in the AWT rendering; I just need a way to hook into the
graphics operations. Subclassing the operators has proven to be a poor
solution, as there are cases where calling super.process() doesn't provide
enough flexibility.
So here's my solution: in the same way that text processing was recently
factored out into PDFTextStreamEngine for end-users to subclass, I'd like to do
the same with graphics operations. Instead of the graphics operators being
coupled to PageDrawer, which is only one possible implementation of graphics
handling, we can move the methods which the operators call up into a new
subclass of PDFStreamEngine, let's call it PDFGraphicsStreamEngine. This class
can then be subclassed by anyone interested in hooking into the graphics
operations, including PageDrawer.
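To make the idea concrete, here is a rough, self-contained sketch of the shape
this could take. It does not use any PDFBox classes: GraphicsEngine stands in
for the proposed PDFGraphicsStreamEngine, and the Operator interface, the
callback names (moveTo, lineTo, strokePath) and PathLogger are all hypothetical
names picked for illustration, not a final API. The point is only that the
operators call abstract methods on the engine, so a subclass can observe
graphics operations without any AWT rendering.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GraphicsEngineSketch {

    // Hypothetical stand-in for an operator class: it reads its operands and
    // forwards to a callback on the engine rather than to PageDrawer.
    interface Operator {
        void process(GraphicsEngine engine, double[] operands);
    }

    // Hypothetical stand-in for the proposed PDFGraphicsStreamEngine.
    abstract static class GraphicsEngine {
        private final Map<String, Operator> operators = new HashMap<>();

        GraphicsEngine() {
            // "m", "l" and "S" are the PDF moveto, lineto and stroke operators.
            operators.put("m", (e, o) -> e.moveTo(o[0], o[1]));
            operators.put("l", (e, o) -> e.lineTo(o[0], o[1]));
            operators.put("S", (e, o) -> e.strokePath());
        }

        // Dispatches one operator by name; unknown operators are ignored.
        final void processOperator(String name, double... operands) {
            Operator op = operators.get(name);
            if (op != null) {
                op.process(this, operands);
            }
        }

        // Callbacks which subclasses implement (names are assumptions).
        protected abstract void moveTo(double x, double y);
        protected abstract void lineTo(double x, double y);
        protected abstract void strokePath();
    }

    // A subclass which records operations instead of rendering them.
    static class PathLogger extends GraphicsEngine {
        final List<String> events = new ArrayList<>();

        @Override
        protected void moveTo(double x, double y) {
            events.add("moveTo " + x + " " + y);
        }

        @Override
        protected void lineTo(double x, double y) {
            events.add("lineTo " + x + " " + y);
        }

        @Override
        protected void strokePath() {
            events.add("strokePath");
        }
    }

    public static void main(String[] args) {
        PathLogger logger = new PathLogger();
        logger.processOperator("m", 10, 20);
        logger.processOperator("l", 30, 40);
        logger.processOperator("S");
        logger.events.forEach(System.out::println);
    }
}
```

In this arrangement PageDrawer would be just another subclass, implementing
the same callbacks with AWT calls.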
With the new callbacks for text handling already in PDFTextStreamEngine and the
addition of new graphics callbacks in PDFGraphicsStreamEngine, end-users should
rarely need to override the operator classes to get access to the information
they need, which would be a huge benefit :)
This will involve a bunch of changes to operators, so I'll take the chance to
do some general cleaning up while I'm at it: the operator classes haven't
received much attention for a while. With more callbacks in PDFStreamEngine et
al, we're moving towards a point where the operator classes are becoming almost
an internal part of the PDFBox API: might be something to think about for the
future.