Hello Maruan
As we are currently reworking the documentation as well as planning
for PDFBox 2.0, I would like to understand why you think that the PDFBox
object hierarchy doesn't match the PDF structure more closely. Maybe
there is room for improvement.
Well, I think it is a representation problem.
I'm used to looking in the Adobe PDF Reference when I'm working with a
document. PDFBox helps me decompress the streams, or do the operator
processing (matrix products can be painful to do by hand), but when
I encounter a document which does not work as expected, I mainly work
with vim and the Adobe documentation.
I also work with printers, and it is easier to talk about the PDF
structure than about any API.
So I first picture a PDF document in terms of its internal structure,
and then look in the PDFBox API for the corresponding objects.
PDFBox has made great improvements, and the PDFStreamEngine is one of
them, but I think the API should not differ too much from the data
structure it represents, because the PDF structure is well known, and it
is easier to understand the API if you already know the document
structure. But this is a choice to make; I think there is no right
answer, just a choice to stand by…
--
Sébastien Dailly