> On 10 Nov 2015, at 12:10, Maruan Sahyoun <[email protected]> wrote:
>
>> On 10 Nov 2015, at 19:19, John Hewson <[email protected]> wrote:
>>
>> Correction: That’s how *PDFBox* is designed.
>>
>>> On 10 Nov 2015, at 10:15, John Hewson <[email protected]> wrote:
>>>
>>>> On 10 Nov 2015, at 03:30, Maruan Sahyoun <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> as discussed on
>>>> http://stackoverflow.com/questions/33383389/pdfbox-how-can-a-pdacroform-be-flattened/33489651#33489651
>>>> now we have a flatten() method there is also the need to (re-)generate
>>>> the appearances on demand. The same applies if we'd like to flatten
>>>> annotations. With the current package and class structure that would go
>>>> into PDAcroForm for interactive forms.
>>>>
>>>> What I'm proposing is - instead of adding to the PD model - have use-case
>>>> oriented functionality in a new package (services or so) so we have COS
>>>> (abstraction of low level PDF elements), PD (abstraction of COS for PDF
>>>> elements) and services (application of PD model to 'do' something with the
>>>> PDF). As we add higher level functionality this would help us keep the
>>>> PD model clean.
>>>
>>> You’re under-selling PD here. PD *is* a high-level abstraction, it’s not
>>> just a wrapper around COS, look at PDFont for example. PDDocument lets you
>>> ‘do’ something with a document, PDPage lets you ‘do’ something with a Page,
>>> and PDAcroForm lets you ‘do’ something with an acro form. That’s what PD is
>>> all about.
>>>
>>> The only caveat is that PD is tied to a single document, so we recently
>>> introduced the “multipdf” package. But any functionality which manipulates
>>> a single PDF should be in PD. That’s how PDF is designed.
>
> that's how it's currently designed which may or may not be the case moving
> forward. And we have a number of tools which work on a single document but
> are not part of PD such as ExtractImages, ExtractText or PDFSplit.

ExtractImages and ExtractText are command-line tools, so of course they’re in
the tools jar, but the logic which powers them is in PD. Same for PDFSplit, for
the most part, though that one’s a bit messy. If you’re proposing to add a new
command-line tool, then follow this pattern, with a wrapper in ‘tools’ and the
logic in PD.

> Some of them are based on individual packages such as o.a.p.text. So we do
> already have cases where functionality is not part of PD (e.g. we could have
> had PDDocument.extractText(), PDDocument.split()).

Again, text extraction logic is in PD, it’s just a wrapper which is elsewhere.
Split is arguably a mess and not something we want to re-create.

> As an example we can have PDDocument.flatten() to flatten AcroForms and
> Annotations - would be in line with your thoughts and how PDFBox is currently
> (mainly) designed. And of course we can add PDDocument.refreshAppearances() …
> - my proposal is to not add that there but keep that in a separate class in a
> separate package.

Actually I was thinking PDAcroForm.flatten().

> With the package name being used for more such (future) additions e.g.
> o.a.p.services.appearance, o.a.p.services.signature …

We’ve had this exact discussion in the past. Packages *are not services*. APIs
*are not services*. Services are daemons, web servers, etc. APIs do not expose
services.

-- John

> BR
> Maruan
>
>>> -- John
>>>
>>>> A similar approach could also be taken e.g. for signing a PDF ...
>>>>
>>>> WDYT?
>>>>
>>>> Maruan
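
As a rough illustration of the pattern argued for in this thread (a thin command-line wrapper in ‘tools’, with the actual logic living on the model object), here is a minimal Java sketch. `SketchForm` and `FlattenTool` are hypothetical stand-ins invented for this example; they are not PDFBox classes, and the string-based "flatten" only mimics the shape of the design, not real appearance or annotation handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a PD-level model class (NOT the real PDAcroForm).
// The point is only that the operation is a method on the model object.
final class SketchForm {
    private final List<String> fields = new ArrayList<>();
    private final List<String> pageContent = new ArrayList<>();

    void addField(String name, String value) {
        fields.add(name + "=" + value);
    }

    // "Flatten": move every field into static page content, then drop the fields.
    void flatten() {
        pageContent.addAll(fields);
        fields.clear();
    }

    int fieldCount() {
        return fields.size();
    }

    List<String> pageContent() {
        return new ArrayList<>(pageContent); // defensive copy
    }
}

// Hypothetical stand-in for a wrapper in the 'tools' jar: it parses arguments
// and delegates to the model; it contains no flattening logic of its own.
final class FlattenTool {
    static String run(String[] args) {
        SketchForm form = new SketchForm();
        for (String arg : args) {
            String[] kv = arg.split("=", 2);
            form.addField(kv[0], kv.length > 1 ? kv[1] : "");
        }
        form.flatten();
        return String.join("\n", form.pageContent());
    }

    public static void main(String[] args) {
        System.out.println(run(args));
    }
}
```

With this split, library users call `flatten()` on the model directly, while the command-line entry point stays a few lines of argument handling. The alternative proposed in the thread would move `flatten()` out of the model class into a separate class in its own package.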
