[ https://issues.apache.org/jira/browse/PDFBOX-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311670#comment-14311670 ]
John Hewson edited comment on PDFBOX-2580 at 2/9/15 12:29 AM:
--------------------------------------------------------------

Modularising PDFBox is one of the tasks which we have been working on in 2.0, though I wouldn't go as far as to call it a goal, as it doesn't achieve anything in its own right: the actual goals are things such as reducing the number of third party jars which we depend on, having PDFBox run on Android, or avoiding the use of AWT in server environments.

A great deal of modularisation has been done in 2.0, and we've pretty much achieved our modularisation goals, with the exception of removing AWT as a dependency, which isn't feasible at this stage.

PDFBOX-586 provides a great overview of the modularisation which has been delivered in 2.0:
- examples have been moved to their own module
- lucene integration has been moved to the examples module
- ant integration has been moved to the examples module
- moved all command line tools into their own module
- moved PDFViewer swing component into the tools module
- replaced usage of ICU for complex text extraction with Java's built-in support

The benefit of these changes is not the removal of some code from pdfbox core but to reduce the jars which pdfbox core depends on, namely:
- lucene
- ant
- swing
- ICU

All of which are very large dependencies. These are the kinds of changes which we mean when we refer to modularisation, because they have tangible benefits and help users satisfy goals such as reducing the size of the PDFBox distribution from hundreds to tens of megabytes.

It's notable that we still haven't achieved one of our original modularisation goals, text extraction on Android (PDFBOX-586), due to the fact that Android does not support AWT, and PDFBox core and FontBox depend deeply on AWT. In practice it may simply not be practical to remove AWT from PDFBox core without causing significant collateral damage.
Even text extraction has deep dependencies on AWT via PD's clipping paths and FontBox's fonts.

Returning to the topic of forms, any modularisation needs to be presented in terms of third party dependencies.

1) What jars would we no longer have to depend on in core if we move forms into their own module?
2) Conversely, what pdfbox jars would a user of a separate forms module be able to avoid having dependencies on?


was (Author: jahewson):
Modularising PDFBox is one of the tasks which we have been working on in 2.0, though I wouldn't go as far as to call it a goal, as it doesn't achieve anything in its own right: the actual goals are things such as reducing the number of third party jars which we depend on, having PDFBox run on Android, or avoiding the use of AWT in server environments.

A great deal of modularisation has been done in 2.0, and we've pretty much achieved our modularisation goals, with the exception of removing AWT as a dependency, which isn't feasible at this stage.

PDFBOX-586 provides a great overview of the modularisation which has been delivered in 2.0:
- examples have been moved to their own module
- lucene integration has been moved to the examples module
- ant integration has been moved to the examples module
- moved all command line tools into their own module
- moved PDFViewer swing component into the tools module
- replaced usage of ICU for complex text extraction with Java's built-in support

The effect benefit of these changes is not the removal of some code from pdfbox core but to reduce the jars which pdfbox core depends on, namely:
- lucene
- ant
- swing
- ICU

All of which are very large dependencies. These are the kinds of changes which we mean when we refer to modularisation, because they have tangible benefits and help users satisfy goals such as reducing the size of the PDFBox distribution from hundreds to tens of megabytes.
It's notable that we still haven't achieved one of our original modularisation goals, text extraction on Android (PDFBOX-586), due to the fact that Android does not support AWT, and PDFBox core and FontBox depend deeply on AWT. In practice it may simply not be practical to remove AWT from PDFBox core without causing significant collateral damage. Even text extraction has deep dependencies on AWT via PD's clipping paths and FontBox's fonts.

Returning to the topic of forms, any modularisation needs to be presented in terms of third party dependencies.

1) What jars would we no longer have to depend on in core if we move forms into their own module?
2) Conversely, what pdfbox jars would a user of a separate forms module be able to avoid having dependencies on?


> Decouple implementation specific forms handling from interactive.form PD Model
> ------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2580
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2580
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: AcroForm
>            Reporter: Maruan Sahyoun
>            Assignee: Maruan Sahyoun
>             Fix For: 2.0.0
>
>         Attachments: sonar.png
>
>
> The interactive.form PD model currently holds classes reflecting the various
> fields intermixed with appearance generation and layout handling.
> In order to separate the PD model from the service of forms filling and
> appearance generation this functionality shall be moved into a new package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org