[ 
https://issues.apache.org/jira/browse/PDFBOX-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311670#comment-14311670
 ] 

John Hewson edited comment on PDFBOX-2580 at 2/9/15 12:29 AM:
--------------------------------------------------------------

Modularising PDFBox is one of the tasks which we have been working on in 2.0, 
though I wouldn't go as far as to call it a goal, as it doesn't achieve 
anything in its own right: the actual goals are things such as reducing the 
number of third party jars which we depend on, having PDFBox run on Android, or 
avoiding the use of AWT in server environments.

A great deal of modularisation has been done in 2.0, and we've pretty much 
achieved our modularisation goals, with the exception of removing AWT as a 
dependency, which isn't feasible at this stage. PDFBOX-586 provides a great 
overview of the modularisation which has been delivered in 2.0:

- examples have been moved to their own module
- lucene integration has been moved to the examples module
- ant integration has been moved to the examples module
- moved all command line tools into their own module
- moved PDFViewer swing component into the tools module
- replaced usage of ICU for complex text extraction with Java's built-in support

The benefit of these changes is not the removal of some code from PDFBox core 
but a reduction in the jars which PDFBox core depends on, namely:

- lucene
- ant
- swing
- ICU

All of which are very large dependencies. These are the kinds of changes which 
we mean when we refer to modularisation, because they have tangible benefits 
and help users satisfy goals such as reducing the size of the PDFBox 
distribution from hundreds to tens of megabytes.

It's notable that we still haven't achieved one of our original modularisation 
goals, text extraction on Android (PDFBOX-586), because Android does not 
support AWT, and PDFBox core and FontBox depend deeply on AWT. It may simply 
not be practical to remove AWT from PDFBox core without causing significant 
collateral damage. Even text extraction has deep dependencies on AWT via PD's 
clipping paths and FontBox's fonts.

Returning to the topic of forms, any modularisation needs to be presented in 
terms of third party dependencies. 1) What jars would we no longer have to 
depend on in core if we move forms into their own module? 2) Conversely, what 
PDFBox jars would a user of a separate forms module be able to avoid 
depending on?


> Decouple implementation specific forms handling from interactive.form PD Model
> ------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2580
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2580
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: AcroForm
>            Reporter: Maruan Sahyoun
>            Assignee: Maruan Sahyoun
>             Fix For: 2.0.0
>
>         Attachments: sonar.png
>
>
> The interactive.form PD model currently holds classes reflecting the various 
> fields intermixed with appearance generation and layout handling.
> In order to separate the PD model from the service of forms filling and 
> appearance generation this functionality shall be moved into a new package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

