[
https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716318#comment-14716318
]
Jukka Zitting commented on TIKA-1706:
-------------------------------------
Note that o.a.tika.io is a part of the public API of tika-core, so even if we
restore the commons-io dependency we should keep these classes for backwards
compatibility (perhaps as dummies that just inherit the relevant commons-io
classes or redirect static calls to there).
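
A minimal sketch of what such dummies could look like, using only the JDK so it runs on its own. All class names below are hypothetical stand-ins: the "Upstream" classes play the role of commons-io, the "Legacy" classes play the role of classes kept in o.a.tika.io. None of this is the actual Tika or commons-io code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Small driver showing that old call sites keep working through the shims.
class ShimDemo {
    static String readViaLegacy(String text) {
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        try {
            return LegacyIOUtils.toString(in, "UTF-8");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readViaLegacy("hello"));               // prints "hello"
        System.out.println(new LegacyClosedInputStream().read()); // prints -1
    }
}

// Hypothetical stand-in for a utility class in the external dependency.
class UpstreamIOUtils {
    static String toString(InputStream in, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        Reader reader = new InputStreamReader(in, charset);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            out.append(buffer, 0, n);
        }
        return out.toString();
    }
}

// Hypothetical stand-in for a stream class in the external dependency.
class UpstreamClosedInputStream extends InputStream {
    @Override
    public int read() {
        return -1; // always reports end of stream
    }
}

// Option 1: keep the old class name, redirect the static call.
class LegacyIOUtils {
    static String toString(InputStream in, String encoding) throws IOException {
        return UpstreamIOUtils.toString(in, Charset.forName(encoding));
    }
}

// Option 2: keep the old class name as an empty subclass of the upstream class.
class LegacyClosedInputStream extends UpstreamClosedInputStream {
}
```

Either way the old names stay source and binary compatible for callers, while the behaviour lives in one place.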

I don't have a strong opinion here. I do think that the "no dependencies"
principle of tika-core is useful and worth the overhead of a dozen duplicated
classes. And a 30% increase in the tika-core footprint because of the added
dependency would still be non-trivial. On the other hand, the argument about
missing out on improvements in commons-io is valid.

Personally I'd start here by checking what exactly has changed in the classes
we duplicate from commons-io. If it's just a few lines then I'd just merge
those changes into Tika and be happy with that for the next five years. If
there are more substantial improvements, switching back to a dependency is
probably worth it.
> Bring back commons-io to tika-core
> ----------------------------------
>
> Key: TIKA-1706
> URL: https://issues.apache.org/jira/browse/TIKA-1706
> Project: Tika
> Issue Type: Improvement
> Components: core
> Reporter: Yaniv Kunda
> Priority: Minor
> Fix For: 1.11
>
>
> TIKA-249 inlined select commons-io classes in order to simplify the
> dependency tree and save some space.
> I believe these arguments are weaker nowadays due to the following concerns:
> - Most of the non-core modules already use commons-io, and since tika-core is
> usually not used by itself, commons-io is already included with it
> - Since some modules use both tika-core and commons-io, it's not clear which
> code should be used
> - Having the inlined classes causes more maintenance and/or technical debt
> (which in turn causes more maintenance)
> - Newer commons-io code utilizes newer platform code, e.g. using Charset
> objects instead of encoding names, being able to use StringBuilder instead of
> StringBuffer, and so on.
> I'll be happy to provide a patch to replace usages of the inlined classes
> with commons-io classes if this is accepted.
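
The last point in the description (Charset objects instead of encoding names, StringBuilder instead of StringBuffer) can be illustrated with plain JDK calls. This is only a sketch of the general difference; the class and method names are made up for illustration and are not taken from commons-io or Tika.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetSketch {
    // Older style: the encoding is a name, so the JDK call declares a checked
    // exception that must be handled even for encodings that always exist.
    static byte[] encodeByName(String text, String encodingName) {
        try {
            return text.getBytes(encodingName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encodingName, e);
        }
    }

    // Newer style: a Charset object is already resolved, no checked exception.
    static byte[] encodeByCharset(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // StringBuffer synchronizes every call, which is rarely needed for a local.
    static String joinWithBuffer(String... parts) {
        StringBuffer sb = new StringBuffer();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // StringBuilder offers the same append API without the synchronization.
    static String joinWithBuilder(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] byName = encodeByName("tika", "UTF-8");
        byte[] byCharset = encodeByCharset("tika", StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(byName, byCharset)); // prints true
        System.out.println(joinWithBuffer("a", "b").equals(joinWithBuilder("a", "b"))); // prints true
    }
}
```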