[
https://issues.apache.org/jira/browse/TIKA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033853#comment-18033853
]
Vladimir Sitnikov commented on TIKA-4532:
-----------------------------------------
{quote} What I'm really wondering about, is that you're an apache member and
are arguing not to use an apache product. Is your argument part of something
bigger, e.g. to retire commons lang, or to split it in smaller entities?{quote}
I agree it might sound strange.
Technically, I'm not an Apache member. I'm a PMC member for JMeter and Calcite;
however, I'm not an Apache member as in https://www.apache.org/foundation/members.
Here's a broader picture:
A) Log4Shell and friends were effectively caused by log4j.jar having all the
features. If there had been separate jars like log4j-basic, log4j-jndi, and
log4j-chainsaw, users could have included only the jars they need and
avoided most of the CVEs.
B) Currently, commons-lang and commons-compress each ship as a
single jar. Many projects use only StringUtils and one or two other classes.
A single jar might have sounded like a good idea in the past; however, it is a
bad design nowadays for security reasons.
C) There's *no way of splitting a jar* into several ones since *Maven can't
handle it*.
I've raised a request to split {{commons-compress}} into several jars (see
https://lists.apache.org/thread/bwhsonqdq1f57hrfz3l6wy1gxrtssbsc), and commons
devs declined the split as they said the only way they could do that is by
changing both package names and the artifact ids.
Commons devs refer to https://jlbp.dev/JLBP-6 which boils down to a
Maven-specific issue.
Maven is still popular, so it is indeed important to take its known issues
seriously.
I've raised a question on the Maven dev list with a full reproducer; however,
I got no feedback from Maven devs:
https://lists.apache.org/thread/q0bn38qtxkv4orx6o9lhtonjcxkbtw5f
Ideally, I would appreciate it if Maven devs treated the issue as a true bug
in Maven; however, they don't acknowledge it even with a full reproducer at hand.
{quote}Is your argument part of something bigger, e.g. to retire commons lang,
or to split it in smaller entities?{quote}
It would be great if {{commons-lang}} were split into smaller artifacts.
For instance, a {{commons-stringutils}} jar, so those who need only
{{StringUtils.isEmpty}} could depend on just that.
However, based on my discussions on the {{dev@commons}} and {{dev@maven}} lists,
I think it would be easier to remove {{commons-lang}} altogether rather than
wait for the Commons team to split the jar. Maven (even the unreleased Maven 4)
does not support "splitting a jar in two", so the Commons team is not eager to
split their jars.
D) Java got better, and many commons methods which were great in the Java 1.4
days are much less needed with Java 8+.
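To illustrate point D, here is a minimal sketch of plain-JDK stand-ins for a few commonly used {{StringUtils}} methods. The class name {{JdkStrings}} and its method names are hypothetical and chosen for illustration only; this is not how any particular project did the migration, and the behaviour is not guaranteed to match commons-lang3 in every edge case.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical helper: the name JdkStrings and its methods are illustrative only.
final class JdkStrings {
    private JdkStrings() {
    }

    // Stand-in for StringUtils.isEmpty(CharSequence): true for null or zero length.
    static boolean isEmpty(CharSequence s) {
        return s == null || s.length() == 0;
    }

    // Stand-in for StringUtils.defaultString(String, String): fallback when s is null.
    static String defaultString(String s, String fallback) {
        return s != null ? s : fallback;
    }

    // Stand-in for joining strings with a separator; String.join exists since Java 8.
    // Note: String.join renders null elements as "null", so filter them first if needed.
    static String join(List<String> parts, String separator) {
        return String.join(separator, parts);
    }

    // Stand-in for a null-safe string comparison; Objects.equals exists since Java 7.
    static boolean sameText(String a, String b) {
        return Objects.equals(a, b);
    }
}
```

In many call sites the helper is not even needed: {{s == null || s.isEmpty()}} or {{String.join(",", parts)}} can be written inline.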
---
Sample cases:
* https://github.com/checkstyle/checkstyle/pull/3026 Checkstyle removed
commons-lang3 ~9 years ago
* https://github.com/apache/jmeter/pull/6534 I removed commons-lang3 from
Apache JMeter a couple of weeks ago
> Drop commons-lang3 dependency
> -----------------------------
>
> Key: TIKA-4532
> URL: https://issues.apache.org/jira/browse/TIKA-4532
> Project: Tika
> Issue Type: Improvement
> Affects Versions: 3.2.3
> Reporter: Vladimir Sitnikov
> Priority: Major
>
> Currently, there are only a few commons-lang3 usages in apache tika (see
> https://github.com/search?q=repo%3Aapache%2Ftika%20commons.lang3&type=code ),
> and it would be great if
> commons-lang3 is a big dependency with lots of stuff, and it is unfortunate
> to get CVEs via commons-lang3:
> https://mvnrepository.com/artifact/org.apache.commons/commons-lang3
> See https://github.com/apache/maven-doxia/issues/1006