Fellow Hadoop developers,

The Hadoop codebase depends on commons-httpclient, whose latest version, 3.1, reached end of life nearly 5 years ago. Because its API is not compatible with that of its successor, httpclient 4, the community seems to have been reluctant to upgrade. However, there is considerable evidence that commons-httpclient has a number of security vulnerabilities that were never addressed, including CVE-2012-6153. To make Hadoop less susceptible to existing and future vulnerabilities, we should seriously consider replacing commons-httpclient with httpclient 4.x.

A few Hadoop JIRAs already have patches available to address this, but they need more attention to get committed. HADOOP-10105 <https://issues.apache.org/jira/browse/HADOOP-10105> (remove httpclient dependency) is the umbrella JIRA. Other efforts include:

  HADOOP-11613 <https://issues.apache.org/jira/browse/HADOOP-11613> (Remove httpclient dependency from hadoop-azure)
  HADOOP-11614 <https://issues.apache.org/jira/browse/HADOOP-11614> (Remove httpclient dependency from hadoop-openstack)
  HADOOP-12710 <https://issues.apache.org/jira/browse/HADOOP-12710> (Remove dependency on commons-httpclient for TestHttpServerLogs)
  HADOOP-12711 <https://issues.apache.org/jira/browse/HADOOP-12711> (Remove dependency on commons-httpclient for ServletUtil)

I’d also like to urge the community to reject future patches that import commons-httpclient.

Additionally, Hadoop trunk depends on httpclient 4.2.5, which is itself known to suffer from several security vulnerabilities, including CVE-2012-6153, CVE-2011-4461, CVE-2014-3577, and CVE-2015-5262. HADOOP-12767 <https://issues.apache.org/jira/browse/HADOOP-12767> (update apache httpclient version to the latest 4.5 for security) has a patch that bumps the version to 4.5.1. I’d like to ask the community whether we should do this, and what the implications of moving to the latest version would be.

Best regards,
Wei-Chiu Chuang
A very happy Clouderan