Hi Mich,

True, the vulnerable jar (hive-metastore-2.3.9.jar) is not directly related to Spark. And I completely agree that “Spark does not run a Hive metastore itself nor use Hive for executing queries.”
As Nicholas said, when looking at vulnerabilities, many security teams, including ours, have begun to classify a product as either Vulnerable or Affected. Vulnerable means directly impacted by the vulnerability and exploitable, while Affected means that a vulnerable dependency/package/jar is delivered with the product. With that said, if a user accidentally uses one of these dependencies in their Spark application, could the Java CLASSPATH, with $SPARK_HOME/jars taking precedence, in turn expose the unknowing end user to the vulnerability that way?

I am also new to this mailing list and its discussions. I am not sure how to answer “Can you connect the CVE to Spark?” Please help with this!

Thanks,
Balaji

From: Sean Owen <sro...@gmail.com>
Sent: 28 January 2025 10:31
To: Balaji Sudharsanam V <balaji.sudharsa...@ibm.com>
Cc: Mich Talebzadeh <mich.talebza...@gmail.com>; dev <dev@spark.apache.org>
Subject: [EXTERNAL] Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

Can you connect the CVE to Spark? Spark does not run a Hive metastore itself nor use Hive for executing queries. It is a Hive client in general. That seems to be what is affected. We ask people reporting issues to at least provide a plausible theory for a vulnerability. Just because A depends on B does not mean its use of B includes all vulnerabilities in B. We do not pursue reports that just note a dependency has a vulnerability on that basis alone.

Of course, all else equal, you just update dependencies. Hive is hard to update.

(I am referring to Mich's replies.
I understand the CVE is real)

On Mon, Jan 27, 2025, 10:54 PM Balaji Sudharsanam V <balaji.sudharsa...@ibm.com> wrote:

Sean,

The vulnerability is explained here: Apache Hive security bypass CVE-2021-34538 Vulnerability Report <https://exchange.xforce.ibmcloud.com/vulnerabilities/231404>

Its CVSS base score is 7.5, and it is certainly not AI-generated content. We can dig into the vulnerability, but that can be a different discussion. As long as Spark contains a vulnerable jar in its packaging, it is good to address this.

Thanks,
Balaji

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Sent: 27 January 2025 20:41
To: Sean Owen <sro...@gmail.com>
Cc: Balaji Sudharsanam V <balaji.sudharsa...@ibm.com>; dev@spark.apache.org
Subject: [EXTERNAL] Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

To answer your question, I did not read this CVE, but I am responding solely from my previous experience with vulnerabilities and the thread owner's implications, having used Hive in conjunction with Spark for many years.

Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Mon, 27 Jan 2025 at 15:03, Sean Owen <sro...@gmail.com> wrote:

Mich: did you read the CVE? I'm not clear, as this contains no reference to the Hive functionality that is affected, or how it might relate to a metastore. Please explain.
Otherwise this looks like a generic AI-generated response with no particularly relevant content. "In summary"...

On Mon, Jan 27, 2025 at 8:57 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

I think the thread owner's point is valid. The default use of the Hive Metastore by Spark further gives credence to the importance of addressing this Hive vulnerability to ensure the security and reliability of Spark applications. I use Hive as the default metastore for Spark as well. Spark relies heavily on the Hive Metastore for managing critical metadata, such as table schemas, data locations, and access control, unless you are using a platform like Databricks with a unified catalog.

In summary, this dependency makes it essential to address any vulnerabilities within the Hive Metastore, as they can indirectly impact the security and stability of Spark applications, among other things.

HTH

Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Mon, 27 Jan 2025 at 13:37, Sean Owen <sro...@gmail.com> wrote:

It looks like that affects Hive, and not the metastore. I do not see that it is relevant to Spark at first glance.

On Mon, Jan 27, 2025 at 1:21 AM Balaji Sudharsanam V <balaji.sudharsa...@ibm.com.invalid> wrote:

Hi All,

There is a vulnerability with ‘High’ severity found in the Apache Spark 3.x and 4.0.0 preview (2) releases, with the hive-metastore-2.3.x.jar.
This is defined here: Apache Hive security bypass CVE-2021-34538 Vulnerability Report <https://exchange.xforce.ibmcloud.com/vulnerabilities/231404>

The recommendation is to upgrade to the latest version of Apache Hive (3.1.3, 4.0 or later), available from the Apache web site. Can we expect this to be fixed in the Apache Spark 4.0 GA?

Thanks,
Balaji
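
For readers who want to check the packaging claim discussed in this thread, here is a minimal sketch in Python that lists the Hive jars found under `$SPARK_HOME/jars` and flags any 2.3.x versions. The helper names (`find_hive_jars`, `is_hive_2_3`) and the file-name pattern are assumptions made for illustration and are not part of Spark or Hive. Note that a match only shows the distribution is "Affected" in the sense used above (the jar is shipped); it says nothing about whether Spark's use of the jar is exploitable.

```python
import os
import re
from pathlib import Path

# Hypothetical pattern: matches simple names such as hive-metastore-2.3.9.jar
# and captures ("hive-metastore", "2.3.9"). Jars with a classifier suffix
# (anything after the version) are deliberately not matched in this sketch.
JAR_PATTERN = re.compile(r"^(hive-[a-z0-9.-]+?)-(\d+(?:\.\d+)*)\.jar$")


def find_hive_jars(jars_dir):
    """Return sorted (artifact, version) pairs for Hive jars in jars_dir."""
    found = []
    for path in Path(jars_dir).glob("hive-*.jar"):
        match = JAR_PATTERN.match(path.name)
        if match:
            found.append((match.group(1), match.group(2)))
    return sorted(found)


def is_hive_2_3(version):
    """True when the version string belongs to the 2.3.x line."""
    return version.split(".")[:2] == ["2", "3"]


if __name__ == "__main__":
    # SPARK_HOME is read from the environment; nothing is printed if unset.
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home:
        for artifact, version in find_hive_jars(Path(spark_home) / "jars"):
            note = "2.3.x line" if is_hive_2_3(version) else "other line"
            print(f"{artifact} {version} ({note})")
```

Run it with `SPARK_HOME` pointing at an unpacked Spark distribution. The output is an inventory of shipped Hive artifacts only, which is a starting point for the "plausible theory" that Sean asks for rather than a substitute for it.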