I don't have a good answer; Steve may know more. But from looking at
dependency:tree, it looks like it's mostly hadoop-common that's at issue.
Without -Phive it stays 'provided' in the assembly/ module, but -Phive
pulls it back in. Either there's some good reason for that, or maybe we
need to explicitly manage the scope of hadoop-common along with everything
else from Hadoop, even though Spark doesn't reference it directly.
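
If that turns out to be the right fix, a rough sketch of what I mean is
below. The module it would go in, the property names, and the profile
wiring are guesses on my part, not what's in the Spark pom today: the idea
is just to pin hadoop-common in dependencyManagement so its scope follows
the same provided/compile switch as the rest of the Hadoop artifacts, even
when -Phive drags it in transitively.

    <!-- illustrative sketch only: pin the scope of the transitively-pulled
         hadoop-common so -Phadoop-provided keeps it out of the assembly -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>${hadoop.version}</version>
          <!-- assumed property: 'provided' under -Phadoop-provided,
               'compile' otherwise -->
          <scope>${hadoop.deps.scope}</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>

To check whether a given profile combination reintroduces it, something
like this should show where it comes back in:

    ./build/mvn -Phive -Phadoop-provided dependency:tree -Dincludes=org.apache.hadoop:hadoop-common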

On Mon, Oct 12, 2020 at 12:38 PM Kimahriman <adam...@gmail.com> wrote:

> When I try to build a distribution with either -Phive or -Phadoop-cloud
> along with -Phadoop-provided, I still end up with Hadoop jars in the
> distribution.
>
> Specifically, with -Phive and -Phadoop-provided, you end up with
> hadoop-annotations, hadoop-auth, and hadoop-common included in the Spark
> jars, and with -Phadoop-cloud and -Phadoop-provided, you end up with
> hadoop-annotations as well as the hadoop-{aws,azure,openstack} jars. Is
> this supposed to be the case, or is there something I'm doing wrong? I
> just want the spark-hive and spark-hadoop-cloud jars without the Hadoop
> dependencies, and right now I have to delete the Hadoop jars after the
> fact.
>
