On Mon, 12 Oct 2020 at 19:06, Sean Owen <sro...@gmail.com> wrote:

> I don't have a good answer, Steve may know more, but from looking at
> dependency:tree, it looks mostly like it's hadoop-common that's at issue.
> Without -Phive it remains 'provided' in the assembly/ module, but -Phive
> causes it to come back in. Either there's some good reason for that, or
> maybe we need to explicitly manage the scope of hadoop-common along with
> everything else Hadoop, even though Spark doesn't reference it directly.
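For anyone who wants to see the scope flip described above, something along
these lines shows it; this is a sketch rather than the exact commands used in
the thread, and it assumes you run it from the Spark source root with the
build/mvn wrapper and the hive/hadoop-provided profiles available:

    # Print the resolved dependency tree for the assembly module (and its
    # upstream modules, via -am), filtered to org.apache.hadoop artifacts.
    # Compare the scope reported for hadoop-common with and without -Phive.
    ./build/mvn -pl assembly -am -Phadoop-provided \
      dependency:tree -Dincludes=org.apache.hadoop

    ./build/mvn -pl assembly -am -Phadoop-provided -Phive \
      dependency:tree -Dincludes=org.apache.hadoop

With only -Phadoop-provided, hadoop-common should show up as provided; per
the report above, adding -Phive is what brings it back to compile scope.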
Sorry, missed this. Yes, they should be scoped so that hadoop-provided
leaves them out. Open a JIRA and point me at it, and I'll do my best. The
artifacts should just go into the hadoop-provided scope, shouldn't they?

> On Mon, Oct 12, 2020 at 12:38 PM Kimahriman <adam...@gmail.com> wrote:
>
>> When I try to build a distribution with either -Phive or -Phadoop-cloud
>> along with -Phadoop-provided, I still end up with hadoop jars in the
>> distribution.
>>
>> Specifically, with -Phive and -Phadoop-provided, you end up with
>> hadoop-annotations, hadoop-auth, and hadoop-common included in the Spark
>> jars, and with -Phadoop-cloud and -Phadoop-provided, you end up with
>> hadoop-annotations as well as the hadoop-{aws,azure,openstack} jars. Is
>> this supposed to be the case, or is there something I'm doing wrong? I
>> just want the spark-hive and spark-hadoop-cloud jars without the hadoop
>> dependencies, and right now I just have to delete the hadoop jars after
>> the fact.
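Until the scoping is fixed, the build-and-prune workaround the original post
describes looks roughly like this; the script and profile names are the
standard Spark ones, but the exact invocation is an assumption:

    # Build a distribution with hive/hadoop-cloud on top of hadoop-provided,
    # then list any Hadoop jars that still ended up in the assembly; those
    # are the ones that currently have to be deleted by hand.
    ./dev/make-distribution.sh --name hadoop-provided --tgz \
      -Phive -Phadoop-cloud -Phadoop-provided

    ls dist/jars/ | grep '^hadoop-'

Once the hadoop-* artifacts are consistently marked provided under
-Phadoop-provided, that last listing should come back empty.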