Hi all,
I've upgraded my test cluster to Spark 3 and changed my committer to
"directory", but I still get the error below. The documentation is somewhat
unclear on this point.
Do I need to add a third-party jar to support the new committers?

java.lang.ClassNotFoundException:
org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
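
For context, here is a minimal sketch of the settings involved. It rests on an assumption not confirmed in this thread: that the missing class comes from Spark's optional spark-hadoop-cloud module, which provides org.apache.spark.internal.io.cloud.PathOutputCommitProtocol and is not bundled with every Spark distribution. The names S3A_COMMITTER_CONF and to_submit_args are made up for illustration; the property names are the ones from Spark's cloud integration documentation, as far as I understand it.

```python
# Sketch only: committer-related Spark properties for the S3A "directory"
# committer. Assumes the spark-hadoop-cloud jar matching the Spark build is
# on the classpath; without it the commit protocol class cannot be loaded.
S3A_COMMITTER_CONF = {
    "spark.hadoop.fs.s3a.committer.name": "directory",
    "spark.sql.sources.commitProtocolClass":
        "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
    "spark.sql.parquet.output.committer.class":
        "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
}


def to_submit_args(conf):
    """Render a dict of properties as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_args(S3A_COMMITTER_CONF)))
```

If that assumption holds, the flags printed by this helper would only take effect once the matching spark-hadoop-cloud artifact is also supplied to the job.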


On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <murat.migdiso...@gmail.com>
wrote:

> Hello all,
> we have a Hadoop cluster (using YARN) that uses S3 as its filesystem, with
> S3Guard enabled.
> We are using Hadoop 3.2.1 with Spark 2.4.5.
>
> When I try to save a dataframe in parquet format, I get the following
> exception:
> java.lang.ClassNotFoundException:
> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>
> My relevant Spark configurations are as follows:
>
> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
> "fs.s3a.committer.name": "magic",
> "fs.s3a.committer.magic.enabled": true,
> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>
> While Spark Streaming fails with the exception above, Apache Beam succeeds
> in writing Parquet files.
> What might be the problem?
>
> Thanks in advance
>
>
> --
> "Talkers aren’t good doers. Rest assured that we’re going there to use
> our hands, not our tongues."
> W. Shakespeare
>


