Yes, you are right. I am using Spring Boot for this.

The same setup does work for the job that does not involve any Kafka events. But then again, I am not shipping any extra jars there, so nothing is replaced and
we are using the default ones.

If I do not use userClassPathFirst, which forces the service to use the
newer version, I end up with the same problem.

We are using protobuf v3+, and as such we need to push that version, since
the Apache Spark core libraries use an older one.

So all we should really need is the following: --jars
"protobuf-java-3.17.3.jar" \
and here we need userClassPathFirst=true in order to use the latest
version.
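
For reference, the minimal submit I was hoping would be enough looks roughly like the sketch below. Only the protobuf jar and the userClassPathFirst flag are the point here; every other option is copied from the consume submit quoted further down in this thread, so treat it as a sketch rather than the exact command:

spark-submit \
--master=k8s://https://url:port \
--deploy-mode cluster \
--name consume-data \
--conf spark.driver.userClassPathFirst=true \
--conf spark.kubernetes.file.upload.path=hdfs://upload-path \
--files "application-dev.properties,keystore.jks,truststore.jks" \
--jars "protobuf-java-3.17.3.jar" \
--conf spark.kubernetes.container.image=url/spark:spark-submit \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.namespace=spark \
--driver-memory 1g --executor-memory 1g \
--num-executors 1 --executor-cores 1 \
target/SparkStream.jar consume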


Using only this jar (which works locally), or with no jars defined at all,
we ended up with the following error:

21/08/31 10:53:40 WARN  org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 (TID 139) (10.60.63.56 executor 1): java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka010.KafkaRDDPartition
        at java.base/java.net.URLClassLoader.findClass(Unknown Source)
        at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
        at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Unknown Source)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)


This can be resolved by passing more jars.
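
Concretely, extending --jars to also ship the Kafka and Spark streaming jars (the same versions listed in the consume submit quoted below; shown here only as a sketch of what makes the ClassNotFoundException go away) gets past this particular error:

--jars "protobuf-java-3.17.3.jar,spark-yarn_2.12-3.1.2.jar,spark-core_2.12-3.1.2.jar,kafka-clients-2.8.0.jar,spark-streaming-kafka-0-10_2.12-3.1.2.jar,spark-token-provider-kafka-0-10_2.12-3.1.2.jar" \

but that brings me back to the ExternalClusterManager "not a subtype" error from my original mail.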


Any idea about this error?

K8s does not seem to like this. Spring Boot should be the one responsible
for resolving the versions, but it seems K8s does not like these versions.

Perhaps a misconfiguration on K8s?

I haven't set that up myself, so I am not aware of what was done there.



Downgrading to Java 8 on my K8s cluster might not be so easy. I want to explore
whether there is something else I can do before going that route, as we would need
to spin up new K8s instances to check it.
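
One cheap thing I can check first, before touching the cluster itself, is which JVM the driver and executor pods are actually running. Assuming access to the spark namespace used in the submits below (the pod name is a placeholder), something like:

kubectl get pods -n spark
kubectl exec -n spark <driver-pod-name> -- java -version

should at least confirm whether the image runs Java 11 or Java 8.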



Thank you for taking the time.




On Tue, 31 Aug 2021 at 12:26, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi Stelios,
>
> I've never seen this error before, but a couple of things caught
> my attention that I would look at closer to chase the root cause of the
> issue.
>
> "org.springframework.context.annotation.AnnotationConfigApplicationContext:"
> and "21/08/31 07:28:42 ERROR  org.springframework.boot.SpringApplication:
> Application run failed" seem to indicate that you're using Spring Boot
> (that I know almost nothing about so take the following with a pinch of
> salt :))
>
> Spring Boot manages the classpath by itself and together with another
> interesting option in your
> spark-submit, spark.driver.userClassPathFirst=true, makes me wonder how
> much this exception:
>
> > org.apache.spark.scheduler.ExternalClusterManager:
> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager not a
> subtype
>
> could be due to casting compatible types from two different classloaders?
>
> Just a thought but wanted to share as I think it's worth investigating.
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> "The Internals Of" Online Books <https://books.japila.pl/>
> Follow me on https://twitter.com/jaceklaskowski
>
> <https://twitter.com/jaceklaskowski>
>
>
> On Tue, Aug 31, 2021 at 9:44 AM Stelios Philippou <stevo...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I have been facing the current issue for some time now and I was
>> wondering if someone might have some inside on how I can resolve the
>> following.
>>
>> The code (java 11) is working correctly on my local machine but whenever
>> I try to launch the following on K8 I am getting the following error.
>>
>> 21/08/31 07:28:42 ERROR  org.apache.spark.SparkContext: Error
>> initializing SparkContext.
>>
>> java.util.ServiceConfigurationError:
>> org.apache.spark.scheduler.ExternalClusterManager:
>> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager not a
>> subtype
>>
>>
>>
>> I have a spark that will monitor some directories and handle the data
>> accordingly.
>>
>> That part is working correctly on K8 and the SparkContext has no issue
>> being initialized there.
>>
>>
>> This is the spark-submit for that
>>
>>
>> spark-submit \
>> --master=k8s://https://url:port \
>> --deploy-mode cluster \
>> --name a-name\
>> --conf spark.driver.userClassPathFirst=true  \
>> --conf spark.kubernetes.file.upload.path=hdfs://upload-path \
>> --files "application-dev.properties,keystore.jks,truststore.jks"  \
>> --conf spark.kubernetes.container.image=url/spark:spark-submit \
>> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>> --conf spark.kubernetes.namespace=spark \
>> --conf spark.kubernetes.container.image.pullPolicy=Always \
>> --conf spark.dynamicAllocation.enabled=false \
>> --driver-memory 525m --executor-memory 525m \
>> --num-executors 1 --executor-cores 1 \
>> target/SparkStream.jar continuous-merge
>>
>>
>> My issue comes when I try to launch the service in order to listen to
>> kafka events and store them in HDFS.
>>
>>
>> spark-submit \
>> --master=k8s://https://url:port \
>> --deploy-mode cluster \
>> --name consume-data \
>> --conf spark.driver.userClassPathFirst=true  \
>> --conf spark.kubernetes.file.upload.path=hdfs://upload-path\
>> --files "application-dev.properties,keystore.jks,truststore.jks"  \
>> --jars 
>> "spark-yarn_2.12-3.1.2.jar,spark-core_2.12-3.1.2.jar,kafka-clients-2.8.0.jar,spark-streaming-kafka-0-10_2.12-3.1.2.jar,spark-token-provider-kafka-0-10_2.12-3.1.2.jar"
>>  \
>> --conf spark.kubernetes.container.image=url/spark:spark-submit \
>> --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>> --conf spark.kubernetes.authenticate.executor.serviceAccountName=spark \
>> --conf spark.kubernetes.namespace=spark \
>> --conf spark.kubernetes.container.image.pullPolicy=Always \
>> --conf spark.dynamicAllocation.enabled=false \
>> --driver-memory 1g --executor-memory 1g \
>> --num-executors 1 --executor-cores 1 \
>> target/SparkStream.jar consume
>>
>>
>> It could be that I am launching the application wrongly or perhaps that
>> my K8 is not configured correctly ?
>>
>>
>>
>> I have stripped down my code and left it barebone and will end up with
>> the following issue :
>>
>>
>> 21/08/31 07:28:42 ERROR  org.apache.spark.SparkContext: Error
>> initializing SparkContext.
>>
>> java.util.ServiceConfigurationError:
>> org.apache.spark.scheduler.ExternalClusterManager:
>> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager not a
>> subtype
>>
>> at java.base/java.util.ServiceLoader.fail(Unknown Source)
>>
>> at
>> java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(Unknown
>> Source)
>>
>> at
>> java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(Unknown
>> Source)
>>
>> at java.base/java.util.ServiceLoader$2.hasNext(Unknown Source)
>>
>> at java.base/java.util.ServiceLoader$3.hasNext(Unknown Source)
>>
>> at
>> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:
>>
>>
>> 21/08/31 07:28:42 WARN  
>> org.springframework.context.annotation.AnnotationConfigApplicationContext:
>> Exception encountered during context initialization - cancelling refresh
>> attempt: org.springframework.beans.factory.UnsatisfiedDependencyException:
>> Error creating bean with name 'mainApplication': Unsatisfied dependency
>> expressed through field 'streamAllKafkaData'; nested exception is
>> org.springframework.beans.factory.UnsatisfiedDependencyException: Error
>> creating bean with name 'streamAllKafkaData': Unsatisfied dependency
>> expressed through field 'javaSparkContext'; nested exception is
>> org.springframework.beans.factory.BeanCreationException: Error creating
>> bean with name 'javaSparkContext' defined in class path resource
>> [com/configuration/SparkConfiguration.class]: Bean instantiation via
>> factory method failed; nested exception is
>> org.springframework.beans.BeanInstantiationException: Failed to instantiate
>> [org.apache.spark.api.java.JavaSparkContext]: Factory method
>> 'javaSparkContext' threw exception; nested exception is
>> java.util.ServiceConfigurationError:
>> org.apache.spark.scheduler.ExternalClusterManager:
>> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager not a
>> subtype
>>
>> 21/08/31 07:28:42 ERROR  org.springframework.boot.SpringApplication:
>> Application run failed
>>
>> org.springframework.beans.factory.UnsatisfiedDependencyException: Error
>> creating bean with name 'mainApplication': Unsatisfied dependency expressed
>> through field 'streamAllKafkaData'; nested exception is
>> org.springframework.beans.factory.UnsatisfiedDependencyException: Error
>> creating bean with name 'streamAllKafkaData': Unsatisfied dependency
>> expressed through field 'javaSparkContext'; nested exception is
>> org.springframework.beans.factory.BeanCreationException: Error creating
>> bean with name 'javaSparkContext' defined in class path resource
>> [com/configuration/SparkConfiguration.class]: Bean instantiation via
>> factory method failed; nested exception is
>> org.springframework.beans.BeanInstantiationException: Failed to instantiate
>> [org.apache.spark.api.java.JavaSparkContext]: Factory method
>> 'javaSparkContext' threw exception; nested exception is
>> java.util.ServiceConfigurationError:
>> org.apache.spark.scheduler.ExternalClusterManager:
>> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager not a
>> subtype
>>
>>
>>
>>
>> It could be that i am launching the application for Kafka wrongly with
>> all the extra jars added ?
>>
>> Just that those seem to be needed or i am getting other errors when not
>> including those.
>>
>>
>>
>>
>> Any help will be greatly appreciated.
>>
>>
>>
>> Cheers,
>>
>> Stelios
>>
>>
>>
>>
