Hello, I am trying to deploy a Spark streaming application using the Spark Kubernetes Operator, but the application crashes after a while.
After describing the CRD with *kubectl -n my-namespace describe sparkapplication my-app*, I see the following status:

Qos Class:             Guaranteed
Start Time:            2025-04-29T12:18:15Z
Last Transition Time:  2025-04-29T12:24:37.365547649Z
Message:               *The Spark application failed to get enough executors in the given time threshold.*

My deployment spec looks something like this:

apiVersion: spark.apache.org/v1alpha1
kind: SparkApplication
metadata:
  name: my-app-test
  labels:
    app: my-app
  annotations:
    owner: my-team
spec:
  runtimeVersions:
    scalaVersion: "2.13"
    sparkVersion: "3.5.0"
  mainClass: "com.myorg.MyStreamingApp"
  jars: "local:///opt/spark/app/my-app-jar-2.2.7.jar"
  sparkConf:
    spark.kubernetes.authenticate.driver.serviceAccountName: "some-service-account"
    spark.executor.instances: "2"
    spark.executor.memory: "1g"
    spark.executor.cores: "1"
    spark.hadoop.fs.s3a.access.key: *****
    spark.hadoop.fs.s3a.aws.credentials.provider: org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
    spark.hadoop.fs.s3a.endpoint: *****
    spark.hadoop.fs.s3a.fast.upload: "true"
    spark.hadoop.fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
    spark.hadoop.fs.s3a.path.style.access: "true"
    spark.hadoop.fs.s3a.secret.key: *****
    spark.kerberos.keytab: local:///mnt/keytabs/some-principal.keytab
    spark.kerberos.principal: *****
    spark.kubernetes.driverEnv.KRB5_CONFIG: /mnt/krb5/krb5.conf
    spark.kubernetes.executorEnv.KRB5_CONFIG: /mnt/krb5/krb5.conf
    spark.kubernetes.file.upload.path: s3a://enrichment/tmp
    spark.kubernetes.kerberos.krb5.configMapName: krb5-config-map
    spark.sql.streaming.metricsEnabled: "true"
    spark.kubernetes.container.image: "my-image"
  applicationTolerations:
    instanceConfig:
      minExecutors: 2
      initExecutors: 1
      maxExecutors: 4
  driverSpec:
    podTemplateSpec:
      spec:
        imagePullSecrets:
          - name: some-nexus
        containers:
          - name: spark-driver
            imagePullPolicy: IfNotPresent
            resources:
              limits:
                cpu: "1000m"
                ...
            volumeMounts:
              - name: app-jar
                mountPath: "/opt/spark/app"
              - name: keytabs
            env:
              - name: SECRETS_ROOT_DIR
                value: /mnt/secrets
        initContainers:
          - name: download-jar
            image: "some-image"
            volumeMounts:
              - name: app-jar
                mountPath: "/opt/spark/app"
            env:
              - name: NEXUS_USERNAME
                value: *****
              ...
            command: ["sh", "-c"]
            args:
              - "curl -u <rest_of_command>"
        volumes:
          - name: app-jar
            emptyDir: {}
          - name: keytabs
            secret:
              secretName: keytabs
      metadata:
        labels:
          version: "3.5.0"
  executorSpec:
    podTemplateSpec:
      spec:
        imagePullSecrets:
          - name: some-nexus
        containers:
          - name: spark-executor
            imagePullPolicy: IfNotPresent
            resources:
              limits:
                cpu: "1000m"
                ...
            volumeMounts:
              - name: app-jar
                mountPath: "/opt/spark/app"
              - name: keytabs
                mountPath: "/mnt/keytabs"
            env:
              - name: SECRETS_ROOT_DIR
                value: /mnt/secrets
        initContainers:
          - name: download-jar
            image: "some-image"
            volumeMounts:
              - name: app-jar
                mountPath: "/opt/spark/app"
            env:
              - name: NEXUS_USERNAME
                value: *****
              ...
            command: ["sh", "-c"]
            args:
              - "curl -u <rest_of_command>"
        volumes:
          - name: app-jar
            emptyDir: {}
          - name: keytabs
            secret:
              secretName: keytabs
        ...
      metadata:
        labels:
          version: "3.5.0"

Has anyone faced this issue? I would appreciate any help on this matter.

Thanks,
Nilanjan Sarkar
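
P.S. My next step is sketched below: checking why the executor pods never reach Running. This assumes the executor pods still carry the standard spark-role=executor label that Spark on Kubernetes sets, and <executor-pod-name> is just a placeholder.

# List the executor pods for the application (label assumed as noted above)
kubectl -n my-namespace get pods -l spark-role=executor

# Look for scheduling, image-pull, or volume-mount errors on one of them
kubectl -n my-namespace describe pod <executor-pod-name>

# Recent events in the namespace, newest last
kubectl -n my-namespace get events --sort-by=.lastTimestamp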