Re: Service Account not being honored using pyspark on Kubernetes

2020-01-29 Thread pisymbol .
On Wed, Jan 29, 2020 at 9:58 PM pisymbol .  wrote:

>
>
> On Wed, Jan 29, 2020 at 5:02 PM pisymbol .  wrote:
>
>>
>> The problem is that when Spark initializes I see the following error:
>>
>> io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
>> User "system:serviceaccount:default:default" cannot watch resource "pods"
>> in
>> API group "" in the namespace "spark"
>>
>>
> If I deploy my "driver" notebook pod in the spark namespace then things
> improve slightly:
>
> " Forbidden!Configured service account doesn't have access. Service
> account may have been revoked. pods is forbidden: User
> "system:serviceaccount:spark:default" cannot list resource "pods"
>
> Again, I don't want spark:default; I want spark:spark for the service
> account. Why aren't my configuration parameters taking effect?
>

For the poor soul who reads this thread and runs into the same issue: the
fix is to set the serviceAccountName in the deployment for the driver pod to
"spark". I'm not sure why this has to be done, but it works.
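A likely explanation: in client mode the driver runs inside the notebook pod
itself, so it authenticates to the API server with whatever service account
token Kubernetes mounted into that pod. The
spark.kubernetes.authenticate.driver.serviceAccountName property only applies
in cluster mode, where spark-submit creates the driver pod for you. A minimal
sketch of the relevant part of the notebook Deployment (the surrounding
fields and the "jupyter-notebook" name are illustrative; only
serviceAccountName, the namespace, and the image come from this thread):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-notebook        # hypothetical name
  namespace: spark
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jupyter-notebook
  template:
    metadata:
      labels:
        app: jupyter-notebook
    spec:
      serviceAccountName: spark   # the fix: the driver now runs as spark:spark
      containers:
        - name: notebook
          image: pidocker-docker-registry:5000/my-spark-py:v2.4.4
```

With this in place, the mounted token belongs to spark:spark, which the
clusterrolebinding below grants "edit" access, so the driver can watch pods.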

-aps


Re: Service Account not being honored using pyspark on Kubernetes

2020-01-29 Thread pisymbol .
On Wed, Jan 29, 2020 at 5:02 PM pisymbol .  wrote:

>
> The problem is that when Spark initializes I see the following error:
>
> io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
> User "system:serviceaccount:default:default" cannot watch resource "pods"
> in
> API group "" in the namespace "spark"
>
>
If I deploy my "driver" notebook pod in the spark namespace then things
improve slightly:

" Forbidden!Configured service account doesn't have access. Service account
may have been revoked. pods is forbidden: User
"system:serviceaccount:spark:default" cannot list resource "pods"

Again, I don't want spark:default; I want spark:spark for the service
account. Why aren't my configuration parameters taking effect?

-aps


Service Account not being honored using pyspark on Kubernetes

2020-01-29 Thread pisymbol .
I am on k8s 1.17 in a small 4 node cluster. I am running Spark 2.4.4 but
with
updated kubernetes-client jars to work around the 403 CVE issue.

I am running on a pod in the 'default' namespace of my cluster in a Jupyter
notebook. I am trying to configure 'client mode' so I can use pyspark
interactively and watch work done on the executors.

Here is my SparkConf:

sparkConf = SparkConf()
sparkConf.setMaster("k8s://https://192.168.0.100:6443")
sparkConf.setAppName("pispark")
sparkConf.set("spark.kubernetes.container.image",
"pidocker-docker-registry:5000/my-spark-py:v2.4.4")
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "3")
sparkConf.set("spark.driver.memory", "512m")
sparkConf.set("spark.executor.memory", "512m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName",
"spark")
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.container.image.pullSecrets",
"pidocker-docker-registry-secret")

spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
sc = spark.sparkContext

The problem is that when Spark initializes I see the following error:

io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
User "system:serviceaccount:default:default" cannot watch resource "pods" in
API group "" in the namespace "spark"

But I am not using "default:default"; I am using "spark:spark", which has
"edit" access via a clusterrolebinding in that namespace:

$ k describe clusterrolebinding/spark-role -n spark
Name:         spark-role
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  edit
Subjects:
  Kind            Name   Namespace
  ----            ----   ---------
  ServiceAccount  spark  spark
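One way to see what is actually happening is to read the "User" in the
denial message: Kubernetes RBAC names service accounts as
"system:serviceaccount:<namespace>:<name>". A small helper (hypothetical,
just for illustration) makes the mismatch explicit:

```python
# The "User" in a Kubernetes RBAC denial encodes the service account as
# system:serviceaccount:<namespace>:<name>. Unpacking the one from the
# error above shows the driver is authenticating as default:default,
# i.e. with the pod's own mounted service account, not the spark:spark
# account named in SparkConf.
def parse_sa_user(user: str) -> tuple:
    prefix, kind, namespace, name = user.split(":")
    assert (prefix, kind) == ("system", "serviceaccount")
    return (namespace, name)

print(parse_sa_user("system:serviceaccount:default:default"))  # ('default', 'default')
```

So the binding above grants access to spark:spark, but the driver is not
presenting spark:spark's credentials at all.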

What am I doing wrong?

-aps