Hi Vanzin,

"spark.authenticate" is working properly for our environment (Spark 2.4 on 
Kubernetes).
We have made a few code changes, through which secure communication between the 
driver and executor works fine using a shared spark.authenticate.secret.

SASL encryption also works, but when we set 
spark.network.crypto.enabled true
to enable AES-based encryption, we see RPC timeout errors sporadically.
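For reference, a minimal sketch of the spark-submit invocation we use is below; the API server address, container image, application jar, and the shared secret are placeholders, not our actual values:

```shell
# Sketch of a spark-submit enabling AES-based RPC encryption on Kubernetes.
# <api-server>, <spark-image>, and <some-value> are placeholders.
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<spark-image> \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=<some-value> \
  --conf spark.network.crypto.enabled=true \
  --conf spark.network.crypto.keyLength=256 \
  --conf spark.network.crypto.saslFallback=false \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
```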

Kind Regards,
Breeta


-----Original Message-----
From: Marcelo Vanzin <van...@cloudera.com> 
Sent: Tuesday, March 26, 2019 9:10 PM
To: Sinha, Breeta (Nokia - IN/Bangalore) <breeta.si...@nokia.com>
Cc: user@spark.apache.org
Subject: Re: RPC timeout error for AES based encryption between driver and 
executor

I don't think "spark.authenticate" works properly with k8s in 2.4 (which would 
make it impossible to enable encryption since it requires authentication). I'm 
pretty sure I fixed it in master, though.

On Tue, Mar 26, 2019 at 2:29 AM Sinha, Breeta (Nokia - IN/Bangalore) 
<breeta.si...@nokia.com> wrote:
>
> Hi All,
>
>
>
> We are trying to enable RPC encryption between driver and executor. Currently 
> we're working on Spark 2.4 on Kubernetes.
>
>
>
> According to Apache Spark Security document 
> (https://spark.apache.org/docs/latest/security.html) and our understanding on 
> the same, it is clear that Spark supports AES-based encryption for RPC 
> connections. There is also support for SASL-based encryption, although it 
> should be considered deprecated.
>
>
>
> Setting spark.network.crypto.enabled to true enables AES-based RPC encryption.
>
> However, when we enable AES-based encryption between the driver and executor, we 
> observe sporadic failures in the communication between them in the logs.
>
>
>
> Following are the options and the values we used for 
> enabling encryption:
>
>
>
> spark.authenticate true
>
> spark.authenticate.secret <some-value>
>
> spark.network.crypto.enabled true
>
> spark.network.crypto.keyLength 256
>
> spark.network.crypto.saslFallback false
>
>
>
> A snippet of the executor log is provided below:-
>
> Exception in thread "main" 19/02/26 07:27:08 ERROR RpcOutboxMessage: 
> Ask timeout before connecting successfully
>
> Caused by: java.util.concurrent.TimeoutException: Cannot receive any 
> reply from 
> sts-spark-thrift-server-1551165767426-driver-svc.default.svc:7078 in 
> 120 seconds
>
>
>
> However, no corresponding error or message from the executor appears in the 
> driver log at the same timestamp.
>
>
>
> We also tried increasing spark.network.timeout, but no luck.
>
>
>
> This issue is seen sporadically, as per the following 
> observations:
>
> 1) Sometimes, enabling AES encryption works completely fine.
>
> 2) Sometimes, enabling AES encryption works fine for around 10 consecutive 
> spark-submits, but the next spark-submit hangs with the above-mentioned error 
> in the executor log.
>
> 3) At other times, enabling AES encryption does not work at all: Spark keeps 
> spawning executors (more than 50), each of which fails with the 
> above-mentioned error.
>
> Even setting spark.network.crypto.saslFallback to true did not help.
>
>
>
> Things work fine when we enable SASL encryption, that is, when we set only 
> the following parameters:
>
> spark.authenticate true
>
> spark.authenticate.secret <some-value>
>
>
>
> I have attached a log file containing the detailed error message. Please let us 
> know if any configuration is missing or if anyone has faced the same issue.
>
>
>
> Any leads would be highly appreciated!
>
>
>
> Kind Regards,
>
> Breeta Sinha
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org



--
Marcelo
