skonto commented on a change in pull request #24702: [SPARK-27989] [Kubernetes] 
[Core] Added retries on the connection to the driver for k8s
URL: https://github.com/apache/spark/pull/24702#discussion_r292404320
 
 

 ##########
 File path: 
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
 ##########
 @@ -51,6 +51,8 @@ ENV SPARK_HOME /opt/spark
 
 WORKDIR /opt/spark/work-dir
 RUN chmod g+w /opt/spark/work-dir
+#Disable negative dns reslolution https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html
+RUN sed -i -e 's/networkaddress.cache.negative.ttl=10/networkaddress.cache.negative.ttl=0/g' /usr/lib/jvm/java-1.8-openjdk/jre/lib/security/java.security
 
 Review comment:
   @jlpedrosa
   >  I saw the option of sending the whole file, I thought it was too 
complicated for people, and I think this issue is mostly present in K8s.
   
   I am not saying you should send the whole file; that is just one option. 
The user could create it on the fly in their entrypoint script based on env 
vars, or run the `sed` there based on the env vars. In my custom image, for 
example, I have a Prometheus config file that works for all my deployments; 
users may even choose to have these security properties predefined in the file 
in the image. I don't see why you need to hardcode them in the Dockerfile. It 
is restrictive, because if I don't want that as the default I will need to 
derive from the Dockerfile. 
   Anyway, my 2 cents.
   Btw, the Dockerfile you are modifying is for k8s only, so whatever you 
change in it affects only k8s.
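
   A minimal sketch of the entrypoint approach, assuming the image keeps the 
same `java.security` path as in the diff above. The variable names 
`SPARK_DNS_NEGATIVE_TTL` and `JAVA_SECURITY_FILE` are hypothetical and are not 
existing Spark options, and the file must be writable by whichever user runs 
the entrypoint:

```shell
#!/bin/sh
# Hypothetical entrypoint helper: change the JVM negative DNS cache TTL only
# when the user opts in. SPARK_DNS_NEGATIVE_TTL and JAVA_SECURITY_FILE are
# assumed names for illustration, not existing Spark settings.

set_negative_dns_ttl() {
  file="$1"
  ttl="$2"
  # Replace whatever value is currently set rather than assuming it is 10.
  sed -i -e "s/^networkaddress\.cache\.negative\.ttl=.*/networkaddress.cache.negative.ttl=${ttl}/" "$file"
}

# Path taken from the diff; other base images will differ.
JAVA_SECURITY_FILE="${JAVA_SECURITY_FILE:-/usr/lib/jvm/java-1.8-openjdk/jre/lib/security/java.security}"

# Leave the JVM default untouched unless the variable is set and the file exists.
if [ -n "${SPARK_DNS_NEGATIVE_TTL:-}" ] && [ -f "$JAVA_SECURITY_FILE" ]; then
  set_negative_dns_ttl "$JAVA_SECURITY_FILE" "$SPARK_DNS_NEGATIVE_TTL"
fi
```

   With something like this the image default stays as shipped, and a 
deployment that wants the behavior from this PR would set 
`SPARK_DNS_NEGATIVE_TTL=0` in the pod environment.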
   
   > AFAIK that DNS negative lookup is NOT cached, not at least in linux, 
positive caching yes, retries and timeouts, and it also depends on 
distribution, not all of them have it enabled (and there are different ways to 
achieve so).
   
   It seems it is enabled by default in Ubuntu: 
https://github.com/systemd/systemd/issues/5552#issuecomment-499701256. On my 
system (a fresh bionic install) I have `Cache=yes` by default, which means all 
caching is enabled. Also check 
[here](https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1668771) and 
[here](https://github.com/systemd/systemd/blob/7895b07a720210ff8c0d46c695cda32e7b5fa9e6/src/resolve/resolved-dns-cache.c#L731) 
for the existence of negative caching in systemd. Of course, Ubuntu is only one 
OS. That Ubuntu uses systemd for this is also confirmed 
[here](https://askubuntu.com/questions/2219/how-do-i-clear-the-dns-cache/929478#929478).
   In general this is not supported at the OS level, as you mentioned, but it 
might be provided by a service: 
https://stackoverflow.com/questions/11020027/dns-caching-in-linux
   In that case disabling the Java cache won't do much, as you will also need 
to purge the OS one, correct?
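
   To check whether a host is in that situation, something along these lines 
could report the systemd-resolved cache setting. This is only a sketch: the 
function name `resolved_cache_setting` is made up for illustration, it reads a 
single config file and ignores drop-in directories, and it reports `yes` when 
the setting is absent or commented out, since that is the documented default:

```shell
#!/bin/sh
# Print the Cache= value from a systemd-resolved config file.
# Simplification: drop-in directories (resolved.conf.d) are ignored, and a
# missing or commented-out setting is reported as "yes", the documented default.
resolved_cache_setting() {
  conf="${1:-/etc/systemd/resolved.conf}"
  value=""
  if [ -f "$conf" ]; then
    # Take the last uncommented Cache= line, if any.
    value=$(sed -n 's/^[[:space:]]*Cache=[[:space:]]*//p' "$conf" | tail -n 1)
  fi
  echo "${value:-yes}"
}
```

   If this prints `yes`, the OS cache can be purged with 
`systemd-resolve --flush-caches`, or `resolvectl flush-caches` on newer 
systemd releases. As far as I know, newer releases also accept 
`Cache=no-negative`, which keeps positive caching only.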
   
   > What will happen if we don't disable caching, the first call to resolve, 
will wait the OS timeout (I think 5 seconds is the default), then the 
subsequent calls will just don't even try because the java layer won't try to 
invoke it. here and here
   
   It sounds reasonable, and in general I'm OK with the PR as long as setting 
it up is not enforced at the Dockerfile level.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
