blackberrier commented on issue #5648: URL: https://github.com/apache/dolphinscheduler/issues/5648#issuecomment-868423184
> When using Spark on k8s (Spark 2.4.8), here is a list of what we need to do:
>
> 1. change DolphinScheduler's storage from HDFS to MinIO;
> 2. change killing the YARN job by applicationId (in both worker and master) to stopping the spark-driver pod;
> 3. without YARN log aggregation, we have to aggregate Spark executor logs manually with ELK or some other tool;
> 4. change the Spark job monitoring from the YARN REST API to the k8s API;
> 5. ...
>
> @blackberrier anything else? Please add anything I've missed.
>
> I'm doing a POC to migrate from YARN to k8s.

@geosmart You have considered this issue comprehensively. I think we could add one more item:

1. Build Spark Docker images, and when submitting applications let the user choose or fill in the image name and version parameters (and perhaps other parameters as well).

P.S. Maybe we should think more about your 3rd and 4th points. On the YARN side, we get the YARN application id from the logs by filtering for the pattern "application_xxxxxxxxxx_yyyy", and then query the application state through the YARN REST API. On Kubernetes we don't have an application-id pattern like YARN's, so maybe we should add some pattern (labels) to the pods, or something similar.

P.S. As we discussed via email, maybe this issue is not that urgent? @CalvinKirs
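For reference, the YARN-side filtering described above can be sketched with a regex. This is an illustrative sketch, not DolphinScheduler's actual code; the class name and sample log line are made up:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class YarnAppIdExtractor {
    // Matches YARN application ids of the form "application_<clusterTs>_<seq>",
    // e.g. "application_1624433000000_0042", anywhere in a log line.
    private static final Pattern APP_ID_PATTERN =
            Pattern.compile("application_\\d+_\\d+");

    static String extractAppId(String logLine) {
        Matcher m = APP_ID_PATTERN.matcher(logLine);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical spark-submit log line for illustration.
        String line = "INFO Client: Submitted application application_1624433000000_0042";
        System.out.println(extractAppId(line));
        // -> application_1624433000000_0042
    }
}
```

The extracted id is what would then be fed to the YARN REST API (or `yarn application -kill`) today.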
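On the Kubernetes side, `spark-submit`'s k8s backend already labels the driver pod (e.g. `spark-role=driver` and `spark-app-selector=<app-id>`, per Spark's "Running on Kubernetes" docs), so a label selector could play the role of the YARN application-id pattern when stopping or looking up a job. A minimal sketch; the class name and sample app id are hypothetical:

```java
class SparkDriverSelector {
    // Builds a Kubernetes label selector targeting the driver pod of one
    // Spark application; these labels are applied by spark-submit's k8s backend.
    static String driverSelector(String sparkAppId) {
        return "spark-role=driver,spark-app-selector=" + sparkAppId;
    }

    public static void main(String[] args) {
        String selector = driverSelector("spark-app-20210625100000-0001");
        // The selector can be passed to the Kubernetes API or kubectl, e.g.:
        //   kubectl delete pod -n <namespace> -l <selector>
        // Deleting the driver pod stops the job; Spark cleans up the executors.
        System.out.println(selector);
    }
}
```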
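For the monitoring change (4th point), one option is to map the driver pod's standard Kubernetes phase (`Pending`, `Running`, `Succeeded`, `Failed`, `Unknown`) onto a job state, instead of polling the YARN REST API. A sketch; the job-state names on the scheduler side are hypothetical:

```java
class PodPhaseMapper {
    // Hypothetical scheduler-side job states, for illustration only.
    enum JobState { RUNNING, SUCCESS, FAILURE, UNKNOWN }

    // Maps the standard Kubernetes pod phase of the Spark driver pod
    // to a job state the scheduler could report.
    static JobState fromPodPhase(String phase) {
        switch (phase) {
            case "Pending":
            case "Running":   return JobState.RUNNING;
            case "Succeeded": return JobState.SUCCESS;
            case "Failed":    return JobState.FAILURE;
            default:          return JobState.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromPodPhase("Succeeded"));
        // -> SUCCESS
    }
}
```

The phase itself would come from the k8s API (e.g. reading `status.phase` of the pod found via the driver label selector).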
