kevinclcn opened a new issue, #3409:
URL: https://github.com/apache/incubator-kyuubi/issues/3409

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/incubator-kyuubi/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the feature
   
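   Currently, Kyuubi engines can run on YARN or K8s to execute tasks submitted over JDBC. In cloud-native environments, however, cloud providers usually offer elastic compute resources, such as Alibaba Cloud's MaxCompute and AWS Glue. If Kyuubi engines could run on MaxCompute and Glue, the cost of running and maintaining Spark could be greatly reduced.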
   
   Alibaba Cloud's API for running Spark jobs through MaxCompute:
   https://help.aliyun.com/document_detail/102357.html
   
   AWS's API for running Spark jobs through Glue:
   https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html#aws-glue-api-jobs-job-CreateJob
   
   ### Motivation
   
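   Currently, Kyuubi engines can only run on YARN or K8s, so in a cloud-native environment you must provision either EMR resources or K8s compute nodes. This raises two problems:
   1. EMR and K8s resources are not elastic: when there are few tasks, they cannot scale in to reduce hardware cost, and when there are many tasks, they cannot scale out to speed up computation.
   2. In a cloud environment, if you use elastic compute resources such as MaxCompute, JDBC access is only available through an interactive query engine such as Trino, so the SQL dialects used by offline jobs and interactive queries are not fully consistent.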
   
   ### Describe the solution
   
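   By running Kyuubi engines on elastic Spark compute resources such as MaxCompute and Glue, offline batch jobs and interactive queries can share the same Spark SQL capabilities, and compute resources become elastic, saving infrastructure and operations costs.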
   
   ### Additional context
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

