chenjunjiedada commented on issue #24879: [SPARK-28042][K8S] Support using hostpath volume mount as local storage
URL: https://github.com/apache/spark/pull/24879#issuecomment-505254902

Good point, @mccheah.

First, I'd like to explain that this patch is not only for hostPath volumes; it also suits other volume types such as PVs. It adjusts the feature build order to adopt volumes, which are already supported as first-class features.

Second, a hostPath volume can be written by a non-root user if we change the file permissions, as described [here](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath).

Lastly, I think users who care about performance will want this, because even though they can define volumes inside a podTemplate, those volumes are not utilized. I have run spark-sql-perf on Spark on Kubernetes at 1T scale. With the patch, most of the shuffle-bound queries improve a lot, especially q17, q25, and q29: when using 4 disks as local storage instead of emptyDir, the query time improves by more than 10x.

In summary, this patch just helps users easily leverage what they already have to improve performance.
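For illustration, here is a rough sketch of how a 4-disk hostPath setup could be expressed as Spark configuration. It only builds the `spark.kubernetes.executor.volumes.hostPath.*` and `spark.local.dir` settings as strings. The volume names, host paths, and mount paths are hypothetical, and whether Spark actually uses the mounted volumes as local storage depends on this patch, so treat it as an assumption rather than the exact configuration used in the benchmark.

```python
def hostpath_local_dir_conf(host_paths, mount_prefix="/tmp/spark-local"):
    """Build executor volume settings for a list of host directories.

    The volume names and mount paths below are hypothetical choices
    made for this sketch, not values taken from the patch.
    """
    conf = {}
    mount_paths = []
    for i, host_path in enumerate(host_paths, start=1):
        name = f"spark-local-dir-{i}"   # hypothetical volume name
        mount = f"{mount_prefix}-{i}"   # hypothetical mount path in the pod
        key = f"spark.kubernetes.executor.volumes.hostPath.{name}"
        conf[f"{key}.mount.path"] = mount
        conf[f"{key}.options.path"] = host_path
        mount_paths.append(mount)
    # Assumption: local directories are pointed at the mounted volumes.
    conf["spark.local.dir"] = ",".join(mount_paths)
    return conf


# Four hypothetical disks on each node.
conf = hostpath_local_dir_conf(
    ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]
)
for k, v in sorted(conf.items()):
    print(f"--conf {k}={v}")
```

The printed `--conf` flags could then be appended to a `spark-submit` command, after adjusting the paths to match the disks actually present on the nodes.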
