skonto opened a new pull request #23546: [SPARK-23153][K8s] Support client dependencies for HCFS
URL: https://github.com/apache/spark/pull/23546

## What changes were proposed in this pull request?
- Solves the issue with `--packages`. Keep in mind some related [issues](https://issues.apache.org/jira/browse/SPARK-22657) from the past.
- Uses a custom scheme (`client://` in the example below) to denote local dependencies. The dependencies are uploaded to the HCFS, and the driver then serves them via the Spark file server.

TODO: add integration tests using [minio](https://github.com/minio/cookbook/blob/master/docs/apache-spark-with-minio.md).

## How was this patch tested?
- Ran the integration test suite.
- Ran an example using S3:

```
./bin/spark-submit \
  --verbose \
  --master k8s://https://..... \
  --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.6 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.memory=1G \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
  --conf spark.driver.memory=1G \
  --conf spark.executor.instances=2 \
  --conf spark.sql.streaming.metricsEnabled=true \
  --conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image=skonto/spark:k8s-3.0.0 \
  --conf spark.kubernetes.file.upload.path=s3a://fdp-stavros-test \
  --conf spark.hadoop.fs.s3a.access.key=... \
  --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.hadoop.fs.s3a.fast.upload=true \
  --conf spark.kubernetes.executor.deleteOnTermination=false \
  --conf spark.hadoop.fs.s3a.secret.key=... \
  --conf spark.files=client:///home/ubuntu/resolv.conf \
  client:///my.jar
```
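To make the custom-scheme idea concrete, here is a minimal Python sketch of how a submission client could tell client-local dependencies apart from other URIs and compute where each one would land under `spark.kubernetes.file.upload.path`. It is only an illustration, not the code in this pull request: the helper names `is_client_local` and `upload_target` are hypothetical, and the flat `<upload path>/<file name>` layout is an assumption, since the description does not say how uploaded files are arranged on the HCFS.

```python
import posixpath
from urllib.parse import urlparse

# Scheme used in the example command to mark files that live on the
# machine running spark-submit.
CLIENT_SCHEME = "client"


def is_client_local(uri: str) -> bool:
    """Return True if the URI uses the client-local scheme."""
    return urlparse(uri).scheme == CLIENT_SCHEME


def upload_target(uri: str, upload_path: str) -> str:
    """Map a client-local URI to an assumed location under upload_path.

    URIs with any other scheme are returned unchanged. The flat
    "<upload_path>/<file name>" layout is an assumption for illustration.
    """
    if not is_client_local(uri):
        return uri
    file_name = posixpath.basename(urlparse(uri).path)
    return upload_path.rstrip("/") + "/" + file_name


if __name__ == "__main__":
    upload_path = "s3a://fdp-stavros-test"
    for dep in ["client:///home/ubuntu/resolv.conf", "client:///my.jar"]:
        print(dep, "->", upload_target(dep, upload_path))
```

Under these assumptions, only URIs with the `client` scheme are rewritten; anything already on a remote file system or inside the container image is passed through untouched.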
