xkrogen commented on pull request #31591:
URL: https://github.com/apache/spark/pull/31591#issuecomment-782437318


   Good point about other resource managers. I'm not familiar enough with 
Kubernetes or Mesos to have any opinion there or understand if the same problem 
exists. Hopefully someone more knowledgeable can chime in.
   
   One other idea I've been toying with is to treat this similarly to 
`spark.sql.hive.metastore.jars`:
   ```
     val HIVE_METASTORE_JARS = buildStaticConf("spark.sql.hive.metastore.jars")
       .doc(s"""
         | Location of the jars that should be used to instantiate the HiveMetastoreClient.
         | This property can be one of four options:
   ...
         | 3. "path"
         |   Use Hive jars configured by `spark.sql.hive.metastore.jars.path`
         |   in comma separated format. Support both local or remote paths.The provided jars
         |   should be the same version as ${HIVE_METASTORE_VERSION}.
         | 4. A classpath in the standard format for both Hive and Hadoop. The provided jars
         |   should be the same version as ${HIVE_METASTORE_VERSION}.
         """.stripMargin)
   ```
   With this config (setting aside options 1 and 2, which aren't relevant here), 
you can either supply classpath entries directly, like `dep1.jar:dep2.jar`, or 
use the `path` option and supply file paths, which get no special handling. If 
you specify an absolute local path, it is your responsibility to make sure that 
path exists on all worker nodes. This also reminds me that an absolute local 
path that is expected to be present on all nodes _is_ a valid use case, which 
may be a reason not to go with the current proposal, where we intercept 
`spark.jars.ivySettings` and always put it into the distributed cache.
   
   Similar to this Hive JARs conf, we could enhance `spark.jars.ivySettings` to 
accept a URI instead of just a local file path, allowing for either local or 
remote paths. We could also define a custom scheme like 
`classpath://ivysettings.xml` to indicate that the URI points to a classpath 
entry. Users would then have the flexibility to use a local path deployed to 
all nodes, point to a single remote path, or put the file onto the classpath 
using a mechanism like `--jars`, placing it into `SPARK_CONF_DIR`, etc. URIs 
without a scheme would be treated as local paths for backwards compatibility. 
WDYT?
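   
   To make the idea concrete, here is a rough sketch of how the scheme 
dispatch could look. Nothing here exists in Spark today: `IvySettingsSource`, 
its subclasses, and `IvySettingsResolver.resolve` are hypothetical names, and 
the `classpath` scheme is only the proposal above, not an existing convention.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

// Hypothetical model of where an Ivy settings file could come from.
sealed trait IvySettingsSource
final case class LocalFile(path: String) extends IvySettingsSource
final case class ClasspathResource(name: String) extends IvySettingsSource
final case class RemoteFile(uri: URI) extends IvySettingsSource

object IvySettingsResolver {
  // Classify a `spark.jars.ivySettings` value by its URI scheme.
  // Values with no scheme, or values that do not parse as a URI at all,
  // are treated as plain local paths for backwards compatibility.
  def resolve(value: String): IvySettingsSource = {
    val schemeAndUri = Try(new URI(value)).toOption
      .flatMap(u => Option(u.getScheme).map(s => (s.toLowerCase(Locale.ROOT), u)))
    schemeAndUri match {
      case None =>
        LocalFile(value)
      case Some(("file", uri)) =>
        LocalFile(Option(uri.getPath).getOrElse(uri.getSchemeSpecificPart))
      case Some(("classpath", uri)) =>
        // Accept both classpath://name and classpath:name.
        ClasspathResource(uri.getSchemeSpecificPart.dropWhile(_ == '/'))
      case Some((_, uri)) =>
        // Any other scheme (hdfs, s3a, ...) is assumed to be a remote file.
        RemoteFile(uri)
    }
  }
}
```

   One known gap in this sketch: a Windows-style value such as `C:/conf/x.xml` 
parses with scheme `c` and would be classified as remote, so a real 
implementation would need extra handling there.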




