Github user winningsix commented on the pull request:

    https://github.com/apache/spark/pull/8880#issuecomment-186069349
  
    Hi @vanzin, I have a suggestion about how the crypto configuration is handled. 
    Currently we convert the Spark crypto-related configuration into Chimera 
configuration by their prefix (“spark.shuffle.crypto.”).  Taking a closer look at 
the shuffle file encryption configuration, we can classify the keys into two 
categories: 1) common configurations, including 
“spark.shuffle.encryption.enabled”, 
“spark.shuffle.encryption.keySizeBits”, 
“spark.shuffle.encryption.keygen.algorithm” and 
“spark.shuffle.crypto.cipher.transformation”; 2) implementation-specific 
configurations (Chimera now, Apache Commons Crypto later), including 
“spark.shuffle.crypto.cipher.classes” and 
“spark.shuffle.crypto.secure.random.classes”.  I would prefer not to expose the 
configurations in the second category to users.  At the same time, we need to 
change the method ```toChimeraConf``` in class ```CryptoStreamUtils```.  In 
that method, we would simply filter out of the SparkConf all properties intended 
for the crypto library (Chimera now, Apache Commons Crypto later) and pass them 
through unchanged.  Then we don’t need a Spark conf prefix at all and use the 
library prefix only.  We could obtain the following benefits.
     1. Today, crypto-related configuration names follow the convention 
[Spark prefix] + [Chimera property key without the Chimera prefix] (see the 
sketch after this list).  With this change, Spark gets rid of the restrictions 
on configuration names for the crypto library, and users still configure the 
library using its original configuration keys.
     2. Once Chimera is governed by Apache Commons, we only need to change the 
prefix; no other changes are required.
     3. It is better for backward compatibility.  For example, suppose a user sets 
the property “spark.shuffle.crypto.secure.random.classes” to the value 
“com.intel.chimera.classname”.  If Chimera is accepted into Apache only after a 
Spark release that exposes this key, then when the user upgrades, the value has 
to be updated as well.  Not exposing such keys avoids this problem.
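    To illustrate point 1, here is a rough sketch of the kind of key-name 
conversion the current approach implies.  The prefix constants and the exact 
rewriting below are assumptions for illustration only, not the exact code in 
this PR:
    ```
      // Sketch only: rewrite a "spark.shuffle.crypto.xxx" key into "chimera.crypto.xxx"
      // by stripping the assumed Spark prefix and prepending the assumed library prefix.
      def toChimeraConf(conf: SparkConf): Properties = {
        val sparkPrefix = "spark.shuffle.crypto."   // assumed Spark-side prefix
        val chimeraPrefix = "chimera.crypto."       // assumed Chimera-side prefix
        val props = new Properties()
        conf.getAll.foreach { case (k, v) =>
          if (k.startsWith(sparkPrefix)) {
            props.put(chimeraPrefix + k.substring(sparkPrefix.length), v)
          }
        }
        props
      }
    ```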
    
    Changes will include:
     1. Remove the configurations “spark.shuffle.crypto.cipher.classes” and 
“spark.shuffle.crypto.secure.random.classes” from the Spark side
     2. Remove the variable ```SPARK_CHIMERA_CONF_PREFIX``` in 
```CryptoStreamUtils```
     3. Change the implementation of ```toChimeraConf``` in 
```CryptoStreamUtils``` to something like the following (renaming it to 
```toCryptoConf```):
    ```
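      // Copy into the returned Properties every SparkConf entry that already
      // carries the crypto library's own prefix, leaving key names untouched.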
      def toCryptoConf(conf: SparkConf, cryptoLibPrefix: String): Properties = {
        val props = new Properties()
        conf.getAll.foreach { case (k, v) =>
          if (k.startsWith(cryptoLibPrefix)) {
            props.put(k, v)
          }
        }
        props
      }
    ```
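    For illustration, usage would then look roughly like the following.  The 
“chimera.” prefix and the example key and value are assumptions, not keys this 
PR defines:
    ```
      // Hypothetical usage sketch: library keys are set verbatim in SparkConf
      // and passed through to the crypto library without renaming.
      val conf = new SparkConf()
        .set("spark.shuffle.encryption.enabled", "true")
        .set("chimera.crypto.cipher.classes", "com.example.SomeCipher") // example library key

      val props = toCryptoConf(conf, "chimera.")
      // props now contains only the "chimera."-prefixed entries, ready to hand
      // to the crypto library.
    ```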
    Any thoughts about this change?

