Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/22081
  
    I've just pushed up my PR which is ~ in sync with this one; I'll close that 
one now and this can be the one to use. 
    
    Assume: Kinesis uses bouncy castle somewhere. There are some hints in the 
AWS docs:
    
    [Encrypt and Decrypt Amazon Kinesis Records Using AWS 
KMS](https://aws.amazon.com/blogs/big-data/encrypt-and-decrypt-amazon-kinesis-records-using-aws-kms/)
 covers end-to-end encryption of Kinesis records. For this you need the AWS 
encryption SDK, whose docs [say you need bouncy 
castle](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/java.html).
    
    And it looks like the AWS encryption SDK does explicitly [depend on bouncy 
castle](http://mvnrepository.com/artifact/com.amazonaws/aws-encryption-sdk-java/1.3.5).
    
    Imagine if *somehow* the removal of bouncy castle as a Java crypto provider 
was stopping that round trip from working, with some of the encryption/decryption 
not happening. In that case, adding bouncy castle should fix things. It worked 
before because jets3t in spark-core added bouncy castle, and the last 
bouncy-castle version update made it in sync with kinesis (and broke jets3t, 
but nobody has noticed...)
    
    But
    * There are no refs to javax.crypto, the AWS crypto libs, or calls to the 
class `KinesisEncryptionUtils` referenced in the blog post in the spark kinesis 
module (it's not in the latest SDKs either).
    * There's no build-time dependency on the aws-sdk encryption, which would 
transitively pull in the bouncy castle stuff.
    * Looking through the aws-sdk-bundle: no refs to javax.crypto in the 
kinesis code; encryption refs limited to the PUT request where you can request 
server-side encryption with a given KMS key. 
    * Nor is there any `com.aws.encryptionsdk` in that bundle, or shaded bouncy 
castle (which is good, as otherwise I'd have to deal with the fact that some 
ASF projects were shipping a shaded version of it unknowingly)
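
    A check like the one above for bundled or shaded packages can be sketched 
as a scan of the JAR's entry names. This is only an illustration: the class name 
`JarPackageScan` and the path fragments searched for are my assumptions, and the 
JAR path is whatever bundle you point it at. It only matches entry paths, so it 
will not find references to `javax.crypto` inside class files; that needs a 
bytecode-level search.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: lists JAR entries whose path contains a fragment,
// e.g. "bouncycastle" also matches relocated (shaded) copies of the package.
final class JarPackageScan {

    /** Returns the names of all entries in the JAR whose path contains the fragment. */
    static List<String> entriesContaining(String jarPath, String fragment) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.contains(fragment))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        String jarPath = args[0]; // path to the bundle JAR, supplied by the caller
        for (String fragment : new String[] {"bouncycastle", "encryptionsdk"}) {
            List<String> hits = entriesContaining(jarPath, fragment);
            System.out.println(fragment + ": " + hits.size() + " matching entries");
            // Print only the first few matches to keep the output readable.
            hits.stream().limit(10).forEach(hit -> System.out.println("  " + hit));
        }
    }
}
```

    Run it with the bundle JAR as the only argument; zero matches for both 
fragments would agree with the observations in the list.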
    
    It could just be that a strong Java crypto provider is needed, and in the 
absence of the unlimited Java crypto JAR in the JDK lib dir (where it's needed 
for kerberos to work), bouncy-castle needs to be on the CP.
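
    One way to see which situation a given JVM is in is to ask it directly. 
The sketch below uses only standard JDK calls (`Cipher.getMaxAllowedKeyLength` 
and `Security.getProvider`); the class name `CryptoPolicyCheck` is made up for 
the example, and "BC" is the name the Bouncy Castle provider registers under. 
Note that a provider JAR on the classpath is not the same as a registered 
provider, so the second check can be false even when the JAR is present.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import javax.crypto.Cipher;

// Hypothetical diagnostic: reports the JCE policy strength and whether
// a provider named "BC" is currently registered with the JVM.
final class CryptoPolicyCheck {

    /** True when the JVM's policy allows AES keys longer than 128 bits. */
    static boolean unlimitedStrength() throws NoSuchAlgorithmException {
        return Cipher.getMaxAllowedKeyLength("AES") > 128;
    }

    /** True when a security provider registered under the name "BC" is installed. */
    static boolean bouncyCastleRegistered() {
        return Security.getProvider("BC") != null;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println("max AES key length: " + Cipher.getMaxAllowedKeyLength("AES"));
        System.out.println("unlimited-strength policy: " + unlimitedStrength());
        System.out.println("BC provider registered: " + bouncyCastleRegistered());
    }
}
```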
    
    What to do?
    
    1. You can remove jets3t independent of the bouncy castle changes, because 
Kinesis isn't going to be using jets3t. The aws-s3 module significantly 
supersedes the jets3t client's functionality, and is the only one you'd expect 
the other parts of the AWS SDK to pick up. 
    1. the bouncy-castle dependency could be upgraded to a later version in the 
kinesis module(s) alone, and explicitly added to kinesis-asl.
    1. Someone needs to do some experiments with what happens to the test suite 
with/without the full JCE and bouncy castle, maybe including more details on 
what's not matching up in the round trip tests.
    1. Maybe add a new test which somehow explores what encryption 
algorithms/keys you get with/without the BC and JCE-unlimited JARs.
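
    As a starting point for that last experiment, something like the following 
could dump the Cipher services each registered provider offers, so the output 
can be diffed between runs with and without the extra JARs. It is a rough 
sketch, not a proposed Spark test: the class name `CipherInventory` and the 
`provider:algorithm` output format are my own choices, and it only sees 
providers that are actually registered with the JVM.

```java
import java.security.Provider;
import java.security.Security;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical probe: prints one sorted "provider:algorithm" line per
// Cipher service, suitable for diffing across JVM/classpath configurations.
final class CipherInventory {

    /** Returns sorted "provider:algorithm" lines for every registered Cipher service. */
    static Set<String> cipherServices() {
        Set<String> lines = new TreeSet<>();
        for (Provider provider : Security.getProviders()) {
            for (Provider.Service service : provider.getServices()) {
                if ("Cipher".equals(service.getType())) {
                    lines.add(provider.getName() + ":" + service.getAlgorithm());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        cipherServices().forEach(System.out::println);
    }
}
```

    Capture the output once per configuration and compare the files; any lines 
that appear only when bouncy castle is registered are candidates for what the 
round trip depends on.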
    
    
    
    
    
    
    



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
