Hello, I am working on getting Hadoop running within our organization. Our high-level use case is to be able to say we're running with end-to-end encryption. It looks like there are two major strategies for getting this done in Hadoop:
A) HDFS Transparent Encryption: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
B) Secure Mode: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SecureMode.html

In our case, we are less concerned with Kerberos' user-level authentication; what we want is node-to-node encryption and encryption-at-rest. With other cluster applications, I typically achieve encryption-at-rest with LUKS and then enable the application's TLS settings to get encryption-in-motion. What is my best strategy for Hadoop?

Here are a couple of questions:

1) The docs say I have to create a new directory, but can I configure HDFS Transparent Encryption to operate across an entire volume?

2) If I just need encryption-in-motion, can I do just the "Data confidentiality" part of the Secure Mode doc, or does that depend on setting up Kerberos?

Thank You!

-danny
--
http://dannyman.toldme.com
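P.S. To make my questions concrete, here is a sketch of what I *think* each piece looks like, based on my reading of those two docs. The key name "mykey" and path "/secure" are placeholders of my own, and the property values are my best guesses, so please correct me where I'm off:

```shell
# -- Encryption-at-rest (HDFS Transparent Encryption) --
# My understanding: create a key in the configured KMS, then mark an
# *empty* HDFS directory as an encryption zone; files written under it
# are then encrypted transparently.
hadoop key create mykey
hdfs dfs -mkdir /secure
hdfs crypto -createZone -keyName mykey -path /secure
hdfs crypto -listZones    # confirm the zone exists

# -- Encryption-in-motion --
# The properties I believe are relevant, from the Secure Mode doc:
#   core-site.xml:  hadoop.rpc.protection     = privacy
#   hdfs-site.xml:  dfs.encrypt.data.transfer = true
# My question 2 above is whether these take effect without Kerberos.
```

If a zone can only be created on a directory, I suppose creating one at the root of an empty filesystem would approximate whole-volume coverage, but I'd like confirmation that is sane before I build on it.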