Over the weekend I started looking at what it would take to add data encryption to Kudu (besides using filesystem encryption via dm-crypt or something like that).
Here are a few notes - please feel free to comment on them and add suggestions:

- Reading through this mailing list, it looks like this feature was asked about a couple of times last year, but from what I can tell, no one is currently working on it.
- A client-based approach to encryption like the one used by HDFS wouldn't work (at least out of the box): for instance, encrypting the primary key at the client would make range filters on scans impossible. It might work for the columns that are not part of the primary key.
- There's already code in Kudu for several compression codecs (LZ4, gzip, etc.); I thought it would be possible to add similar code for encryption codecs (to be applied after the compression, of course).
- The WAL files and delta files should be encrypted in the same way.
- I'm not sure what the best way to manage the key would be. I see that HDFS uses a double-key mechanism, where the encryption key for the data file is itself encrypted with the allowed user key, and the whole process is managed by an external Key Management Service.

Thanks in advance for your ideas and suggestions,
Franco
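
P.S. To make the double-key idea from the last note a bit more concrete, here is a rough Python sketch of how it could look. Everything in it is an assumption for illustration only: `ToyKeyManagementService`, `encrypt_file_block`, `decrypt_file_block` and the "master key" naming are hypothetical, not Kudu or HDFS APIs, and `toy_stream_xor` is just a standard-library stand-in for a real cipher such as AES (it is not authenticated and not meant for production).

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple


def toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 counter-mode keystream.

    Stand-in for a real cipher (e.g. AES); applying it twice with the
    same key and nonce returns the original bytes.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        keystream = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, keystream))
    return bytes(out)


class ToyKeyManagementService:
    """Hypothetical KMS: keeps the master key private, only wraps/unwraps."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)

    def wrap(self, data_key: bytes) -> Tuple[bytes, bytes]:
        # Encrypt the per-file data key with the master key.
        nonce = os.urandom(16)
        return nonce, toy_stream_xor(self._master_key, nonce, data_key)

    def unwrap(self, nonce: bytes, wrapped_key: bytes) -> bytes:
        return toy_stream_xor(self._master_key, nonce, wrapped_key)


def encrypt_file_block(kms: ToyKeyManagementService,
                       plaintext: bytes) -> Dict[str, bytes]:
    # A fresh data key per file; only its wrapped form is stored on disk.
    data_key = os.urandom(32)
    data_nonce = os.urandom(16)
    ciphertext = toy_stream_xor(data_key, data_nonce, plaintext)
    key_nonce, wrapped_key = kms.wrap(data_key)
    return {
        "key_nonce": key_nonce,
        "wrapped_key": wrapped_key,
        "data_nonce": data_nonce,
        "ciphertext": ciphertext,
    }


def decrypt_file_block(kms: ToyKeyManagementService,
                       stored: Dict[str, bytes]) -> bytes:
    # Ask the KMS to unwrap the data key, then decrypt the block locally.
    data_key = kms.unwrap(stored["key_nonce"], stored["wrapped_key"])
    return toy_stream_xor(data_key, stored["data_nonce"], stored["ciphertext"])
```

The point of the split is that the storage side only ever persists the wrapped data key next to the data, so access can be revoked or the master key rotated by re-wrapping small keys rather than re-encrypting whole files.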
