Hi, I have so far received pretty much no comments on the technical details outlined in the KIP. While I am happy to continue with my own ideas of how to implement this, I would much prefer to at least get a very broad "looks good in principle, but still lots to flesh out" from a few people before I put more work into this.
Best regards,
Sönke

On Tue, 21 May 2019 at 14:15, Sönke Liebau <soenke.lie...@opencore.com> wrote:
> Hi everybody,
>
> I'd like to rekindle the discussion around KIP-317. I have reworked the KIP a little bit in order to design everything as a pluggable implementation. During the course of that work I've also decided to rename the KIP, as encryption will only be transparent in some cases. It is now called "Add end to end data encryption functionality to Apache Kafka" [1].
>
> I'd very much appreciate it if you could give the KIP a quick read. This is not at this point a fully fleshed out design, as I would like to agree on the underlying structure that I came up with first, before spending time on details.
>
> TL/DR is:
> Create three pluggable classes:
> KeyManager runs on the broker and manages which keys to use, key rollover etc.
> KeyProvider runs on the client and retrieves keys based on what the KeyManager tells it
> EncryptionEngine runs on the client and handles the actual encryption
> A first idea of the control flow between these components can be seen at [2].
>
> Please let me know any thoughts or concerns that you may have!
>
> Best regards,
> Sönke
>
> [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
> [2] https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2
>
> On Fri, 10 Aug 2018 at 14:05, Sönke Liebau <soenke.lie...@opencore.com> wrote:
>> Hi Viktor,
>>
>> thanks for your input! We could accommodate magic headers by removing any known fixed bytes pre-encryption, sticking them in a header field and prepending them again after decryption. However, I am not sure whether this is actually necessary, as most modern algorithms (AES for sure) are considered to be resistant to known-plaintext types of attack.
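The three pluggable classes in the TL/DR above could look roughly like the sketch below (Python used purely for illustration; the method names, the static key manager, and the toy XOR "engine" are my assumptions, not part of the KIP):

```python
# Hypothetical sketch of the KIP's three pluggable components. Only the
# class names KeyManager/KeyProvider/EncryptionEngine come from the KIP;
# every method signature here is an illustrative assumption.
from abc import ABC, abstractmethod

class KeyManager(ABC):
    """Broker-side: decides which key to use, handles rollover etc."""
    @abstractmethod
    def current_key_id(self, topic: str) -> str: ...

class KeyProvider(ABC):
    """Client-side: retrieves key material for the id the KeyManager names."""
    @abstractmethod
    def get_key(self, key_id: str) -> bytes: ...

class EncryptionEngine(ABC):
    """Client-side: performs the actual encryption/decryption."""
    @abstractmethod
    def encrypt(self, key: bytes, plaintext: bytes) -> bytes: ...
    @abstractmethod
    def decrypt(self, key: bytes, ciphertext: bytes) -> bytes: ...

# Toy implementations to show the control flow only -- XOR is NOT encryption.
class StaticKeyManager(KeyManager):
    def current_key_id(self, topic):
        return "key-1"

class InMemoryKeyProvider(KeyProvider):
    def __init__(self, keys):
        self._keys = keys
    def get_key(self, key_id):
        return self._keys[key_id]

class XorEngine(EncryptionEngine):
    def encrypt(self, key, plaintext):
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))
    decrypt = encrypt  # XOR is its own inverse

manager = StaticKeyManager()
provider = InMemoryKeyProvider({"key-1": b"0123456789abcdef"})
engine = XorEngine()

# Control flow: broker names a key id, client fetches it, client encrypts.
key = provider.get_key(manager.current_key_id("my-topic"))
ct = engine.encrypt(key, b"hello kafka")
assert engine.decrypt(key, ct) == b"hello kafka"
```

The point of the split is that key management policy (broker), key retrieval (client), and the cipher itself can each be swapped out independently.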
>> Even if the entire plaintext is known to the attacker, he still needs to brute-force the key - which may take a while.
>>
>> Something different to consider in this context are compression sidechannel attacks like CRIME or BREACH, which may be relevant depending on what type of data is being sent through Kafka. Both of these attacks depend on the encrypted record containing a combination of secret and user-controlled data. For example, if Kafka were used to forward data that a user entered on a website, together with a secret API key that the website adds, to a back-end server, and the user can obtain the Kafka messages, these attacks would become relevant. There is not much we can do about that except disallow compression when encryption is enabled (TLS chose this approach in version 1.3).
>>
>> I agree with you that we definitely need to clearly document any risks and how much security can reasonably be expected in any given scenario. We might even consider logging a warning message when sending data that is both compressed and encrypted.
>>
>> On a different note, I've started amending the KIP to make key management and distribution pluggable; I should hopefully be able to publish sometime Monday.
>>
>> Best regards,
>> Sönke
>>
>> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi <viktorsomo...@gmail.com> wrote:
>>> Hi Sönke,
>>>
>>> Compressing before encrypting has its dangers as well. Suppose you have a known compression format which adds a magic header and you're using a block cipher with a small enough block - then it becomes much easier to figure out the encryption key. For instance, you can look at Snappy's stream identifier: https://github.com/google/snappy/blob/master/framing_format.txt. Based on this you should only use block ciphers whose block sizes are much larger than 6 bytes.
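The CRIME/BREACH-style risk described above is easy to see in a few lines of code (a toy sketch; the secret value and record layout are made up, and zlib stands in for whatever compression codec a topic uses):

```python
# Toy illustration of the compression side channel: when attacker-controlled
# data is compressed together with a secret, the compressed size leaks
# whether the attacker's guess matches the secret. The secret here is an
# invented example, not anything Kafka-specific.
import zlib

SECRET = b"api_key=s3cr3t_token_42;"  # hypothetical secret in the record

def record_size(attacker_guess: bytes) -> int:
    # Secret and user-controlled data end up in the same compressed record.
    return len(zlib.compress(SECRET + attacker_guess, 9))

right = record_size(b"api_key=s3cr3t_token_42;")   # repeats the secret
wrong = record_size(b"api_key=zzzzzz_zzzzz_99;")   # same length, no match

# The matching guess compresses better (LZ77 back-reference covers the
# whole repeated secret), so the record is smaller -- observable by anyone
# who can see record sizes, even if the payload itself is encrypted.
assert right < wrong
```

This is exactly why the combination of compression and encryption deserves either a hard restriction or at least a loud warning, as suggested above.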
>>> AES for instance should be good with its 128 bits = 16 bytes, but even this isn't entirely secure, as the first 6 bytes have already leaked some information - how much depends on the cipher. Also, if we suppose that an adversary accesses a broker and takes all the data, they'll have a much easier job decrypting it, as they'll have many more examples.
>>> So overall we should make sure to define and document the compatible encryption algorithms for the supported compression methods, and the level of security they provide, to make sure users are fully aware of the security implications.
>>>
>>> Cheers,
>>> Viktor
>>>
>>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau <soenke.lie...@opencore.com.invalid> wrote:
>>> > Hi Stephane,
>>> >
>>> > thanks for pointing out the broken pictures, I fixed those.
>>> >
>>> > Regarding encrypting before or after batching the messages, you are correct, I had not thought of compression and how this changes things. Encrypted data does not really compress well. My reasoning at the time of writing was that if we encrypt the entire batch we'd have to wait for the batch to be full before starting to encrypt, whereas with per-message encryption we can encrypt messages as they come in and more or less have them ready for sending when the batch is complete. However, I think the difference will probably not be that large (I will do some testing), and it is offset by encrypting just once instead of many times, which has a certain overhead every time. Also, from a security perspective encrypting longer chunks of data is preferable - another benefit.
>>> >
>>> > This does however take away the ability of the broker to see the individual records inside the encrypted batch, so this would need to be stored and retrieved as a single record - just like is done for compressed batches.
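The known-header concern can be made concrete for the degenerate case of a XOR stream cipher (a toy sketch only; real AES in a proper mode does not leak keystream this way, and the SHA-256-based keystream below is an illustrative assumption, not a proposal):

```python
# Toy demonstration of why a known magic header is dangerous for weak
# ciphers: with a XOR stream cipher, knowing the first plaintext bytes
# hands the attacker the corresponding keystream bytes for free. This only
# illustrates the kind of leakage described above; the key and payload are
# made up.
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    # Counter-mode-style keystream from SHA-256, for illustration only.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

# First 6 bytes of Snappy's framing stream identifier chunk
# (chunk type 0xff, little-endian length 6, then "sNaPpY").
MAGIC = b"\xff\x06\x00\x00sN"
key = b"some secret key"
plaintext = MAGIC + b"compressed payload..."
ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

# The attacker knows the magic bytes, so XOR recovers the keystream prefix:
recovered = bytes(c ^ m for c, m in zip(ciphertext[:6], MAGIC))
assert recovered == keystream(key, 6)
```

A real block cipher does not degrade this gracefully, but the example shows why fixed, predictable bytes at a fixed position are worth either stripping out or documenting as a constraint on acceptable ciphers and modes.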
>>> > I am not 100% sure that this won't create issues, especially when considering transactions; I will need to look at the compression code some more. In essence though, since it works for compression, I see no reason why it can't be made to work here.
>>> >
>>> > On a different note, going down this route might make us reconsider storing the key with the data, as this might significantly reduce storage overhead - still much higher than just storing each key once, though.
>>> >
>>> > Best regards,
>>> > Sönke
>>> >
>>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek <steph...@simplemachines.com.au> wrote:
>>> > > Hi Sonke,
>>> > >
>>> > > Very much needed feature and discussion. FYI the image links seem broken.
>>> > >
>>> > > My 2 cents (if I understood correctly): you say "This process will be implemented after Serializer and Interceptors are done with the message, right before it is added to the batch to be sent, in order to ensure that existing serializers and interceptors keep working with encryption just like without it."
>>> > >
>>> > > I think encryption should happen AFTER a batch is created, right before it is sent. The reason is that if we want to still keep the advantage of compression, encryption needs to happen after it (and I believe compression happens on a batch level).
>>> > > So to me, for a producer: serializer / interceptors => batching => compression => encryption => send, and the inverse for a consumer.
>>> > >
>>> > > Regards,
>>> > > Stephane
>>> > >
>>> > > On 19 June 2018 at 06:46, Sönke Liebau <soenke.lie...@opencore.com.invalid> wrote:
>>> > >
>>> > >> Hi everybody,
>>> > >>
>>> > >> I've created a draft version of KIP-317 which describes the addition of transparent data encryption functionality to Kafka.
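Stephane's proposed ordering can be checked with a quick experiment showing that ciphertext no longer compresses (a toy sketch; the XOR-style "cipher" and the sample batch are illustrative assumptions, not the proposed EncryptionEngine):

```python
# Toy demonstration of why the producer pipeline must compress before it
# encrypts: encrypted data looks random, so a later compression step
# achieves nothing. The SHA-256-based XOR "cipher" is for illustration
# only and is not a recommendation for the real implementation.
import hashlib
import zlib

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    ks = b""
    counter = 0
    while len(ks) < len(data):
        ks += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, ks))

batch = b'{"user":"alice","action":"click"}' * 200  # highly repetitive batch
key = b"some secret key"

compress_then_encrypt = toy_encrypt(key, zlib.compress(batch))
encrypt_then_compress = zlib.compress(toy_encrypt(key, batch))

# Compressing first shrinks the batch dramatically; encrypting first
# destroys all redundancy, so compression afterwards cannot help.
assert len(compress_then_encrypt) < len(batch) // 10
assert len(encrypt_then_compress) >= len(batch)
```

This matches the suggested producer order of serializer / interceptors => batching => compression => encryption => send, with the consumer reversing it.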
>>> > >> Please consider this as a basis for discussion - I am aware that this is not at a level of detail sufficient for implementation, but I wanted to get some feedback from the community on the general idea before spending more time on this.
>>> > >>
>>> > >> The link to the KIP is:
>>> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+transparent+data+encryption+functionality
>>> > >>
>>> > >> Best regards,
>>> > >> Sönke

--
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany