Hi Andrew, thanks for your feedback! I am curious, though: why are you doubtful about getting a committer to volunteer an opinion? Shouldn't this be in their interest as well?
I'll just continue along for now and start building a very rough proof-of-concept implementation based on what's in the KIP so far, to flesh out more details and add them to the KIP as I go along. Best regards, Sönke On Wed, 7 Aug 2019 at 18:18, Andrew Schofield <andrew_schofi...@live.com> wrote: > Hi, > I think this is a useful KIP and it looks good in principle. While it can > all be done using > interceptors, if the brokers do not know anything about it, you need to > maintain the > mapping from topics to key ids somewhere external. I'd prefer the way > you've done it. > > I'm not sure whether you'll manage to interest any committers in > volunteering an > opinion, and you'll need that before you can get the KIP accepted into > Kafka. > > Thanks, > Andrew Schofield (IBM) > > On 06/08/2019, 15:46, "Sönke Liebau" <soenke.lie...@opencore.com.INVALID> > wrote: > > Hi, > > I have so far received pretty much no comments on the technical details > outlined in the KIP. While I am happy to continue with my own ideas of > how > to implement this, I would much prefer to at least get a very broad > "looks > good in principle, but still lots to flesh out" from a few people > before I > put more work into this. > > Best regards, > Sönke > > > > > On Tue, 21 May 2019 at 14:15, Sönke Liebau <soenke.lie...@opencore.com > > > wrote: > > > Hi everybody, > > > > I'd like to rekindle the discussion around KIP-317. > > I have reworked the KIP a little bit in order to design everything > as a > > pluggable implementation. During the course of that work I've also > decided > > to rename the KIP, as encryption will only be transparent in some > cases. It > > is now called "Add end to end data encryption functionality to Apache > > Kafka" [1]. > > > > I'd very much appreciate it if you could give the KIP a quick read. > This > > is not at this point a fully fleshed out design, as I would like to > agree > > on the underlying structure that I came up with first, before > spending time > > on details. 
> > > > TL;DR is: > > Create three pluggable classes: > > KeyManager runs on the broker and manages which keys to use, key > rollover > > etc. > > KeyProvider runs on the client and retrieves keys based on what the > > KeyManager tells it > > EncryptionEngine runs on the client and handles the actual encryption > > First idea of control flow between these components can be seen at > [2] > > > > Please let me know any thoughts or concerns that you may have! > > > > Best regards, > > Sönke > > > > [1] > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka > > [2] > > > https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1&modificationDate=1558439227551&api=v2 > > > > > > > > On Fri, 10 Aug 2018 at 14:05, Sönke Liebau < > soenke.lie...@opencore.com> > > wrote: > > > >> Hi Viktor, > >> > >> thanks for your input! We could accommodate magic headers by > removing any > >> known fixed bytes pre-encryption, sticking them in a header field > and > >> prepending them after decryption. However, I am not sure whether > this is > >> actually necessary, as most modern (AES for sure) algorithms are > considered > >> to be resistant to known-plaintext types of attack. Even if the > entire > >> plaintext is known to the attacker he still needs to brute-force > the key - > >> which may take a while. 
> >> > >> Something different to consider in this context are compression > >> sidechannel attacks like CRIME or BREACH, which may be relevant > depending > >> on what type of data is being sent through Kafka. Both these > attacks depend > >> on the encrypted record containing a combination of secret and user > >> controlled data. > >> For example if Kafka was used to forward data that the user entered > on a > >> website along with a secret API key that the website adds to a > back-end > >> server and the user can obtain the Kafka messages, these attacks > would > >> become relevant. Not much we can do about that except disallow > encryption > >> when compression is enabled (TLS chose this approach in version 1.3) > >> > >> I agree with you, that we definitely need to clearly document any > risks > >> and how much security can reasonably be expected in any given > scenario. We > >> might even consider logging a warning message when sending data > that is > >> compressed and encrypted. > >> > >> On a different note, I've started amending the KIP to make key > management > >> and distribution pluggable, should hopefully be able to publish > sometime > >> Monday. > >> > >> Best regards, > >> Sönke > >> > >> > >> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi < > viktorsomo...@gmail.com > >> > wrote: > >> > >>> Hi Sönke, > >>> > >>> Compressing before encrypting has its dangers as well. Suppose you > have a > >>> known compression format which adds a magic header and you're > using a > >>> block > >>> cipher with a small enough block, then it becomes much easier to > figure > >>> out > >>> the encryption key. 
For instance you can look at Snappy's stream > >>> identifier: > >>> > https://github.com/google/snappy/blob/master/framing_format.txt > >>> . Based on this you should only use block ciphers where block > sizes are > >>> much larger than 6 bytes. AES for instance should be good with its > 128 > >>> bits > >>> = 16 bytes but even this isn't entirely secure as the first 6 bytes > >>> already > >>> leak some information - how much depends on the > cipher. > >>> Also if we suppose that an adversary accesses a broker and takes > all the > >>> data, they'll have a much easier job decrypting it as they'll have > many > >>> more > >>> examples. > >>> So overall we should make sure to define and document the > compatible > >>> encryptions with the supported compression methods and the level of > >>> security they provide to make sure the users are fully aware of the > >>> security implications. > >>> > >>> Cheers, > >>> Viktor > >>> > >>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau > >>> <soenke.lie...@opencore.com.invalid> wrote: > >>> > >>> > Hi Stephane, > >>> > > >>> > thanks for pointing out the broken pictures, I fixed those. > >>> > > >>> > Regarding encrypting before or after batching the messages, you > are > >>> > correct, I had not thought of compression and how this changes > things. > >>> > Encrypted data does not really compress well. My reasoning at the > time > >>> > of writing was that if we encrypt the entire batch we'd have to > wait > >>> > for the batch to be full before starting to encrypt. Whereas > with per- > >>> > message encryption we can encrypt them as they come in and more > or > >>> > less have them ready for sending when the batch is complete. 
> >>> > However I think the difference will probably not be that large > (will > >>> > do some testing) and offset by just encrypting once instead of > many > >>> > times, which has a certain overhead every time. Also, from a > security > >>> > perspective encrypting longer chunks of data is preferable - > another > >>> > benefit. > >>> > > >>> > This does however take away the ability of the broker to see the > >>> > individual records inside the encrypted batch, so this would > need to > >>> > be stored and retrieved as a single record - just like is done > for > >>> > compressed batches. I am not 100% sure that this won't create > issues, > >>> > especially when considering transactions, I will need to look at > the > >>> > compression code some more. In essence though, since it works for > >>> > compression I see no reason why it can't be made to work here. > >>> > > >>> > On a different note, going down this route might make us > reconsider > >>> > storing the key with the data, as this might significantly reduce > >>> > storage overhead - still much higher than just storing them once > >>> > though. > >>> > > >>> > Best regards, > >>> > Sönke > >>> > > >>> > On Tue, Jun 19, 2018 at 5:59 AM, Stephane Maarek > >>> > <steph...@simplemachines.com.au> wrote: > >>> > > Hi Sonke > >>> > > > >>> > > Very much needed feature and discussion. FYI the image links > seem > >>> broken. > >>> > > > >>> > > My 2 cents (if I understood correctly): you say "This process > will be > >>> > > implemented after Serializer and Interceptors are done with the > >>> message > >>> > > right before it is added to the batch to be sent, in order to > ensure > >>> that > >>> > > existing serializers and interceptors keep working with > encryption > >>> just > >>> > > like without it." > >>> > > > >>> > > I think encryption should happen AFTER a batch is created, > right > >>> before > >>> > it > >>> > > is sent. 
The reason is that if we still want to keep the advantage of > >>> > compression, > >>> > > encryption needs to happen after it (and I believe compression > >>> happens > >>> > on a > >>> > > batch level). > >>> > > So to me for a producer: serializer / interceptors => batching > => > >>> > > compression => encryption => send. > >>> > > and the inverse for a consumer. > >>> > > > >>> > > Regards > >>> > > Stephane > >>> > > > >>> > > On 19 June 2018 at 06:46, Sönke Liebau < > soenke.lie...@opencore.com > >>> > .invalid> > >>> > > wrote: > >>> > > > >>> > >> Hi everybody, > >>> > >> > >>> > >> I've created a draft version of KIP-317 which describes the > addition > >>> > >> of transparent data encryption functionality to Kafka. > >>> > >> > >>> > >> Please consider this as a basis for discussion - I am aware > that > >>> this > >>> > >> is not at a level of detail sufficient for implementation, > but I > >>> > >> wanted to get some feedback from the community on the general > idea > >>> > >> before spending more time on this. > >>> > >> > >>> > >> Link to the KIP is: > >>> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+transparent+data+encryption+functionality > >>> > >> > >>> > >> Best regards, > >>> > >> Sönke > >>> > >> > >>> > > >>> > > >>> > > >>> > -- > >>> > Sönke Liebau > >>> > Partner > >>> > Tel. +49 179 7940878 > >>> > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - > Germany > >>> > > >>> > >> > >> > >> > >> -- > >> Sönke Liebau > >> Partner > >> Tel. +49 179 7940878 > >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - > Germany > >> > > > > > > -- > > Sönke Liebau > > Partner > > Tel. +49 179 7940878 > > OpenCore GmbH & Co. 
KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany > > > > > -- > Sönke Liebau > Partner > Tel. +49 179 7940878 > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany > > > -- Sönke Liebau Partner Tel. +49 179 7940878 OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
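
As a rough illustration of the ordering Stephane proposed earlier in the thread (batching => compression => encryption), here is a minimal Java sketch that compares compress-then-encrypt against encrypt-then-compress on a repetitive sample batch. It is not code from the KIP: the class name CompressEncryptOrderDemo, its helper methods, the sample records, DEFLATE as a stand-in for Kafka's batch codecs, and AES-GCM as the cipher are all assumptions made for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.zip.Deflater;

// Hypothetical demo class; nothing here is part of Kafka or KIP-317.
public class CompressEncryptOrderDemo {

    // Compress with DEFLATE (assumed stand-in for a batch-level codec).
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Encrypt with AES-GCM and prepend the random 12-byte IV to the output.
    public static byte[] encrypt(byte[] input, SecretKey key) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ciphertext = cipher.doFinal(input);
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-GCM encryption failed", e);
        }
    }

    // Generate a throwaway 128-bit AES key for the demo.
    public static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES key generation failed", e);
        }
    }

    // Build a repetitive "batch" of made-up records that compresses well.
    public static byte[] sampleBatch() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("{\"user\":\"alice\",\"action\":\"click\",\"seq\":")
              .append(i).append("}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] batch = sampleBatch();
        byte[] compressThenEncrypt = encrypt(compress(batch), key);
        byte[] encryptThenCompress = compress(encrypt(batch, key));
        System.out.println("plain batch:           " + batch.length + " bytes");
        System.out.println("compress then encrypt: " + compressThenEncrypt.length + " bytes");
        System.out.println("encrypt then compress: " + encryptThenCompress.length + " bytes");
    }
}
```

Running main prints the three sizes. On repetitive input the compress-then-encrypt result should be far smaller, because ciphertext looks random and leaves the compressor nothing to work with. This only shows the size effect; it says nothing about the CRIME/BREACH-style side channels discussed above, which apply precisely to the compress-then-encrypt order.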