Hi Ryanne,

You raise some good points there.

> Similarly, if the whole record is encrypted, it becomes impossible to do
> joins, group bys etc, which just need the record key and maybe don't have
> access to the encryption key. Maybe only record _values_ should be
> encrypted, and maybe Kafka Streams could defer decryption until the actual
> value is inspected. That way joins etc are possible without the encryption
> key, and RocksDB would not need to decrypt values before materializing to
> disk.
>

It's getting a bit late here, so maybe I've overlooked something, but
wouldn't the natural thing be to make the "encrypted" key a hash of the
original key, and make the value the ciphertext of the (original key,
original value) pair? A scheme like this would preserve equality of the key
(strictly speaking there's a chance of collision, of course), so joins,
group-bys and partitioning would still work without the encryption key. I
guess this could also be a solution for the compacted topic issue Sönke
mentioned.
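
Roughly something like the sketch below, purely for illustration (the class
and method names are made up, and SHA-256 / AES-GCM are just example
choices, not anything the KIP prescribes):

    import java.nio.ByteBuffer;
    import java.security.MessageDigest;
    import javax.crypto.Cipher;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    class EnvelopeSketch {

        // "Encrypted" record key: a stable hash of the original key, so
        // brokers, compaction and stream joins still see equal keys as
        // equal bytes.
        static byte[] recordKey(byte[] originalKey) throws Exception {
            return MessageDigest.getInstance("SHA-256").digest(originalKey);
        }

        // Record value: ciphertext of the (original key, original value)
        // pair, naively length-prefixed so a consumer can split the two
        // apart again after decryption.
        static byte[] recordValue(byte[] originalKey, byte[] originalValue,
                                  SecretKey dataKey, byte[] iv) throws Exception {
            byte[] plaintext = ByteBuffer
                    .allocate(4 + originalKey.length + originalValue.length)
                    .putInt(originalKey.length)
                    .put(originalKey)
                    .put(originalValue)
                    .array();
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            return cipher.doFinal(plaintext);
        }
    }

A consumer that holds the encryption key decrypts the value and recovers
both the original key and value; anything without the key can still
partition, join and group on the hashed key.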

Cheers,

Tom



On Thu, May 7, 2020 at 5:17 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:

> Thanks Sönke, this is an area in which Kafka is really, really far behind.
>
> I've built secure systems around Kafka as laid out in the KIP. One issue
> that is not addressed in the KIP is re-encryption of records after a key
> rotation. When a key is compromised, it's important that any data encrypted
> using that key is immediately destroyed or re-encrypted with a new key.
> Ideally first-class support for end-to-end encryption in Kafka would make
> this possible natively, or else I'm not sure what the point would be. It
> seems to me that the brokers would need to be involved in this process, so
> perhaps a client-first approach will be painting ourselves into a corner.
> Not sure.
>
> Another issue is whether materialized tables, e.g. in Kafka Streams, would
> see unencrypted or encrypted records. If we implemented the KIP as written,
> it would still result in a bunch of plain text data in RocksDB everywhere.
> Again, I'm not sure what the point would be. Perhaps using custom serdes
> would actually be a more holistic approach, since Kafka Streams etc could
> leverage these as well.
>
> Similarly, if the whole record is encrypted, it becomes impossible to do
> joins, group bys etc, which just need the record key and maybe don't have
> access to the encryption key. Maybe only record _values_ should be
> encrypted, and maybe Kafka Streams could defer decryption until the actual
> value is inspected. That way joins etc are possible without the encryption
> key, and RocksDB would not need to decrypt values before materializing to
> disk.
>
> This is why I've implemented encryption on a per-field basis, not at the
> record level, when addressing kafka security in the past. And I've had to
> build external pipelines that purge, re-encrypt, and re-ingest records when
> keys are compromised.
>
> This KIP might be a step in the right direction, not sure. But I'm hesitant
> to support the idea of end-to-end encryption without a plan to address the
> myriad other problems.
>
> That said, we need this badly and I hope something shakes out.
>
> Ryanne
>
> On Tue, Apr 28, 2020, 6:26 PM Sönke Liebau
> <soenke.lie...@opencore.com.invalid> wrote:
>
> > All,
> >
> > I've asked for comments on this KIP in the past, but since I didn't
> > really get any feedback I've decided to reduce the initial scope of the
> > KIP a bit and try again.
> >
> > I have reworked the KIP to provide a limited, but useful set of features
> > for this initial KIP and laid out a very rough roadmap of what I'd
> > envision this looking like in a final version.
> >
> > I am aware that the KIP is currently light on implementation details, but
> > would like to get some feedback on the general approach before fully
> > speccing everything.
> >
> > The KIP can be found at
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
> >
> >
> > I would very much appreciate any feedback!
> >
> > Best regards,
> > Sönke
> >
>
