Hi,
I would like to propose a change to AVRO that lets services run custom logic
after serialization and before deserialization.
We use AVRO for data serialization in our streaming infrastructure, and we
extended it so that data can be encrypted using information carried by the
data itself: its owner.
The change-set is pretty small, and I would like to hear from you whether it
makes sense to contribute it back to the project.
== The problem is:
Multi-tenant applications need to encrypt data (with the keys of the
owner/tenant that generated it) every time it is serialized, to avoid
commingling data from different tenants. To do this transparently to the
application, the ideal place to implement the encryption is in the
serialization library (AVRO).
== Proposal:
We modified the AVRO code to add afterSerialization and beforeDeserialization
hooks that can use object-defined values (the tenant/owner of the data) to
implement encryption.
In the code we propose to submit we implemented a new interface:
`SerializeFinalizationDelegate.java`
```
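import java.io.ByteArrayOutputStream;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.Encoder;

public interface SerializeFinalizationDelegate {
  // Hook invoked after the object has been serialized.
  void afterSerialization(ByteArrayOutputStream serializedData, Encoder finalEncoder);
  // Hook invoked before the data is deserialized.
  Decoder beforeDeserialization(Decoder dataToDecode);
}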
```
This interface must be implemented by any AVRO-serializable class that wants to
define post-serialization or pre-deserialization logic.
`GenericDatumWriter` and `GenericDatumReader` are modified to delegate to the
object's implementation of the methods above.
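
To make the idea concrete without pulling in the patch, here is a minimal,
dependency-free sketch of the same pattern. It is not the code we propose to
submit: `StreamFinalizationDelegate`, `TenantRecord` and `DelegatingCodec` are
hypothetical names, plain JDK streams stand in for AVRO's `Encoder` and
`Decoder`, and AES-GCM is just one possible cipher choice. How the reader
learns which tenant's key to use is also an assumption here (the record is
simply constructed with the key).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical stand-in for the proposed interface: JDK streams replace
// AVRO's Encoder and Decoder so the sketch has no external dependencies.
interface StreamFinalizationDelegate {
    void afterSerialization(ByteArrayOutputStream serializedData, OutputStream finalOut) throws Exception;
    InputStream beforeDeserialization(InputStream dataToDecode) throws Exception;
}

// Hypothetical record type that encrypts its serialized form with its tenant's key.
class TenantRecord implements StreamFinalizationDelegate {
    private static final int IV_LENGTH = 12;  // bytes, the usual GCM nonce size
    private static final int TAG_BITS = 128;  // GCM authentication tag length

    private final SecretKey tenantKey;

    TenantRecord(SecretKey tenantKey) {
        this.tenantKey = tenantKey;
    }

    @Override
    public void afterSerialization(ByteArrayOutputStream serializedData, OutputStream finalOut) throws Exception {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, tenantKey, new GCMParameterSpec(TAG_BITS, iv));
        finalOut.write(iv);  // the fresh IV travels in front of the ciphertext
        finalOut.write(cipher.doFinal(serializedData.toByteArray()));
    }

    @Override
    public InputStream beforeDeserialization(InputStream dataToDecode) throws Exception {
        byte[] wire = dataToDecode.readAllBytes();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        // The first IV_LENGTH bytes of the wire data are the IV written above.
        cipher.init(Cipher.DECRYPT_MODE, tenantKey, new GCMParameterSpec(TAG_BITS, wire, 0, IV_LENGTH));
        byte[] plain = cipher.doFinal(wire, IV_LENGTH, wire.length - IV_LENGTH);
        return new ByteArrayInputStream(plain);
    }
}

// Hypothetical helpers showing the delegation step a writer and reader would perform.
final class DelegatingCodec {
    private DelegatingCodec() {}

    // Runs the post-serialization hook if the datum defines one.
    static byte[] write(Object datum, byte[] serialized) throws Exception {
        if (!(datum instanceof StreamFinalizationDelegate)) {
            return serialized;  // no hook defined: bytes pass through unchanged
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(serialized);
        ByteArrayOutputStream finalOut = new ByteArrayOutputStream();
        ((StreamFinalizationDelegate) datum).afterSerialization(buffer, finalOut);
        return finalOut.toByteArray();
    }

    // Runs the pre-deserialization hook if the datum defines one.
    static InputStream read(Object datum, byte[] wire) throws Exception {
        InputStream in = new ByteArrayInputStream(wire);
        if (datum instanceof StreamFinalizationDelegate) {
            return ((StreamFinalizationDelegate) datum).beforeDeserialization(in);
        }
        return in;
    }
}
```

With this shape, classes that do not implement the interface are serialized
exactly as before, so the change would be opt-in per class.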
More info can be found at
https://www.slideshare.net/FlinkForward/multi-tenanted-streams-workday-enrico-agnoli-leire-fernandez-de-retana-roitegui-workday-185815223
starting at slide 21.
What do you think about this proposal? I wanted to start a discussion first,
but if it helps I can create a patch or a branch to show the change.
Hope to hear from you,
-Enrico