*FieldEncrypt Transform Design Document*
This design document expands on the proposal described in the following
issue: https://github.com/apache/seatunnel/issues/10246
------------------------------
*Configuration*
transform {
  FieldEncrypt {
    fields = ["phone", "email"]
    algorithm = "AES_CBC"
    key = "base64:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" # Base64-encoded key
    mode = "ENCRYPT" # ENCRYPT or DECRYPT
  }
}
*Parameters*
*Parameter*  *Type*        *Required*  *Default*  *Description*
fields       List<String>  Yes         -          List of fields to encrypt/decrypt
algorithm    Enum          No          AES_CBC    Encryption algorithm
key          String        Yes         -          Base64-encoded encryption key
mode         Enum          No          ENCRYPT    Operation mode: ENCRYPT or DECRYPT
------------------------------
*Key Design Decisions*
*1. Encryption Algorithms*
- AES/CBC/PKCS5Padding (random IV)
- For AES/CBC, a random IV is generated per record and appended to
the ciphertext.
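A minimal Java sketch of the scheme above, assuming the transform is built on the standard javax.crypto API. The class name AesCbcSketch, the UTF-8 encoding of the plaintext, and the Base64 encoding of the output (so the result still fits in a string field) are assumptions made for illustration and are not part of the proposal.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration only; not an existing SeaTunnel class.
final class AesCbcSketch {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";
    private static final int IV_LENGTH = 16; // AES block size in bytes
    private static final SecureRandom RANDOM = new SecureRandom();

    // Encrypts one value and returns Base64(ciphertext || IV).
    // Assumption: the output is Base64 text so it can be stored in a string field.
    static String encrypt(String plaintext, byte[] key) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            RANDOM.nextBytes(iv); // fresh random IV for every call
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            // Append the IV after the ciphertext, as described in the design.
            byte[] out = Arrays.copyOf(ciphertext, ciphertext.length + IV_LENGTH);
            System.arraycopy(iv, 0, out, ciphertext.length, IV_LENGTH);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    // Reverses encrypt(): splits off the trailing IV, then decrypts the rest.
    static String decrypt(String encoded, byte[] key) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            if (in.length <= IV_LENGTH) {
                throw new IllegalArgumentException("Input too short to contain ciphertext and IV");
            }
            int ciphertextLength = in.length - IV_LENGTH;
            IvParameterSpec iv = new IvParameterSpec(in, ciphertextLength, IV_LENGTH);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), iv);
            byte[] plaintext = cipher.doFinal(in, 0, ciphertextLength);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}
```

Because the IV is random, encrypting the same value twice yields different ciphertexts, so encrypted columns cannot be compared for equality or used as join keys.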
------------------------------
*2. Supported Data Types*
*Decision*: *String fields only*
- Can be extended to other types in future versions if needed
------------------------------
*3. Encryption/Decryption Mode*
*Decision*: a single transform whose mode parameter selects ENCRYPT or DECRYPT
Decryption or encryption errors will cause the job to fail fast.
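The sketch below illustrates how decisions 2 and 3 could fit together, assuming a row is modelled as a plain Map. It does not use the SeaTunnel row API. The class FieldModeSketch, the apply method, and the two cipher functions passed in are hypothetical names; letting null or missing values pass through unchanged is also an assumption, since the proposal does not say how they are handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration; a row is modelled as a Map instead of a SeaTunnel row.
final class FieldModeSketch {
    enum Mode { ENCRYPT, DECRYPT }

    static Map<String, Object> apply(
            Map<String, Object> row,
            List<String> fields,
            Mode mode,
            UnaryOperator<String> encryptFn,
            UnaryOperator<String> decryptFn) {
        // The mode parameter selects which operation the single transform performs.
        UnaryOperator<String> operation = (mode == Mode.ENCRYPT) ? encryptFn : decryptFn;
        Map<String, Object> out = new LinkedHashMap<>(row);
        for (String field : fields) {
            Object value = out.get(field);
            if (value == null) {
                continue; // assumption: null or missing values pass through unchanged
            }
            if (!(value instanceof String)) {
                // Only string fields are supported in the initial design.
                throw new IllegalArgumentException(
                        "Field '" + field + "' is not a string: " + value.getClass().getSimpleName());
            }
            try {
                out.put(field, operation.apply((String) value));
            } catch (RuntimeException e) {
                // Fail fast: rethrow so the job stops instead of emitting a bad record.
                throw new IllegalStateException("Failed to " + mode + " field '" + field + "'", e);
            }
        }
        return out;
    }
}
```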
------------------------------
*4. Key*
The decoded key must be a valid AES key length: 16, 24, or 32 bytes
(AES-128, AES-192, or AES-256).
An invalid key will cause the job to fail at startup.
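The startup check could look roughly like the following Java sketch. It assumes the "base64:" prefix shown in the configuration examples is mandatory, which the proposal does not state explicitly; KeyParserSketch and parseKey are hypothetical names.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not an existing SeaTunnel class.
final class KeyParserSketch {
    // Assumption: the prefix used in the examples is required on every key.
    private static final String PREFIX = "base64:";

    // Decodes the configured key and rejects anything that is not a valid AES key.
    static byte[] parseKey(String configured) {
        if (configured == null || !configured.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Key must start with '" + PREFIX + "'");
        }
        byte[] key;
        try {
            key = Base64.getDecoder().decode(configured.substring(PREFIX.length()));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Key is not valid Base64", e);
        }
        // AES accepts only 128-, 192-, or 256-bit keys.
        if (key.length != 16 && key.length != 24 && key.length != 32) {
            throw new IllegalArgumentException(
                    "Decoded key must be 16, 24, or 32 bytes but was " + key.length);
        }
        return key;
    }
}
```

Running this check once when the transform is opened, before any record is processed, matches the stated intent that invalid keys fail the job at startup instead of at the first record.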
------------------------------
*Usage Examples*
*Example 1: Encryption*
transform {
  FieldEncrypt {
    fields = ["phone", "email"]
    key = "base64:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
    mode = "ENCRYPT"
  }
}
------------------------------
*Example 2: Multi-Table Transform*
source {
  FakeSource {
    tables_configs = [
      {
        row.num = 100
        schema = {
          table = "test.abc"
          columns = [
            { name = "id", type = "bigint" },
            { name = "name", type = "string" },
            { name = "address", type = "string" }
          ]
        }
      },
      {
        row.num = 100
        schema = {
          table = "test.xyz"
          columns = [
            { name = "id", type = "bigint" },
            { name = "name", type = "string" },
            { name = "age", type = "int" }
          ]
        }
      },
      {
        row.num = 100
        schema = {
          table = "test.www"
          columns = [
            { name = "id", type = "bigint" },
            { name = "name", type = "string" },
            { name = "age", type = "int" }
          ]
        }
      }
    ]
  }
}
transform {
  FieldEncrypt {
    fields = ["age"]
    key = "base64:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
    algorithm = "AES_CBC"
    mode = "ENCRYPT"
    table_transform = [
      {
        table_path = "test.abc"
        fields = ["address"]
      }
    ]
  }
}
sink {
  Console {}
}
------------------------------
*Example 3: Decryption*
transform {
  FieldEncrypt {
    fields = ["phone", "email"]
    key = "base64:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
    mode = "DECRYPT"
  }
}
------------------------------
*Note*: The initial implementation will focus on the core functionality
described above; further features are planned for later versions.
I appreciate any feedback or suggestions for improvements. If you have
ideas for additional features or enhancements, please let us know in the
comments.
Thank you for your input!
Best Regards,
Doyeon
(Github ID: dybyte)