Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

2021-04-12 Thread bran...@bbrownsound.com
It’s been a while, so I thought I’d give this one another friendly bump. 

> On Sep 21, 2020, at 9:38 AM, Brandon Brown  wrote:
> 
> Hi Tom,
> 
> The reason I went with a fixed list was to simplify the configuration: for
> example, you can say sha256 instead of having to remember that it’s SHA-256.
> Admittedly, if other formats are implemented later, this list would need
> updating as well.
> 
> I’m flexible on changing it to a string and letting it be configured with the 
> exact name. What do you think Mickael?
> 
> Brandon Brown
> 
>> On Sep 21, 2020, at 3:42 AM, Tom Bentley  wrote:
>> 
>> Hi Brandon and Mickael,
>> 
>> Is it necessary to fix the supported digest? We could just support whatever
>> the JVM's MessageDigest supports?
>> 
>> Kind regards,
>> 
>> Tom
>> 
>>> On Fri, Sep 18, 2020 at 6:00 PM Brandon Brown 
>>> wrote:
>>> 
>>> Thanks Mickael! The proposed hash functions would be MD5, SHA1, and SHA256.
>>> 
>>> I can expand the motivation in the KIP, but here’s where my head is at.
>>> MaskField completely removes the value by setting it to an equivalent
>>> null value. One problem with this is that, for say a password going
>>> through the mask transform, the value becomes “”, which could mean either
>>> that no password was present in the message or that it was removed. The
>>> hash transform would remove this ambiguity, if that makes sense.
>>> 
>>> Do you think there are other hash functions that should be supported as
>>> well?
>>> 
>>> Thanks,
>>> Brandon Brown
>>> 
>>>> On Sep 18, 2020, at 12:00 PM, Mickael Maison wrote:
>>>> 
>>>> Thanks Brandon for the KIP.
>>>> 
>>>> There's already a built-in transformation (MaskField) that can
>>>> obfuscate fields. In the motivation section, it would be nice to
>>>> explain the use cases when MaskField is not suitable and when users
>>>> would need the proposed transformation.
>>>> 
>>>> The KIP exposes a "function" configuration to select the hash function
>>>> to use. Which hash functions do you propose supporting?
>>>> 
>>>>> On Thu, Aug 27, 2020 at 10:43 PM  wrote:
>>>>> 
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
>>>>> 
>>>>> The current pr with the proposed changes
>>>>> https://github.com/apache/kafka/pull/9057 and the original 3rd party
>>>>> contribution which initiated this change
>>>>> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
>>>>> 
>>>>> I'm interested in any suggestions for ways to improve this as I think
>>>>> it would make a nice addition to the existing SMTs provided by Kafka
>>>>> Connect out of the box.
>>>>> 
>>>>> Thanks,
>>>>> Brandon



Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-09-21 Thread Brandon Brown
Hi Tom,

The reason I went with a fixed list was to simplify the configuration: for
example, you can say sha256 instead of having to remember that it’s SHA-256.
Admittedly, if other formats are implemented later, this list would need
updating as well.

I’m flexible on changing it to a string and letting it be configured with the 
exact name. What do you think Mickael?
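
To make the fixed-list option concrete, here is a minimal sketch of how short configuration names such as sha256 could map onto the JVM's standard digest names. The class, enum, and method names are hypothetical and are not taken from the KIP or the PR; only the three functions listed in this thread are included.

```java
import java.util.Locale;

// Hypothetical illustration only; not the code from the KIP or the PR.
final class FixedDigestSketch {

    // Fixed list of supported functions, each tied to its standard JVM digest name.
    enum HashFunction {
        MD5("MD5"),
        SHA1("SHA-1"),
        SHA256("SHA-256");

        private final String jcaName;

        HashFunction(String jcaName) {
            this.jcaName = jcaName;
        }

        // Name to pass to java.security.MessageDigest.getInstance(...).
        String jcaName() {
            return jcaName;
        }

        // Accepts the short, case-insensitive spelling used in configuration.
        // Throws IllegalArgumentException for anything outside the fixed list.
        static HashFunction fromConfig(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        System.out.println(HashFunction.fromConfig("sha256").jcaName());
    }
}
```

The trade-off is the one described above: users get a forgiving spelling, but supporting a new digest means adding an enum constant.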

Brandon Brown

> On Sep 21, 2020, at 3:42 AM, Tom Bentley  wrote:
> 
> Hi Brandon and Mickael,
> 
> Is it necessary to fix the supported digest? We could just support whatever
> the JVM's MessageDigest supports?
> 
> Kind regards,
> 
> Tom
> 
>> On Fri, Sep 18, 2020 at 6:00 PM Brandon Brown 
>> wrote:
>> 
>> Thanks Mickael! The proposed hash functions would be MD5, SHA1, and SHA256.
>> 
>> I can expand the motivation in the KIP, but here’s where my head is at.
>> MaskField completely removes the value by setting it to an equivalent
>> null value. One problem with this is that, for say a password going
>> through the mask transform, the value becomes “”, which could mean either
>> that no password was present in the message or that it was removed. The
>> hash transform would remove this ambiguity, if that makes sense.
>> 
>> Do you think there are other hash functions that should be supported as
>> well?
>> 
>> Thanks,
>> Brandon Brown
>> 
>>> On Sep 18, 2020, at 12:00 PM, Mickael Maison wrote:
>>> 
>>> Thanks Brandon for the KIP.
>>> 
>>> There's already a built-in transformation (MaskField) that can
>>> obfuscate fields. In the motivation section, it would be nice to
>>> explain the use cases when MaskField is not suitable and when users
>>> would need the proposed transformation.
>>> 
>>> The KIP exposes a "function" configuration to select the hash function
>>> to use. Which hash functions do you propose supporting?
>>> 
>>>> On Thu, Aug 27, 2020 at 10:43 PM  wrote:
>>>> 
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
>>>> 
>>>> The current pr with the proposed changes
>>>> https://github.com/apache/kafka/pull/9057 and the original 3rd party
>>>> contribution which initiated this change
>>>> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
>>>> 
>>>> I'm interested in any suggestions for ways to improve this as I think
>>>> it would make a nice addition to the existing SMTs provided by Kafka
>>>> Connect out of the box.
>>>> 
>>>> Thanks,
>>>> Brandon


Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-09-21 Thread Tom Bentley
Hi Brandon and Mickael,

Is it necessary to fix the supported digest? We could just support whatever
the JVM's MessageDigest supports?
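
As a rough sketch of what "whatever the JVM's MessageDigest supports" could look like, the snippet below passes the configured name straight to MessageDigest.getInstance and lists the digests the running JVM advertises. The class and method names are hypothetical, and the real transform may handle encoding and errors differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.util.Set;

// Hypothetical illustration only; not the code from the KIP or the PR.
final class AnyDigestSketch {

    // Hashes a UTF-8 string with any digest name the JVM recognises
    // and returns the result as lowercase hex.
    static String hashHex(String algorithm, String value) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Surface an unknown name as a configuration-style error.
            throw new IllegalArgumentException("Unsupported digest: " + algorithm, e);
        }
    }

    // Digest names offered by the security providers installed in this JVM.
    static Set<String> supportedDigests() {
        return Security.getAlgorithms("MessageDigest");
    }

    public static void main(String[] args) {
        System.out.println(supportedDigests());
        System.out.println(hashHex("SHA-256", "abc"));
    }
}
```

With this approach the set of usable functions depends on the JVM and its providers rather than on a list maintained in the transform.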

Kind regards,

Tom

On Fri, Sep 18, 2020 at 6:00 PM Brandon Brown 
wrote:

> Thanks Mickael! The proposed hash functions would be MD5, SHA1, and SHA256.
>
> I can expand the motivation in the KIP, but here’s where my head is at.
> MaskField completely removes the value by setting it to an equivalent
> null value. One problem with this is that, for say a password going
> through the mask transform, the value becomes “”, which could mean either
> that no password was present in the message or that it was removed. The
> hash transform would remove this ambiguity, if that makes sense.
>
> Do you think there are other hash functions that should be supported as
> well?
>
> Thanks,
> Brandon Brown
>
> > On Sep 18, 2020, at 12:00 PM, Mickael Maison 
> wrote:
> >
> > Thanks Brandon for the KIP.
> >
> > There's already a built-in transformation (MaskField) that can
> > obfuscate fields. In the motivation section, it would be nice to
> > explain the use cases when MaskField is not suitable and when users
> > would need the proposed transformation.
> >
> > The KIP exposes a "function" configuration to select the hash function
> > to use. Which hash functions do you propose supporting?
> >
> >> On Thu, Aug 27, 2020 at 10:43 PM  wrote:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
> >>
> >> The current pr with the proposed changes
> >> https://github.com/apache/kafka/pull/9057 and the original 3rd party
> >> contribution which initiated this change
> >> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
> >>
> >> I'm interested in any suggestions for ways to improve this as I think
> >> it would make a nice addition to the existing SMTs provided by Kafka
> >> Connect out of the box.
> >>
> >> Thanks,
> >> Brandon
> >>
> >>
> >>
>


Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-09-18 Thread Brandon Brown
Thanks Mickael! The proposed hash functions would be MD5, SHA1, and SHA256.

I can expand the motivation in the KIP, but here’s where my head is at.
MaskField completely removes the value by setting it to an equivalent null
value. One problem with this is that, for say a password going through the
mask transform, the value becomes “”, which could mean either that no password
was present in the message or that it was removed. The hash transform would
remove this ambiguity, if that makes sense.
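
A small sketch of that ambiguity, under the assumption that masking a string yields “” as described above: an empty password and a real one mask to the same output, while their hashes differ. The mask method is a stand-in written for this example, not Kafka's MaskField class, and SHA-256 is just one of the proposed functions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; mask() merely imitates the behaviour described above.
final class MaskVersusHashSketch {

    // Stand-in for masking a string field: every input becomes the empty string.
    static String mask(String value) {
        return "";
    }

    // SHA-256 of the UTF-8 bytes, rendered as lowercase hex.
    static String hash(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String missing = "";        // no password was present in the message
        String present = "s3cret";  // a real password was present

        // After masking, the two cases can no longer be told apart.
        System.out.println(mask(missing).equals(mask(present)));

        // After hashing, they remain distinguishable.
        System.out.println(hash(missing).equals(hash(present)));
    }
}
```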

Do you think there are other hash functions that should be supported as well?

Thanks,
Brandon Brown

> On Sep 18, 2020, at 12:00 PM, Mickael Maison  wrote:
> 
> Thanks Brandon for the KIP.
> 
> There's already a built-in transformation (MaskField) that can
> obfuscate fields. In the motivation section, it would be nice to
> explain the use cases when MaskField is not suitable and when users
> would need the proposed transformation.
> 
> The KIP exposes a "function" configuration to select the hash function
> to use. Which hash functions do you propose supporting?
> 
>> On Thu, Aug 27, 2020 at 10:43 PM  wrote:
>> 
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
>> 
>> The current pr with the proposed changes
>> https://github.com/apache/kafka/pull/9057 and the original 3rd party
>> contribution which initiated this change
>> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
>> 
>> I'm interested in any suggestions for ways to improve this as I think
>> it would make a nice addition to the existing SMTs provided by Kafka
>> Connect out of the box.
>> 
>> Thanks,
>> Brandon
>> 
>> 
>> 


Re: [DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-09-18 Thread Mickael Maison
Thanks Brandon for the KIP.

There's already a built-in transformation (MaskField) that can
obfuscate fields. In the motivation section, it would be nice to
explain the use cases when MaskField is not suitable and when users
would need the proposed transformation.

The KIP exposes a "function" configuration to select the hash function
to use. Which hash functions do you propose supporting?

On Thu, Aug 27, 2020 at 10:43 PM  wrote:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT
>
> The current pr with the proposed changes
> https://github.com/apache/kafka/pull/9057 and the original 3rd party
> contribution which initiated this change
> https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.
>
> I'm interested in any suggestions for ways to improve this as I think
> it would make a nice addition to the existing SMTs provided by Kafka
> Connect out of the box.
>
> Thanks,
> Brandon
>
>
>


[DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-09-15 Thread Brandon Brown
Hey everybody, I wondered if I could get some feedback on this. In a recent
discussion it was brought up that there might be value in supporting the
ability to hash multiple fields on a value message. I was wondering what y’all
think, and also whether there’s any feedback on the current proposed KIP and
the sample PR I’ve submitted.

Thanks!
-Brandon Brown

[DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-08-31 Thread Brandon Brown
Hey everybody, I’ve created the following KIP and would love some feedback.
One place where this could be of use is hashing the key used as an identifier
for inserting into Elasticsearch (which has a size limit); another is
obfuscating sensitive values such as passwords or SSNs.
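
To illustrate the identifier use case, the sketch below reduces a key of any length to a fixed-length hex string. SHA-256 is an assumption chosen for the example, and the class and method names are hypothetical rather than taken from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not the code from the KIP or the PR.
final class KeyDigestSketch {

    // Maps a key of any length to a 64-character hex identifier (SHA-256 is 32 bytes).
    static String fixedLengthId(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Build a deliberately long key to stand in for an oversized identifier.
        StringBuilder longKey = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            longKey.append('x');
        }
        String id = fixedLengthId(longKey.toString());
        System.out.println(id.length() + " characters: " + id);
    }
}
```

The same input always yields the same identifier, so repeated writes for one key still target one document.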

https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT

The current pr with the proposed changes 
https://github.com/apache/kafka/pull/9057 and the original 3rd party 
contribution which initiated this change 
https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.

I'm interested in any suggestions for ways to improve this as I think it would 
make a nice addition to the existing SMTs provided by Kafka Connect out of the 
box.

Thanks,
Brandon

[DISCUSS] KIP-665 Kafka Connect Hash SMT

2020-08-27 Thread brandon



https://cwiki.apache.org/confluence/display/KAFKA/KIP-665%3A+Kafka+Connect+Hash+SMT

The current pr with the proposed changes  
https://github.com/apache/kafka/pull/9057 and the original 3rd party  
contribution which initiated this change  
https://github.com/aiven/aiven-kafka-connect-transforms/issues/9#issuecomment-662378057.


I'm interested in any suggestions for ways to improve this as I think  
it would make a nice addition to the existing SMTs provided by Kafka  
Connect out of the box.


Thanks,
Brandon