huangzhir commented on PR #4643: URL: https://github.com/apache/kyuubi/pull/4643#issuecomment-1492857548
> I think we should follow the same masking rule implementation as in Ranger's DataMasker, and just mask all the chars into `"x"` without classifying them as upper/lower/digits or any other language type. @yaooqinn @huangzhir
>
> (https://github.com/apache/ranger/blob/000db1cbeb0f8fed1e02e88edbdc80386023b515/plugin-nestedstructure/src/main/java/org/apache/ranger/authorization/nestedstructure/authorizer/DataMasker.java#L250)
>
> ```
> private static String showFirstFour(String value){
>     int length = StringUtils.length(value);
>
>     return length <= 4 ? value : value.substring(0, 4) + StringUtils.repeat("x", length - 4);
> }
> ```

If we don't refer to the implementations of Hive and Trino and there are no legacy issues, the natural way to implement data masking is to directly replace characters with "x". This PR is just a patch that fixes Chinese data masking and maintains some compatibility with Hive.
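
To make the two options concrete, here is a minimal sketch of both masking styles discussed in this comment. It is not the Kyuubi, Hive, or Ranger implementation: the class `MaskSketch`, the helpers `maskAllWithX` and `maskByCharClass`, and the `otherLetter` parameter are hypothetical names introduced only for illustration, and masking non-cased letters (such as Chinese characters) with a caller-supplied character is an assumption.

```java
class MaskSketch {

    // Hypothetical helper: replace every code point with 'x', with no
    // classification, in the spirit of the Ranger snippet quoted above.
    static String maskAllWithX(String value) {
        if (value == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        value.codePoints().forEach(cp -> sb.append('x'));
        return sb.toString();
    }

    // Hypothetical helper: classify each code point before masking.
    // Upper -> 'X', lower -> 'x', digit -> 'n', any other letter -> otherLetter
    // (an assumption for this sketch), everything else is left unchanged.
    static String maskByCharClass(String value, char otherLetter) {
        if (value == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        value.codePoints().forEach(cp -> {
            if (Character.isUpperCase(cp)) {
                sb.append('X');
            } else if (Character.isLowerCase(cp)) {
                sb.append('x');
            } else if (Character.isDigit(cp)) {
                sb.append('n');
            } else if (Character.isLetter(cp)) {
                sb.append(otherLetter);
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "Ab1-中文";
        System.out.println(maskAllWithX(sample));         // xxxxxx
        System.out.println(maskByCharClass(sample, 'U')); // Xxn-UU
    }
}
```

Iterating over code points rather than `char` values keeps the masked length equal to the number of visible characters even for text outside the Basic Multilingual Plane.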
