Hello Fang-Yu Rao, Norbert Luksa, Kurt Deschler, Zoltan Borok-Nagy, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14963

to look at the new patch set (#9).

Change subject: IMPALA-9010: Add builtin mask functions
......................................................................

IMPALA-9010: Add builtin mask functions

There're 6 builtin GenericUDFs for column masking in Hive:
  mask_show_first_n(value, charCount, upperChar, lowerChar, digitChar,
      otherChar, numberChar)
  mask_show_last_n(value, charCount, upperChar, lowerChar, digitChar,
      otherChar, numberChar)
  mask_first_n(value, charCount, upperChar, lowerChar, digitChar,
      otherChar, numberChar)
  mask_last_n(value, charCount, upperChar, lowerChar, digitChar,
      otherChar, numberChar)
  mask_hash(value)
  mask(value, upperChar, lowerChar, digitChar, otherChar, numberChar,
      dayValue, monthValue, yearValue)

Description of the parameters:
  value - value to mask. Supported types: TINYINT, SMALLINT, INT, BIGINT,
    STRING, VARCHAR, CHAR, DATE (only for mask()).
  charCount - number of characters. Default value: 4
  upperChar - character to replace upper-case characters with. Specify -1
    to retain original character. Default value: 'X'
  lowerChar - character to replace lower-case characters with. Specify -1
    to retain original character. Default value: 'x'
  digitChar - character to replace digit characters with. Specify -1 to
    retain original character. Default value: 'n'
  otherChar - character to replace all other characters with. Specify -1
    to retain original character. Default value: -1
  numberChar - character to replace digits in a number with. Valid
    values: 0-9. Default value: '1'
  dayValue - value to replace day field in a date with. Specify -1 to
    retain original value. Valid values: 1-31. Default value: 1
  monthValue - value to replace month field in a date with. Specify -1 to
    retain original value. Valid values: 0-11. Default value: 0
  yearValue - value to replace year field in a date with. Specify -1 to
    retain original value.
    Default value: 1

In Hive, these functions accept a variable number of arguments of
unrestricted types:
  mask_show_first_n(val)
  mask_show_first_n(val, 8)
  mask_show_first_n(val, 8, 'X', 'x', 'n')
  mask_show_first_n(val, 8, 'x', 'x', 'x', 'x', 2)
  mask_show_first_n(val, 8, 'x', -1, 'x', 'x', '9')
The upperChar, lowerChar, digitChar, otherChar and numberChar arguments
can be of string or numeric types.

We currently don't have a corresponding framework for GenericUDF
(IMPALA-9271), so we implement these functions as overloads. However,
covering all possible combinations may require hundreds of overloads.
We implement only some important overloads, including
 - those used by Ranger default masking policies,
 - those with simple arguments that may be useful for users,
 - an overload with all arguments in int type for full functionality.
   Char arguments need to be converted to their ASCII values.

Tests:
 - Add BE tests in expr-test

Change-Id: Ica779a1bf63a085d51f3b533f654cbaac102a664
---
M be/src/codegen/impala-ir.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/expr-test.cc
A be/src/exprs/mask-functions-ir.cc
A be/src/exprs/mask-functions.h
M be/src/exprs/scalar-expr-evaluator.cc
M common/function-registry/impala_functions.py
7 files changed, 1,605 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/14963/9
--
To view, visit http://gerrit.cloudera.org:8080/14963
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica779a1bf63a085d51f3b533f654cbaac102a664
Gerrit-Change-Number: 14963
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
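
For reviewers who want a quick reference for the string-masking semantics described in the commit message, here is a minimal Python sketch. It is not taken from the patch, which implements the functions in C++. It assumes string inputs only, ASCII character classification, and that a negative charCount behaves like 0; it omits numberChar and the date arguments. The helper names _replacement and mask_char are hypothetical. Replacement arguments may be a one-character string or an int code point, mirroring the all-int overload mentioned above, and -1 retains the original character.

```python
UNMASKED = -1  # sentinel from the commit message: retain the original character


def _replacement(ch, repl):
    """Pick the output for ch: repl is a 1-char string, an int code point, or -1."""
    if repl == UNMASKED:
        return ch
    return chr(repl) if isinstance(repl, int) else repl


def mask_char(ch, upper="X", lower="x", digit="n", other=UNMASKED):
    """Mask one character by class (ASCII-only classification is an assumption)."""
    if "A" <= ch <= "Z":
        return _replacement(ch, upper)
    if "a" <= ch <= "z":
        return _replacement(ch, lower)
    if "0" <= ch <= "9":
        return _replacement(ch, digit)
    return _replacement(ch, other)


def mask_show_first_n(value, char_count=4, **chars):
    """Keep the first char_count characters and mask the rest."""
    n = max(char_count, 0)  # assumption: negative counts are treated as 0
    return value[:n] + "".join(mask_char(c, **chars) for c in value[n:])


def mask_first_n(value, char_count=4, **chars):
    """Mask the first char_count characters and keep the rest."""
    n = max(char_count, 0)  # assumption: negative counts are treated as 0
    return "".join(mask_char(c, **chars) for c in value[:n]) + value[n:]
```

With the defaults, mask_show_first_n("Abcd-1234") leaves "Abcd" and the hyphen untouched and replaces each trailing digit with 'n'. Passing upper=88 has the same effect as upper='X', since 88 is the ASCII code of 'X'.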