Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17580 )
Change subject: IMPALA-2019(Part-2): Provide UTF-8 support in instr() and locate()
......................................................................

Patch Set 10: Code-Review+2

Sure, resolving illegal UTF8 can be postponed to IMPALA-10761, where I hope that we can resolve it better.

My concern is the complexity added to deal with such characters for the entire UTF8 feature. On paper, such complexity can be reduced/managed as follows.

1. A common place to check the validity of UTF8 characters and raise an error if necessary;
2. New UTF8 functions that only deal with UTF8 strings.

It is possible that we can do this in FE, where a non-trusted source S passed as input to a UTF8 function F() is translated to F(CHECK(S)), where CHECK() implements step 1).

-- 
To view, visit http://gerrit.cloudera.org:8080/17580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic13c3d04649c1aea56c1aaa464799b5e4674f662
Gerrit-Change-Number: 17580
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Sun, 18 Jul 2021 12:27:38 +0000
Gerrit-HasComments: No
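
For illustration only, the two-step idea in the comment above (one common validity check, plus functions that assume valid UTF8) might be sketched roughly as follows. This is not Impala code: the names check_utf8, utf8_instr and instr_untrusted are hypothetical, Python stands in for the real FE/BE languages, and the 1-based "0 when not found" result is a simplification of instr().

```python
def check_utf8(raw: bytes) -> str:
    """Step 1 (hypothetical CHECK()): the single place that validates UTF-8.

    Raises ValueError on an illegal byte sequence instead of letting every
    string function handle malformed input on its own.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"illegal UTF-8 sequence at byte offset {exc.start}"
        ) from exc


def utf8_instr(haystack: str, needle: str) -> int:
    """Step 2 (hypothetical F()): only ever sees already-validated text.

    Returns the 1-based character position of needle, or 0 if it is absent.
    """
    return haystack.find(needle) + 1


def instr_untrusted(raw_haystack: bytes, raw_needle: bytes) -> int:
    """Mimics the suggested rewrite of F(S) into F(CHECK(S)).

    In the proposal this rewrite would be done by the planner for
    non-trusted sources; here it is just an explicit wrapper.
    """
    return utf8_instr(check_utf8(raw_haystack), check_utf8(raw_needle))
```

With this split, utf8_instr() needs no error handling of its own, and a trusted source could call it directly without paying for the check.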
