hoshinojyunn commented on code in PR #64794:
URL: https://github.com/apache/doris/pull/64794#discussion_r3491321928


##########
fe/fe-core/src/main/java/org/apache/doris/analysis/InvertedIndexUtil.java:
##########
@@ -134,15 +134,43 @@ public static void checkInvertedIndexParser(String indexColName, PrimitiveType c
         }
     }
 
-    private static boolean isSingleByte(String str) {
+    private static boolean isAscii(String str) {
         for (int i = 0; i < str.length(); i++) {
-            if (str.charAt(i) > 0xFF) {
+            if (str.charAt(i) > 0x7F) {
                 return false;
             }
         }
         return true;
     }
 
+    public static void checkCharFilterProperties(Map<String, String> properties) throws AnalysisException {

Review Comment:
   Fixed.
   `CharReplaceCharFilterValidator` now maps the policy's `pattern`/`replacement`
to the shared legacy char-filter property keys and reuses
`InvertedIndexUtil.checkCharFilterProperties()`. As a result, the
custom-analyzer/index-policy path enforces the same ASCII-only replacement
rule as table properties and `TOKENIZE()`.
   Added FE coverage in `PolicyValidatorTests` for a non-ASCII replacement, plus
a custom-analyzer regression case in `test_custom_analyzer1.groovy`.
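
   For reference, a minimal standalone sketch of how the two bounds in the hunk
above differ. The class name `AsciiCheckSketch` and the method name
`isSingleByteOld` are hypothetical and exist only for this illustration;
`isAscii` copies the helper from the diff. The UTF-8 byte count is printed only
to show that a character in the U+0080..U+00FF range passes the old `0xFF`
bound without being ASCII. This is my reading of the difference, not a
statement of how the rest of the PR consumes the value.

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheckSketch {
    // Same logic as the isAscii helper in the diff: reject any char above 0x7F.
    static boolean isAscii(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // The previous bound from the diff (isSingleByte), renamed here for comparison only.
    static boolean isSingleByteOld(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // U+00E9, LATIN SMALL LETTER E WITH ACUTE
        System.out.println(isSingleByteOld(eAcute)); // accepted by the old 0xFF bound
        System.out.println(isAscii(eAcute));         // rejected by the new 0x7F bound
        // Number of bytes the same character occupies when encoded as UTF-8.
        System.out.println(eAcute.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

   Plain ASCII input such as a single space is accepted by both checks, so only
values containing characters above U+007F are affected by the tighter bound.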



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

