[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1798: Avro 3532 naming in c

GitBox Mon, 01 Aug 2022 10:44:30 -0700


KalleOlaviNiemitalo commented on code in PR #1798:
URL: https://github.com/apache/avro/pull/1798#discussion_r934770067



##########
lang/c/src/schema.c:
##########
@@ -48,26 +51,50 @@ static void avro_schema_init(avro_schema_t schema, avro_type_t type)
 
 static int is_avro_id(const char *name)
 {
-       size_t i, len;
        if (name) {
-               len = strlen(name);
-               if (len < 1) {
-                       return 0;
-               }
-               for (i = 0; i < len; i++) {
-                       if (!(isalpha(name[i])
-                             || name[i] == '_' || (i && isdigit(name[i])))) {
+               size_t len = strlen(name);
+       if (len < 1) {
+               return 0;
+       }
+
+       locale_t loc = newlocale(LC_ALL_MASK, "en_US.UTF-8", (locale_t) 0);
+       locale_t currentLoc = (locale_t) 0;
+       if (loc) {
+            currentLoc = uselocale(loc);
+        }
+        else {
+            setlocale(LC_ALL, "en_US.UTF-8");
+        }
+
+           size_t mbslen = mbstowcs(NULL, name, 0);
+           wchar_t  wsName[mbslen + 1];
+        mbstowcs(wsName, name, mbslen + 1);

Review Comment:
   
<https://unicode-org.github.io/icu/userguide/strings/properties.html#enumerated-property-over-string>
 mentions "UTF-8 macros" that would apparently let you look up properties of a 
UTF-8 encoded character without first recoding to wchar_t. If you could use 
that and rely solely on ICU rather than set up a locale, then the code would be 
more easily portable to Windows.
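
To illustrate the shape of that idea, here is a minimal sketch that walks the UTF-8 bytes of the name directly, with no locale setup and no conversion to wchar_t. Everything in it is an assumption for illustration: `utf8_next`, `is_letter_cp`, `is_digit_cp` and `is_avro_id_utf8` are hypothetical names, not code from this PR. The hand-written decoder stands in for ICU's `U8_NEXT` macro from `<unicode/utf8.h>`, and the two classifier stubs (which only know ASCII and the Latin-1 Supplement letters) stand in for ICU's `u_isalpha` and `u_isdigit` from `<unicode/uchar.h>`, so the sketch compiles without ICU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one code point starting at s[*i] and advance *i.
 * Returns -1 on malformed UTF-8. With ICU, U8_NEXT would do this job. */
static int32_t utf8_next(const unsigned char *s, size_t len, size_t *i)
{
    /* Smallest code point allowed for each continuation-byte count,
     * used to reject overlong encodings. */
    static const int32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    unsigned char b = s[(*i)++];
    int32_t cp;
    int extra, n;

    if (b < 0x80) {
        return b;
    } else if ((b & 0xE0) == 0xC0) {
        cp = b & 0x1F; extra = 1;
    } else if ((b & 0xF0) == 0xE0) {
        cp = b & 0x0F; extra = 2;
    } else if ((b & 0xF8) == 0xF0) {
        cp = b & 0x07; extra = 3;
    } else {
        return -1;                      /* stray continuation or invalid lead byte */
    }
    n = extra;
    while (extra-- > 0) {
        if (*i >= len || (s[*i] & 0xC0) != 0x80) {
            return -1;                  /* truncated or malformed sequence */
        }
        cp = (cp << 6) | (s[(*i)++] & 0x3F);
    }
    if (cp < min_cp[n] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return -1;                      /* overlong, out of range, or surrogate */
    }
    return cp;
}

/* Placeholder for ICU's u_isalpha(): ASCII and Latin-1 Supplement letters only. */
static int is_letter_cp(int32_t cp)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) {
        return 1;
    }
    /* U+00C0..U+00FF are letters except the multiplication and division signs. */
    return cp >= 0xC0 && cp <= 0xFF && cp != 0xD7 && cp != 0xF7;
}

/* Placeholder for ICU's u_isdigit(): ASCII digits only. */
static int is_digit_cp(int32_t cp)
{
    return cp >= '0' && cp <= '9';
}

/* Hypothetical locale-free variant of is_avro_id(): a letter or '_' first,
 * then letters, digits or '_'. */
int is_avro_id_utf8(const char *name)
{
    const unsigned char *s = (const unsigned char *) name;
    size_t len, i = 0;
    int first = 1;

    if (name == NULL || name[0] == '\0') {
        return 0;
    }
    len = strlen(name);
    while (i < len) {
        int32_t cp = utf8_next(s, len, &i);
        if (cp < 0) {
            return 0;
        }
        if (!(is_letter_cp(cp) || cp == '_' || (!first && is_digit_cp(cp)))) {
            return 0;
        }
        first = 0;
    }
    return 1;
}
```

With these stubs, `is_avro_id_utf8("record_1")` is accepted and `is_avro_id_utf8("1record")` is rejected, matching the digit rule in the existing loop. Since nothing here touches `newlocale`, `uselocale` or `setlocale`, the same source should build on Windows; swapping the stubs for the ICU calls would extend the letter and digit checks to the full Unicode range.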



