JinwooHwang opened a new pull request, #7928: URL: https://github.com/apache/geode/pull/7928
## Fix lexical nondeterminism warning in OQL grammar between `ALL_UNICODE` and `DIGIT` rules

### Problem

The ANTLR grammar generation for the OQL (Object Query Language) parser was producing a **lexical nondeterminism warning** during builds.

This warning occurred in the `RegionNameCharacter` lexer rule due to overlapping character ranges between the `ALL_UNICODE` and `DIGIT` rules. The `ALL_UNICODE` rule was defined as a broad range (`'\u0061'..'\ufffd'`) that included all the Unicode digit ranges explicitly defined in the `DIGIT` rule, creating lexical ambiguity.

### Root Cause

When the lexer encountered Unicode digits (e.g., Arabic-Indic digits `٠-٩`, Devanagari digits `०-९`, etc.) in region names, it couldn't deterministically choose between:

- Matching them as part of `ALL_UNICODE`
- Matching them as `DIGIT` characters

This created nondeterminism between alternatives 1 (`ALL_UNICODE`) and 3 (`DIGIT`) in the `RegionNameCharacter` rule.

### Solution

Refactored the `ALL_UNICODE` rule to **explicitly exclude all Unicode digit ranges** defined in the `DIGIT` rule, eliminating character range overlap.
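
To make the range arithmetic behind this refactoring concrete, here is a minimal Python sketch. It is not the actual change to `oql.g`. The helper names (`overlaps`, `subtract`, `as_antlr`) are hypothetical, and `DIGIT_SUBSET` holds only the ASCII, Arabic-Indic, and Devanagari digit ranges as an illustrative subset of the real `DIGIT` rule.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) code points

# Broad range quoted in the PR description for the old ALL_UNICODE rule.
ALL_UNICODE: Range = (0x0061, 0xFFFD)

# ASSUMPTION: illustrative subset only; the real DIGIT rule lists more scripts.
DIGIT_SUBSET: List[Range] = [
    (0x0030, 0x0039),  # ASCII digits 0-9
    (0x0660, 0x0669),  # Arabic-Indic digits
    (0x0966, 0x096F),  # Devanagari digits
]


def overlaps(broad: Range, ranges: List[Range]) -> List[Range]:
    """Return the portions of `ranges` that fall inside `broad`."""
    lo, hi = broad
    return [(max(lo, a), min(hi, b)) for a, b in ranges if a <= hi and b >= lo]


def subtract(broad: Range, excluded: List[Range]) -> List[Range]:
    """Split `broad` into sub-ranges that skip every range in `excluded`."""
    lo, hi = broad
    result: List[Range] = []
    cursor = lo
    for a, b in sorted(excluded):
        if b < cursor or a > hi:
            continue  # this excluded range lies entirely outside what remains
        if a > cursor:
            result.append((cursor, a - 1))
        cursor = max(cursor, b + 1)
    if cursor <= hi:
        result.append((cursor, hi))
    return result


def as_antlr(ranges: List[Range]) -> str:
    """Render ranges in the '\\uXXXX'..'\\uXXXX' style used for lexer rules."""
    return " | ".join(f"'\\u{a:04x}'..'\\u{b:04x}'" for a, b in ranges)


if __name__ == "__main__":
    print("overlap before:", as_antlr(overlaps(ALL_UNICODE, DIGIT_SUBSET)))
    print("non-overlapping:", as_antlr(subtract(ALL_UNICODE, DIGIT_SUBSET)))
```

With this subset, the broad range splits into three pieces, one before, one between, and one after the two overlapping digit ranges. The ASCII digits sit below `'\u0061'` and never overlapped. The real `DIGIT` rule covers more scripts, so the refactored `ALL_UNICODE` rule presumably contains correspondingly more sub-ranges.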
This ensures:

- Unicode digits are only matched by the `DIGIT` rule
- `ALL_UNICODE` covers all other Unicode characters without overlap
- The lexer can deterministically choose the appropriate token type

### Impact

**Before:**

- ⚠️ Build generates lexical nondeterminism warnings
- ⚠️ Potential for inconsistent tokenization of Unicode digits in region names

**After:**

- ✅ Clean build without lexical warnings
- ✅ Deterministic tokenization of Unicode characters
- ✅ No functional impact on OQL query parsing
- ✅ Maintains full backward compatibility

### Testing

- Verified that `:geode-core:generateGrammarSource` completes without the lexical nondeterminism warning
- No impact on existing OQL functionality as this only affects internal lexer disambiguation
- Unicode digit handling in region names is now consistent and predictable

### Files Changed

- `oql.g`

> **Note:** This change only affects the internal lexer behavior and has no impact on OQL query syntax or semantics. All existing queries will continue to work exactly as before.

<!-- Thank you for submitting a contribution to Apache Geode. -->
<!-- In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: -->

### For all changes:

- [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [x] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
- [x] Is your initial contribution a single, squashed commit?
- [x] Does `gradlew build` run cleanly?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?

<!-- Note: Please ensure that once the PR is submitted, check Concourse for build issues and submit an update to your PR as soon as possible.
If you need help, please send an email to d...@geode.apache.org. -->