[
https://issues.apache.org/jira/browse/CALCITE-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938650#comment-17938650
]
Stamatis Zampetakis commented on CALCITE-6915:
----------------------------------------------
The motivation for this generalization stems from CALCITE-6916.
> Generalize terminology Linter to allow pattern based checks in commit messages
> ------------------------------------------------------------------------------
>
> Key: CALCITE-6915
> URL: https://issues.apache.org/jira/browse/CALCITE-6915
> Project: Calcite
> Issue Type: Improvement
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
>
> CALCITE-6493 added some checks that enforce certain terminology (mostly
> focused on DBMSs) in commit messages. However, various terms are still not
> captured by the existing checks. Consider, for instance, the ["snowflake"
> term|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L378]
> and the following messages:
> # Add support for Snowflake dialect
> # Add support for snowflake dialect
> # Add support for snowFlake dialect
> # Add support for SnowFlake dialect
> Normally, only the first commit message should be valid, since the accepted
> term is "Snowflake". The check correctly flags case 2 as invalid but
> fails to catch cases 3 and 4.
> The current implementation is based on an exact-match word pattern, which would
> require every single casing permutation of the word snowflake to be added to
> the
> [map|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L71].
> This already happens to some extent for the
> [MySQL|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L367]
> term, which appears twice in the map.
> In some cases, terminology rules may require more than just different casing
> rules. For this reason, I propose to generalize the terminology Linter to use
> a pattern-based definition that can capture more than one instance
> of a word, and to extend the reference term to be a Set instead of a single
> entry.
> Some secondary improvements from the proposed generalization are:
> * the use of pre-compiled patterns that are instantiated only once
> * the switch from Map to List as the container of the rules for faster
> iteration and better readability
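
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in LintTest or in the linked pull request: the names TerminologyLintSketch, TermRule, RULES and check are hypothetical, and the regular expressions are chosen just for this example. Each rule pairs a pattern that is compiled once with a Set of accepted spellings, and the rules are kept in a List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a pattern-based terminology check. */
class TerminologyLintSketch {

  /** One rule: a pattern that finds candidate words, plus the accepted spellings. */
  static final class TermRule {
    final Pattern pattern;      // compiled once, when the rule is created
    final Set<String> accepted; // every spelling that is considered valid

    TermRule(String regex, String... accepted) {
      // Case-insensitive, so one pattern covers all casing permutations.
      this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
      this.accepted =
          Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(accepted)));
    }
  }

  /** Rules live in a List; the entries here are assumptions for the example. */
  static final List<TermRule> RULES = Arrays.asList(
      new TermRule("\\bsnowflake\\b", "Snowflake"),
      new TermRule("\\bmysql\\b", "MySQL"));

  /** Returns one error per occurrence whose spelling is not in the accepted set. */
  static List<String> check(String message) {
    List<String> errors = new ArrayList<>();
    for (TermRule rule : RULES) {
      Matcher m = rule.pattern.matcher(message);
      while (m.find()) {
        String found = m.group();
        if (!rule.accepted.contains(found)) {
          errors.add("Found '" + found + "', expected one of " + rule.accepted);
        }
      }
    }
    return errors;
  }

  public static void main(String[] args) {
    String[] messages = {
        "Add support for Snowflake dialect",
        "Add support for snowflake dialect",
        "Add support for snowFlake dialect",
        "Add support for SnowFlake dialect",
    };
    for (String message : messages) {
      List<String> errors = check(message);
      System.out.println(message + " -> " + (errors.isEmpty() ? "ok" : errors));
    }
  }
}
```

With these assumed rules, only the first of the four messages passes; cases 2, 3 and 4 are each reported once, without listing every casing permutation explicitly.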
--
This message was sent by Atlassian Jira
(v8.20.10#820010)