[ 
https://issues.apache.org/jira/browse/CALCITE-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-6915:
------------------------------------
    Labels: pull-request-available  (was: )

> Generalize terminology Linter to allow pattern based checks in commit messages
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-6915
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6915
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>
> CALCITE-6493 added some checks for enforcing certain terminology (mostly 
> focused on DBMSs) in commit messages. However, various terms are still not 
> captured by the existing checks. Consider, for 
> instance, the ["snowflake" 
> term|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L378]
>  and the following messages:
>  # Add support for Snowflake dialect
>  # Add support for snowflake dialect
>  # Add support for snowFlake dialect
>  # Add support for SnowFlake dialect
> Normally, only the first commit message should be valid, since the accepted 
> term is "Snowflake". The check correctly flags case 2 as invalid but 
> fails to catch cases 3 and 4.
> The current implementation is based on an exact-match word pattern, so every 
> single casing permutation of the word "snowflake" would have to be added to 
> the 
> [map|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L71].
>  This already happens to some extent for the 
> [MySQL|https://github.com/apache/calcite/blob/bfbe8930f4ed7ba8da530e862e212a057191cfa3/core/src/test/java/org/apache/calcite/test/LintTest.java#L367]
>  term, which appears twice in the map.
> In some cases, terminology rules may require more than just different casing 
> rules. For this reason, I propose generalizing the terminology Linter to use 
> a pattern-based definition that can capture more than one form 
> of a word, and extending the reference term to be a Set instead of a single 
> entry.
> Some secondary improvements from the proposed generalization are:
>  * the use of pre-compiled patterns that are instantiated only once
>  * the switch from Map to List as the container of the rules for faster 
> iteration and better readability
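
A minimal sketch of what the proposed generalization might look like, written as a standalone class rather than as a patch to LintTest. Everything named here (TermLintSketch, TermRule, RULES, violations) is hypothetical and not taken from the Calcite code base, and the MySQL pattern is only an illustration of a rule that goes beyond casing. Each rule pairs a pre-compiled, case-insensitive pattern with the Set of accepted spellings, and the rules are kept in a List.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; the names do not come from Calcite's LintTest. */
public class TermLintSketch {

  /** One rule: a pattern that finds candidate terms, plus the accepted spellings. */
  static final class TermRule {
    final Pattern pattern;
    final Set<String> validTerms;

    TermRule(String regex, String... validTerms) {
      // Compiled once, when the rule is created.
      this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
      this.validTerms = Set.of(validTerms);
    }
  }

  // A List of rules instead of a Map keyed by each misspelling.
  static final List<TermRule> RULES = List.of(
      new TermRule("\\bsnowflake\\b", "Snowflake"),
      // Assumed example: also catches separated forms such as "My-SQL".
      new TermRule("\\bmy[ -]?sql\\b", "MySQL"));

  /** Returns every matched term in the message that is not an accepted spelling. */
  public static List<String> violations(String message) {
    List<String> bad = new ArrayList<>();
    for (TermRule rule : RULES) {
      Matcher m = rule.pattern.matcher(message);
      while (m.find()) {
        if (!rule.validTerms.contains(m.group())) {
          bad.add(m.group());
        }
      }
    }
    return bad;
  }
}
```

With these assumed rules, violations("Add support for Snowflake dialect") returns an empty list, while the snowflake, snowFlake and SnowFlake variants from the list above are each reported, without adding one map entry per casing.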



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
