[ https://issues.apache.org/jira/browse/NIFI-11858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905582#comment-17905582 ]
ASF subversion and git services commented on NIFI-11858:
--------------------------------------------------------
Commit 502572b2f5911685a5baf8ce70a4c8f5f90b668b in nifi's branch
refs/heads/main from ravisingh
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=502572b2f5 ]
NIFI-11858 Configurable Column Name Normalization in PutDatabaseRecord and
UpdateDatabaseTable
cleaned and required changes for https://github.com/apache/nifi/pull/8995
updated the description to reflect uppercase conversion of column name
uppercased to do case-insensitive matching irrespective of strategy
added example for REMOVE_ALL_SPECIAL_CHAR and PATTERN
Signed-off-by: Matt Burgess <[email protected]>
This closes #9382
> Improve column name normalization in PutDatabaseRecord processor
> ----------------------------------------------------------------
>
> Key: NIFI-11858
> URL: https://issues.apache.org/jira/browse/NIFI-11858
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: RAVINARAYAN SINGH
> Assignee: RAVINARAYAN SINGH
> Priority: Minor
> Time Spent: 9h 20m
> Remaining Estimate: 0h
>
> *Current Behavior:*
> When `column_translation` is set to true, the PutDatabaseRecord processor
> normalizes column names by removing every underscore ("_"). As a result,
> distinct names such as "Pic_1_11" and "Pic_11_1" both reduce to "Pic111" and
> are treated as the same column. This is undesirable in databases that allow
> underscores as valid characters in column names.
> *Proposed Improvements:*
> To address this issue and provide users with more control over the
> normalization process, we propose the following improvements:
> 1. Allow users to specify their own regular expression: instead of
> hard-coding the normalization behavior, let users pass a custom regular
> expression as the `column_translation` parameter. Advanced users can then
> define normalization rules that match their database requirements.
> 2. Predefined normalization options: for users who do not want to write
> their own regular expressions, provide a set of well-defined translation
> options, such as:
> a. REMOVE_UNDERSCORE: removes all underscores from the column names.
> b. REMOVE_ALL_SPECIAL_CHAR: removes all special characters
> (non-alphanumeric and non-space characters) from the column names.
> c. REMOVE_SPACE: removes all spaces from the column names.
> *Expected Behavior:*
> With these improvements, users gain more flexibility and control over
> column name normalization in the PutDatabaseRecord processor. They can
> either choose a predefined normalization option or supply a custom regular
> expression that suits their database.
> *Note:*
> This improvement will enhance the usability and compatibility of the
> PutDatabaseRecord processor with various database systems that have different
> rules for column name normalization.
>
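The normalization options described in the issue can be sketched roughly as
follows. This is a minimal illustration, not NiFi's actual implementation:
the names NormalizationStrategy and ColumnNameNormalizer, the method
signature, and the choice to strip characters before uppercasing are all
assumptions made for the example. Only the option names and the uppercasing
for case-insensitive matching come from the issue and the commit message.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical type: option names are taken from the issue and commit message.
enum NormalizationStrategy {
    REMOVE_UNDERSCORE,
    REMOVE_SPACE,
    REMOVE_ALL_SPECIAL_CHAR,
    PATTERN
}

// Hypothetical helper class, written only to illustrate the proposal.
final class ColumnNameNormalizer {
    private static final Pattern UNDERSCORE = Pattern.compile("_");
    private static final Pattern SPACE = Pattern.compile(" ");
    // The issue defines "special" as non-alphanumeric and non-space,
    // so spaces are kept and underscores are removed by this pattern.
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9 ]");

    private ColumnNameNormalizer() {
    }

    // customPattern is only consulted when strategy is PATTERN.
    static String normalize(String columnName, NormalizationStrategy strategy, Pattern customPattern) {
        final Pattern toRemove;
        switch (strategy) {
            case REMOVE_UNDERSCORE:
                toRemove = UNDERSCORE;
                break;
            case REMOVE_SPACE:
                toRemove = SPACE;
                break;
            case REMOVE_ALL_SPECIAL_CHAR:
                toRemove = SPECIAL;
                break;
            case PATTERN:
                if (customPattern == null) {
                    throw new IllegalArgumentException("PATTERN requires a custom pattern");
                }
                toRemove = customPattern;
                break;
            default:
                throw new IllegalStateException("Unknown strategy: " + strategy);
        }
        // Assumption: matches are removed first, then the result is uppercased
        // so that later comparisons are case-insensitive for every strategy.
        return toRemove.matcher(columnName).replaceAll("").toUpperCase(Locale.ROOT);
    }
}
```

With REMOVE_UNDERSCORE, "Pic_1_11" and "Pic_11_1" still collide as "PIC111",
which is the problem the issue reports. Choosing REMOVE_SPACE instead leaves
them distinct as "PIC_1_11" and "PIC_11_1".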