[ https://issues.apache.org/jira/browse/NIFI-11858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Burgess updated NIFI-11858:
--------------------------------
Status: Patch Available (was: Open)
> Improve column name normalization in PutDatabaseRecord processor
> ----------------------------------------------------------------
>
> Key: NIFI-11858
> URL: https://issues.apache.org/jira/browse/NIFI-11858
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: RAVINARAYAN SINGH
> Assignee: RAVINARAYAN SINGH
> Priority: Minor
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> *Current Behavior:*
> When `column_translation` is set to true, the PutDatabaseRecord processor
> strips every underscore ("_") from column names. Distinct names such as
> "Pic_1_11" and "Pic_11_1" then both reduce to "Pic111" and are treated as
> the same column, which is undesirable in databases that allow underscores
> as valid characters in column names.
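The collision described above can be reproduced with a minimal sketch. This is not the processor's actual code: `stripUnderscores` is a hypothetical stand-in that only assumes what the description states, namely that every underscore is removed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UnderscoreCollisionDemo {

    // Hypothetical stand-in for the normalization step: removes every
    // underscore, as the issue description says the processor does.
    static String stripUnderscores(String columnName) {
        return columnName.replace("_", "");
    }

    public static void main(String[] args) {
        // Maps each normalized name to the first original name that produced it.
        Map<String, String> seen = new LinkedHashMap<>();
        for (String column : List.of("Pic_1_11", "Pic_11_1")) {
            String normalized = stripUnderscores(column);
            String previous = seen.put(normalized, column);
            if (previous != null) {
                // Two distinct source names produced the same normalized name.
                System.out.println(previous + " and " + column
                        + " both normalize to " + normalized);
            }
        }
    }
}
```

Running it reports that "Pic_1_11" and "Pic_11_1" both normalize to "Pic111", so the two columns can no longer be told apart after translation.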
> *Proposed Improvements:*
> To address this issue and give users more control over normalization, we
> propose the following improvements:
> 1. Allow users to specify their own regular expression: instead of
> hard-coding the normalization behavior, let users pass a custom regular
> expression as the `column_translation` parameter, so that advanced users can
> define normalization rules that match their database requirements.
> 2. Predefined normalization options: for users who do not want to write
> their own regular expressions, we can provide a set of well-defined
> translation options, such as:
> a. REMOVE_UNDERSCORE: This option will remove all underscores from the
> column names.
> b. REMOVE_ALL_SPECIAL_CHAR: This option will remove all special characters
> (non-alphanumeric and non-space characters) from the column names.
> c. REMOVE_SPACE: This option will remove all spaces from the column names.
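One way the two proposals could fit together is sketched below. It is only an illustration under stated assumptions: the class `ColumnNameNormalizer`, the `Strategy` enum, and both method names are hypothetical and not part of NiFi. It also assumes that a custom regular expression marks the characters to remove, and that "alphanumeric" means ASCII letters and digits.

```java
import java.util.regex.Pattern;

class ColumnNameNormalizer {

    // Hypothetical strategy names mirroring the options listed above.
    enum Strategy {
        REMOVE_UNDERSCORE("_"),
        // Anything that is neither an ASCII letter, a digit, nor a space.
        REMOVE_ALL_SPECIAL_CHAR("[^a-zA-Z0-9 ]"),
        REMOVE_SPACE(" ");

        final Pattern pattern;

        Strategy(String regex) {
            this.pattern = Pattern.compile(regex);
        }
    }

    // Applies a predefined strategy: every match of its pattern is removed.
    static String normalize(String columnName, Strategy strategy) {
        return strategy.pattern.matcher(columnName).replaceAll("");
    }

    // Applies a user-supplied regular expression: every match is removed.
    static String normalizeWithRegex(String columnName, String customRegex) {
        return Pattern.compile(customRegex).matcher(columnName).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(normalize("Pic_1_11", Strategy.REMOVE_UNDERSCORE));
        System.out.println(normalize("unit_price($)", Strategy.REMOVE_ALL_SPECIAL_CHAR));
        System.out.println(normalize("unit price", Strategy.REMOVE_SPACE));
        // Custom rule: drop only a leading "tbl_" prefix, keep other underscores.
        System.out.println(normalizeWithRegex("tbl_Pic_1_11", "^tbl_"));
    }
}
```

With a custom rule such as `^tbl_`, "Pic_1_11" and "Pic_11_1" stay distinct because the remaining underscores are preserved, which is the case the hard-coded behavior cannot handle.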
> *Expected Behavior:*
> With these improvements, users gain more flexibility and control over
> normalization in the PutDatabaseRecord processor. They can either choose a
> predefined normalization option or supply their own regular expression to
> suit their database requirements.
> *Note:*
> This improvement will make the PutDatabaseRecord processor easier to use
> and more compatible with database systems that have different rules for
> column names.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)