RAVINARAYAN SINGH created NIFI-11858:
----------------------------------------

             Summary: Improve column name normalization in PutDatabaseRecord processor
                 Key: NIFI-11858
                 URL: https://issues.apache.org/jira/browse/NIFI-11858
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: RAVINARAYAN SINGH
            Assignee: RAVINARAYAN SINGH


**Current Behavior:**
When `column_translation` is set to true, the PutDatabaseRecord processor 
normalizes column names by removing every underscore ("_"). As a result, 
distinct names such as "Pic_1_11" and "Pic_11_1" collapse to the same 
normalized name and are treated as the same column. This is undesirable, 
especially for databases that allow underscores as valid characters in column 
names.
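
The collision can be reproduced with a minimal sketch. It assumes the 
normalization does nothing more than strip underscores, as described above; 
the class and method names are illustrative and are not taken from the NiFi 
source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnNameCollisionDemo {

    // Hypothetical stand-in for the current normalization: strip every underscore.
    static String normalize(String columnName) {
        return columnName.replace("_", "");
    }

    public static void main(String[] args) {
        // Maps each normalized name to the first original name that produced it.
        Map<String, String> byNormalizedName = new LinkedHashMap<>();
        for (String name : new String[] {"Pic_1_11", "Pic_11_1"}) {
            String key = normalize(name);
            String previous = byNormalizedName.putIfAbsent(key, name);
            if (previous != null) {
                // Two distinct column names produced the same normalized key.
                System.out.println(previous + " and " + name + " both normalize to " + key);
            }
        }
    }
}
```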

**Proposed Improvements:**
To address this issue and provide users with more control over the 
normalization process, we propose the following improvements:

1. Allow users to specify their own regular expression: Instead of hard-coding 
the normalization behavior, we can let users pass a custom regular expression 
as the `column_translation` parameter. Advanced users can then define 
normalization rules that match their database requirements.
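
A rough sketch of how a user-supplied pattern might be applied is shown below. 
It assumes that every match of the pattern is removed from the column name; 
`RegexColumnNormalizer` is a hypothetical name, not an existing NiFi class.

```java
import java.util.regex.Pattern;

public class RegexColumnNormalizer {

    private final Pattern pattern;

    // Compiling once up front surfaces an invalid pattern early
    // (Pattern.compile throws PatternSyntaxException) instead of per record.
    public RegexColumnNormalizer(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Assumption: every match of the user's pattern is replaced with an empty string.
    public String normalize(String columnName) {
        return pattern.matcher(columnName).replaceAll("");
    }

    public static void main(String[] args) {
        // Example pattern: strip whitespace and hyphens, but keep underscores.
        RegexColumnNormalizer normalizer = new RegexColumnNormalizer("[\\s-]");
        System.out.println(normalizer.normalize("Pic_1_11"));
        System.out.println(normalizer.normalize("Pic_11_1"));
        System.out.println(normalizer.normalize("order - id"));
    }
}
```

With a pattern like this, "Pic_1_11" and "Pic_11_1" stay distinct because 
underscores are left untouched.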

2. Predefined normalization options: For users who do not want to write their 
own regular expressions, we can provide a few well-defined translation 
options, such as:
   a. REMOVE_UNDERSCORE: Removes all underscores from the column names.
   b. REMOVE_ALL_SPECIAL_CHAR: Removes all special characters (characters that 
are neither alphanumeric nor spaces) from the column names.
   c. REMOVE_SPACE: Removes all spaces from the column names.
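
One possible shape for the predefined options is an enum in which each 
constant carries its own pattern. This is only a sketch: the enum name 
`TranslationStrategy` is hypothetical, and "alphanumeric" is assumed here to 
mean ASCII letters and digits, with "space" meaning the plain space character.

```java
import java.util.regex.Pattern;

public enum TranslationStrategy {

    // Removes every underscore.
    REMOVE_UNDERSCORE("_"),
    // Removes everything that is not an ASCII letter, an ASCII digit, or a space (assumption).
    REMOVE_ALL_SPECIAL_CHAR("[^A-Za-z0-9 ]"),
    // Removes every plain space character.
    REMOVE_SPACE(" ");

    private final Pattern pattern;

    TranslationStrategy(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Deletes every match of this option's pattern from the column name.
    public String apply(String columnName) {
        return pattern.matcher(columnName).replaceAll("");
    }
}
```

For example, `TranslationStrategy.REMOVE_SPACE.apply("Pic_1 11")` yields 
`Pic_111`, while `REMOVE_UNDERSCORE` and `REMOVE_ALL_SPECIAL_CHAR` both turn 
the same input into `Pic1 11`.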

**Expected Behavior:**
With these improvements, users will have more flexibility and control over the 
normalization process when using the PutDatabaseRecord processor. They can 
either choose a predefined normalization option or supply a custom regular 
expression to suit their specific database requirements.

**Note:**
This improvement will enhance the usability and compatibility of the 
PutDatabaseRecord processor with various database systems that have different 
rules for column name normalization.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
