Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory closed pull request #8493: NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord URL: https://github.com/apache/nifi/pull/8493 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1997607252 > > The last item to address it the `required()` status for the new property. As mentioned in the comment, it can be set to `true` since the default value provided maintains compatibility with existing configuration. > > Technically we could make it required, but we don't automatically do that in such cases. When a property is essential on a conceptual level, we make it required but leave it optional otherwise, even if it has a default value. > > This is more about UX but it makes sense. This can't be considered a conceptually essential parameter, like 'Table Name' for example. Let's leave it optional. Thanks for the reply. Although use of the property is optional in one respect, in that it does not apply if the record does not relate to a binary column, it is required in terms of understanding the default behavior. Right now, the default behavior is implicit, always using UTF-8 decoding. From another angle, making the property optional leaves the behavior ambiguous, because the property value could be an empty string. In that case, it could fall back to UTF-8, but that is less clear than marking the property as required. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1997388520 > The last item to address it the `required()` status for the new property. As mentioned in the comment, it can be set to `true` since the default value provided maintains compatibility with existing configuration. Technically we could make it required, but we don't automatically do that in such cases. When a property is essential on a conceptual level, we make it required but leave it optional otherwise, even if it has a default value. This is more about UX but it makes sense. This can't be considered a conceptually essential parameter, like 'Table Name' for example. Let's leave it optional. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
pvillard31 commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994840950 Yeah the option of opening another PR for 1.x is also OK :) As a FYI, from the Github UI, on the PR, you can change the branch against which the PR is submitted, that is why I suggested to use this one against 1.x and open a new one with the changes. But all good, the end result is the same ;) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994733725 > > @tpalfy - I think the easiest is to keep this PR for support/1.x, and open another PR targeted at main that does not use Google library. I think it's safer compared to letting the merger do the changes. Happy to merge once we're on the same page. > > This PR is based on and is open against `main` already. So I'll need to change this one and open a new one against 1.x. Following up here from the thread, yes, I recommend making the changes on this pull request to use the native Java components, and then create a separate PR for the support branch. Java 8 includes Base64 support, so that should be used there, and Apache Commons Codec provides a Hex utility class that can handle hexadecimal decoding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994675042 > @tpalfy - I think the easiest is to keep this PR for support/1.x, and open another PR targeted at main that does not use Google library. I think it's safer compared to letting the merger do the changes. Happy to merge once we're on the same page. This PR is based on and is open against `main` already. So I'll need to change this one and open a new one against 1.x. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
pvillard31 commented on PR #8493: URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994603126 @tpalfy - I think the easiest is to keep this PR for support/1.x, and open another PR targeted at main that does not use Google library. I think it's safer compared to letting the merger do the changes. Happy to merge once we're on the same page. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1523369229 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); +} else if (BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base16().decode(stringValue.toUpperCase()); Review Comment: Yes, I'm thinking about having this change on the 1.x branch. So not sure what is your suggestion. If we want it on 1.x branch is it okay to use the Google library on main and 1.x? Or you suggest it's worth the additional work to use vanilla Java on main and instead of simply doing a cherry-pick, the committer should do the code rewrite during backport to 1.x? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1523369229 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); +} else if (BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base16().decode(stringValue.toUpperCase()); Review Comment: Yes, I'm thinking about having this change on the 1.x branch. So not sure what is your suggestion. If we want it on 1.x branch is it okay to use the Google library? Or you suggest it's worth the additional work to use vanilla Java on main and instead of simply doing a cherry-pick, the committer should do the code rewrite during backport to 1.x? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1521755668 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor { .addValidator(StandardValidators.NON_EMPTY_VALIDATOR) .build(); +static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue( +"UTF8", +"UTF8", +"String values for binary columns contain the original value as text via UTF-8 character encoding" +); + +static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new AllowableValue( +"Hexadecimal", +"Hexadecimal", +"String values for binary columns contain the original value in hexadecimal format" +); + +static final AllowableValue BINARY_STRING_FORMAT_BASE64 = new AllowableValue( +"Base64", +"Base64", +"String values for binary columns contain the original value in Base64 encoded format" +); + +static final PropertyDescriptor BINARY_STRING_FORMAT = new Builder() +.name("put-db-record-binary-format") +.displayName("Binary String Format") +.description("The format to be applied when decoding string values to binary.") +.required(false) Review Comment: One additional note, since this new property has a default value that preserves current behavior, this property can be marked as required. ```suggestion .required(true) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1521652062 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); +} else if (BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base16().decode(stringValue.toUpperCase()); Review Comment: Java 8 support is not a requirement for the main branch, and HexFormat was introduced in Java 17, so this should be the direction for future integration. If the goal is to support this capability for the 1.x branch, then the feature could be backported and the Google library could be used. Going forward, however, we should make use of available Java capabilities as opposed to placing more dependencies on Google Guava or other libraries. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1521646045 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); +} else if (BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base16().decode(stringValue.toUpperCase()); Review Comment: I didn't see a convenient Java8 compatible way of using HexFormat. The Google dependency is already available, all things considered using it here for both Hex and Base64 is optimal in my opinion. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
exceptionfactory commented on code in PR #8493: URL: https://github.com/apache/nifi/pull/8493#discussion_r1521441171 ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); Review Comment: The standard Java [Base64](https://docs.oracle.com/en/java/javase/21/docs//api/java.base/java/util/Base64.html) class should be used instead of the the Google class. ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor { .addValidator(StandardValidators.NON_EMPTY_VALIDATOR) .build(); +static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue( +"UTF8", +"UTF8", +"String values for binary columns contain the original value as text via UTF-8 character encoding" +); + +static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new AllowableValue( +"Hex String", +"Hex String", +"String values for binary columns contain the original value in hexadecimal format" +); + +static final AllowableValue BINARY_STRING_FORMAT_BASE64 = new AllowableValue( +"Base64-encoded String", +"Base64-encoded String", +"String values for binary columns contain the original value in Base64 encoded format" +); + +static final PropertyDescriptor BINARY_STRING_FORMAT = new Builder() +.name("put-db-record-binary-format") +.displayName("Binary String Format") +.description("How to interpret string data for binary columns.") Review Comment: Recommend adjusting the description for clarity to indicate decoding. ```suggestion .description("The format to be applied when decoding string values to binary.") ``` ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, final ProcessSession sessi } currentValue = dest; } else if (currentValue instanceof String) { -currentValue = ((String) currentValue).getBytes(StandardCharsets.UTF_8); +final String stringValue = (String) currentValue; + +if (BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base64().decode(stringValue); +} else if (BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) { +currentValue = BaseEncoding.base16().decode(stringValue.toUpperCase()); Review Comment: The standard Java [HexFormat](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/HexFormat.html) class should be used instead of the Google class. ## nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java: ## @@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor { .addValidator(StandardValidators.NON_EMPTY_VALIDATOR) .build(); +static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue( +"UTF8", +"UTF8", +"String values for binary columns contain the original value as text via UTF-8 character encoding" +); + +static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new AllowableValue( +"Hex String", +"Hex String", Review Comment: Recommend using `Hexadecimal` to align with other options. ```suggestion "Hexadecimal",
[PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]
tpalfy opened a new pull request, #8493: URL: https://github.com/apache/nifi/pull/8493 Add option to treat Strings as hexadecimal character sequences or base64-encoded binary data when inserting into a binary type column. # Summary [NIFI-12887](https://issues.apache.org/jira/browse/NIFI-12887) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [ ] Pull Request based on current revision of the `main` branch - [ ] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 21 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org