Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-14 Thread via GitHub


exceptionfactory closed pull request #8493: NIFI-12887 Additional ways for 
handling Strings as binary in PutDatabaseRecord
URL: https://github.com/apache/nifi/pull/8493


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-14 Thread via GitHub


exceptionfactory commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1997607252

   > > The last item to address it the `required()` status for the new 
property. As mentioned in the comment, it can be set to `true` since the 
default value provided maintains compatibility with existing configuration.
   > 
   > Technically we could make it required, but we don't automatically do that 
in such cases. When a property is essential on a conceptual level, we make it 
required but leave it optional otherwise, even if it has a default value.
   > 
   > This is more about UX but it makes sense. This can't be considered a 
conceptually essential parameter, like 'Table Name' for example. Let's leave it 
optional.
   
   Thanks for the reply.
   
   Although use of the property is optional in one respect, in that it does not 
apply if the record does not relate to a binary column, it is required in terms 
of understanding the default behavior.
   
   Right now, the default behavior is implicit, always using UTF-8 decoding. 
From another angle, making the property optional leaves the behavior ambiguous, 
because the property value could be an empty string. In that case, it could 
fall back to UTF-8, but that is less clear than marking the property as 
required.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-14 Thread via GitHub


tpalfy commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1997388520

   > The last item to address it the `required()` status for the new property. 
As mentioned in the comment, it can be set to `true` since the default value 
provided maintains compatibility with existing configuration.
   
   Technically we could make it required, but we don't automatically do that in 
such cases. When a property is essential on a conceptual level, we make it 
required but leave it optional otherwise, even if it has a default value.
   
   This is more about UX but it makes sense. This can't be considered a 
conceptually essential parameter, like 'Table Name' for example. Let's leave it 
optional.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


pvillard31 commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994840950

   Yeah the option of opening another PR for 1.x is also OK :) As a FYI, from 
the Github UI, on the PR, you can change the branch against which the PR is 
submitted, that is why I suggested to use this one against 1.x and open a new 
one with the changes. But all good, the end result is the same ;)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


exceptionfactory commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994733725

   > > @tpalfy - I think the easiest is to keep this PR for support/1.x, and 
open another PR targeted at main that does not use Google library. I think it's 
safer compared to letting the merger do the changes. Happy to merge once we're 
on the same page.
   > 
   > This PR is based on and is open against `main` already. So I'll need to 
change this one and open a new one against 1.x.
   
   Following up here from the thread, yes, I recommend making the changes on 
this pull request to use the native Java components, and then create a separate 
PR for the support branch. Java 8 includes Base64 support, so that should be 
used there, and Apache Commons Codec provides a Hex utility class that can 
handle hexadecimal decoding.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


tpalfy commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994675042

   > @tpalfy - I think the easiest is to keep this PR for support/1.x, and open 
another PR targeted at main that does not use Google library. I think it's 
safer compared to letting the merger do the changes. Happy to merge once we're 
on the same page.
   
   This PR is based on and is open against `main` already. So I'll need to 
change this one and open a new one against 1.x.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


pvillard31 commented on PR #8493:
URL: https://github.com/apache/nifi/pull/8493#issuecomment-1994603126

   @tpalfy - I think the easiest is to keep this PR for support/1.x, and open 
another PR targeted at main that does not use Google library. I think it's 
safer compared to letting the merger do the changes. Happy to merge once we're 
on the same page.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


tpalfy commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1523369229


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);
+} else if 
(BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base16().decode(stringValue.toUpperCase());

Review Comment:
   Yes, I'm thinking about having this change on the 1.x branch.
   
   So not sure what is your suggestion. If we want it on 1.x branch is it okay 
to use the Google library on main and 1.x?
   
   Or you suggest it's worth the additional work to use vanilla Java on main 
and instead of simply doing a cherry-pick, the committer should do the code 
rewrite during backport to 1.x?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-13 Thread via GitHub


tpalfy commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1523369229


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);
+} else if 
(BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base16().decode(stringValue.toUpperCase());

Review Comment:
   Yes, I'm thinking about having this change on the 1.x branch.
   
   So not sure what is your suggestion. If we want it on 1.x branch is it okay 
to use the Google library?
   
   Or you suggest it's worth the additional work to use vanilla Java on main 
and instead of simply doing a cherry-pick, the committer should do the code 
rewrite during backport to 1.x?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-12 Thread via GitHub


exceptionfactory commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1521755668


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor {
 .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
 .build();
 
+static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue(
+"UTF8",
+"UTF8",
+"String values for binary columns contain the original value as 
text via UTF-8 character encoding"
+);
+
+static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new 
AllowableValue(
+"Hexadecimal",
+"Hexadecimal",
+"String values for binary columns contain the original value in 
hexadecimal format"
+);
+
+static final AllowableValue BINARY_STRING_FORMAT_BASE64 = new 
AllowableValue(
+"Base64",
+"Base64",
+"String values for binary columns contain the original value in 
Base64 encoded format"
+);
+
+static final PropertyDescriptor BINARY_STRING_FORMAT = new Builder()
+.name("put-db-record-binary-format")
+.displayName("Binary String Format")
+.description("The format to be applied when decoding string values 
to binary.")
+.required(false)

Review Comment:
   One additional note, since this new property has a default value that 
preserves current behavior, this property can be marked as required.
   ```suggestion
   .required(true)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-12 Thread via GitHub


exceptionfactory commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1521652062


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);
+} else if 
(BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base16().decode(stringValue.toUpperCase());

Review Comment:
   Java 8 support is not a requirement for the main branch, and HexFormat was 
introduced in Java 17, so this should be the direction for future integration. 
If the goal is to support this capability for the 1.x branch, then the feature 
could be backported and the Google library could be used. Going forward, 
however, we should make use of available Java capabilities as opposed to 
placing more dependencies on Google Guava or other libraries.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-12 Thread via GitHub


tpalfy commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1521646045


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);
+} else if 
(BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base16().decode(stringValue.toUpperCase());

Review Comment:
   I didn't see a convenient Java8 compatible way of using HexFormat. The 
Google dependency is already available, all things considered using it here for 
both Hex and Base64 is optimal in my opinion. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-12 Thread via GitHub


exceptionfactory commented on code in PR #8493:
URL: https://github.com/apache/nifi/pull/8493#discussion_r1521441171


##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);

Review Comment:
   The standard Java 
[Base64](https://docs.oracle.com/en/java/javase/21/docs//api/java.base/java/util/Base64.html)
 class should be used instead of the the Google class.



##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor {
 .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
 .build();
 
+static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue(
+"UTF8",
+"UTF8",
+"String values for binary columns contain the original value as 
text via UTF-8 character encoding"
+);
+
+static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new 
AllowableValue(
+"Hex String",
+"Hex String",
+"String values for binary columns contain the original value in 
hexadecimal format"
+);
+
+static final AllowableValue BINARY_STRING_FORMAT_BASE64 = new 
AllowableValue(
+"Base64-encoded String",
+"Base64-encoded String",
+"String values for binary columns contain the original value in 
Base64 encoded format"
+);
+
+static final PropertyDescriptor BINARY_STRING_FORMAT = new Builder()
+.name("put-db-record-binary-format")
+.displayName("Binary String Format")
+.description("How to interpret string data for binary columns.")

Review Comment:
   Recommend adjusting the description for clarity to indicate decoding.
   ```suggestion
   .description("The format to be applied when decoding string 
values to binary.")
   ```



##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -764,7 +796,15 @@ private void executeDML(final ProcessContext context, 
final ProcessSession sessi
 }
 currentValue = dest;
 } else if (currentValue instanceof 
String) {
-currentValue = ((String) 
currentValue).getBytes(StandardCharsets.UTF_8);
+final String stringValue = 
(String) currentValue;
+
+if 
(BINARY_STRING_FORMAT_BASE64.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base64().decode(stringValue);
+} else if 
(BINARY_STRING_FORMAT_HEX_STRING.getValue().equals(binaryStringFormat)) {
+currentValue = 
BaseEncoding.base16().decode(stringValue.toUpperCase());

Review Comment:
   The standard Java 
[HexFormat](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/HexFormat.html)
 class should be used instead of the Google class.



##
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/PutDatabaseRecord.java:
##
@@ -238,6 +239,34 @@ public class PutDatabaseRecord extends AbstractProcessor {
 .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
 .build();
 
+static final AllowableValue BINARY_STRING_FORMAT_UTF8 = new AllowableValue(
+"UTF8",
+"UTF8",
+"String values for binary columns contain the original value as 
text via UTF-8 character encoding"
+);
+
+static final AllowableValue BINARY_STRING_FORMAT_HEX_STRING = new 
AllowableValue(
+"Hex String",
+"Hex String",

Review Comment:
   Recommend using `Hexadecimal` to align with other options.
   ```suggestion
   "Hexadecimal",
   

[PR] NIFI-12887 Additional ways for handling Strings as binary in PutDatabaseRecord [nifi]

2024-03-12 Thread via GitHub


tpalfy opened a new pull request, #8493:
URL: https://github.com/apache/nifi/pull/8493

   Add option to treat Strings as hexadecimal character sequences or 
base64-encoded binary data when inserting into a binary type column.
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   # Summary
   
   [NIFI-12887](https://issues.apache.org/jira/browse/NIFI-12887)
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue 
created
   
   ### Pull Request Tracking
   
   - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as 
`NIFI-0`
   - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, 
as such `NIFI-0`
   
   ### Pull Request Formatting
   
   - [ ] Pull Request based on current revision of the `main` branch
   - [ ] Pull Request refers to a feature branch with one commit containing 
changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request 
creation.
   
   ### Build
   
   - [ ] Build completed using `mvn clean install -P contrib-check`
 - [ ] JDK 21
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 
2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License 
Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` 
files
   
   ### Documentation
   
   - [ ] Documentation formatting appears as expected in rendered files
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org