Re: [PR] NIFI-12017 add ability to choose to output to single line for base32 base64 contents [nifi]

via GitHub Mon, 04 Mar 2024 07:23:11 -0800


dan-s1 commented on code in PR #8417:
URL: https://github.com/apache/nifi/pull/8417#discussion_r1511339486



##########
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/EncodeContent.java:
##########
@@ -29,55 +37,88 @@
 import org.apache.nifi.annotation.documentation.CapabilityDescription;
 import org.apache.nifi.annotation.documentation.Tags;
 import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.Validator;
+import org.apache.nifi.expression.ExpressionLanguageScope;
 import org.apache.nifi.flowfile.FlowFile;
 import org.apache.nifi.processor.AbstractProcessor;
 import org.apache.nifi.processor.ProcessContext;
 import org.apache.nifi.processor.ProcessSession;
 import org.apache.nifi.processor.Relationship;
 import org.apache.nifi.processor.io.StreamCallback;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.processors.standard.encoding.EncodingMode;
+import org.apache.nifi.processors.standard.encoding.EncodingType;
+import org.apache.nifi.processors.standard.encoding.LineOutputMode;
 import org.apache.nifi.processors.standard.util.ValidatingBase32InputStream;
 import org.apache.nifi.processors.standard.util.ValidatingBase64InputStream;
 import org.apache.nifi.stream.io.StreamUtils;
 import org.apache.nifi.util.StopWatch;
 
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-import java.util.Arrays;
-import java.util.List;
-import java.util.Set;
-import java.util.concurrent.TimeUnit;
-
 @SideEffectFree
 @SupportsBatching
 @InputRequirement(Requirement.INPUT_REQUIRED)
 @Tags({"encode", "decode", "base64", "base32", "hex"})
 @CapabilityDescription("Encode or decode the contents of a FlowFile using 
Base64, Base32, or hex encoding schemes")
 public class EncodeContent extends AbstractProcessor {
 
-    public static final String ENCODE_MODE = "Encode";
-    public static final String DECODE_MODE = "Decode";
-
-    public static final String BASE64_ENCODING = "base64";
-    public static final String BASE32_ENCODING = "base32";
-    public static final String HEX_ENCODING = "hex";
-
     public static final PropertyDescriptor MODE = new 
PropertyDescriptor.Builder()
             .name("Mode")
             .description("Specifies whether the content should be encoded or 
decoded")
             .required(true)
-            .allowableValues(ENCODE_MODE, DECODE_MODE)
-            .defaultValue(ENCODE_MODE)
+            .allowableValues(EncodingMode.class)
+            .defaultValue(EncodingMode.ENCODE)
             .build();
 
     public static final PropertyDescriptor ENCODING = new 
PropertyDescriptor.Builder()
             .name("Encoding")
             .description("Specifies the type of encoding used")
             .required(true)
-            .allowableValues(BASE64_ENCODING, BASE32_ENCODING, HEX_ENCODING)
-            .defaultValue(BASE64_ENCODING)
+            .allowableValues(EncodingType.class)
+            .defaultValue(EncodingType.BASE64_ENCODING)
+            .build();
+
+    static final PropertyDescriptor LINE_OUTPUT_MODE = new 
PropertyDescriptor.Builder()
+            .name("Line Output Mode")
+            .displayName("Line Output Mode")
+            .description("If set to 'single-line', the encoded FlowFile 
content will output as a single line. If set to 'multiple-lines', "
+                + "it will output as multiple lines. This property is only 
applicable when Base64 or Base32 encoding is selected.")
+            .required(false)
+            .defaultValue(LineOutputMode.SINGLE_LINE)
+            .allowableValues(LineOutputMode.class)
+            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+            .dependsOn(MODE, EncodingMode.ENCODE)
+            .dependsOn(ENCODING, EncodingType.BASE64_ENCODING, 
EncodingType.BASE32_ENCODING)
             .build();
 
+    static final PropertyDescriptor ENCODED_LINE_SEPARATOR = new 
PropertyDescriptor.Builder()
+        .name("Encoded Content Line Separator")
+        .displayName("Encoded Content Line Separator")
+        .description("Each line of encoded data will be terminated with this 
byte sequence (e.g. \\r\\n"
+                + "). This property defaults to the system-dependent line 
separator string.  If `line-length` <= 0, "
+                + "the `line-separator` property is not used. This property is 
not used for `hex` encoding.")
+        .required(false)
+        
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
+        .defaultValue(System.lineSeparator())
+        .addValidator(Validator.VALID)
+        .dependsOn(MODE, EncodingMode.ENCODE)
+        .dependsOn(ENCODING, EncodingType.BASE64_ENCODING, 
EncodingType.BASE32_ENCODING)
+        .build();

Review Comment:
   @knguyen1 Thanks for pointing out those classes.
   @exceptionfactory  Based on the above classes there is another property 
which I am not sure should be exposed or not and that is the decoding policy. 
There are two options lenient and strict. The default is lenient. The wording 
from the `Base32OutputStream `Javadoc is
   
   > You can set the decoding behavior when the input bytes contain leftover 
trailing bits that cannot be created by a valid encoding. These can be bits 
that are unused from the final character or entire characters. The default mode 
is lenient decoding.
   > 
   > Lenient: Any trailing bits are composed into 8-bit bytes where possible. 
The remainder are discarded.
   > Strict: The decoding will raise an 
[IllegalArgumentException](https://docs.oracle.com/javase/8/docs/api/java/lang/IllegalArgumentException.html)
 if trailing bits are not part of a valid encoding. Any unused bits from the 
final character must be zero. Impossible counts of entire final characters are 
not allowed.
   > When strict decoding is enabled it is expected that the decoded bytes will 
be re-encoded to a byte array that matches the original, i.e. no changes occur 
on the final character. This requires that the input bytes use the same padding 
and alphabet as the encoder.
   
   Do we just default always to lenient or do we give users the ability to 
choose 'Strict'?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] NIFI-12017 add ability to choose to output to single line for base32 base64 contents [nifi]

Reply via email to