tpalfy commented on code in PR #6018:
URL: https://github.com/apache/nifi/pull/6018#discussion_r865854290


##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeReader.java:
##########
@@ -95,6 +98,18 @@ public class JsonTreeReader extends SchemaRegistryService 
implements RecordReade
             .dependsOn(STARTING_FIELD_STRATEGY, 
StartingFieldStrategy.NESTED_FIELD.name())
             .build();
 
+    public static final PropertyDescriptor SCHEMA_APPLICATION_STRATEGY = new 
PropertyDescriptor.Builder()
+            .name("schema-application-strategy")
+            .displayName("Schema Application Strategy")
+            .description("Specifies whether the schema is defined for the 
whole JSON or for the selected part starting from \"Starting Field Name\".")
+            .required(true)
+            .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
+            .defaultValue(SchemaApplicationStrategy.SELECTED_PART.getValue())
+            .dependsOn(STARTING_FIELD_STRATEGY, 
StartingFieldStrategy.NESTED_FIELD.name())
+            .dependsOn(SCHEMA_ACCESS_STRATEGY, SCHEMA_NAME_PROPERTY, 
SCHEMA_TEXT_PROPERTY, HWX_SCHEMA_REF_ATTRIBUTES, HWX_CONTENT_ENCODED_SCHEMA, 
CONFLUENT_ENCODED_SCHEMA)

Review Comment:
   Why do we need to depend on SCHEMA_ACCESS_STRATEGY?



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeReader.java:
##########
@@ -95,6 +98,18 @@ public class JsonTreeReader extends SchemaRegistryService 
implements RecordReade
             .dependsOn(STARTING_FIELD_STRATEGY, 
StartingFieldStrategy.NESTED_FIELD.name())
             .build();
 
+    public static final PropertyDescriptor SCHEMA_APPLICATION_STRATEGY = new 
PropertyDescriptor.Builder()

Review Comment:
   Suggestion: I would call this SCHEMA_SCOPE. I feel "Application Strategy" is 
a bit too technical.



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/json/TestJsonTreeRowRecordReader.java:
##########
@@ -1128,29 +1140,46 @@ void testStartFromNestedFieldThenStartObject() throws 
IOException, MalformedReco
                 }})
         );
 
-        testReadRecords(jsonPath, expectedRecordSchema, expected, 
StartingFieldStrategy.NESTED_FIELD, "accounts");
+        testReadRecords(jsonPath, expectedRecordSchema, expected, 
StartingFieldStrategy.NESTED_FIELD,
+                "accounts", SchemaApplicationStrategy.SELECTED_PART);
+    }
+
+    @Test
+    void testStartsFromNestedObjectWithWholeJsonSchemaApplicationStrategy() 
throws IOException, MalformedRecordException {
+        String jsonPath = "src/test/resources/json/single-element-nested.json";
+
+        RecordSchema recordSchema = getSchema();
+
+        RecordSchema expectedRecordSchema = getAccountSchema();
+
+        List<Object> expected = Collections.singletonList(
+                new MapRecord(expectedRecordSchema, new HashMap<String, 
Object>() {{
+                    put("id", 42);
+                    put("balance", 4750.89);
+                }})
+        );
+
+        testReadRecords(jsonPath, recordSchema, expected, 
StartingFieldStrategy.NESTED_FIELD,
+                "account", SchemaApplicationStrategy.WHOLE_JSON);
+
     }

Review Comment:
   Unless the test data creation is really complicated, tests are easier to 
understand if the data creation is inlined (not done via extracted methods).
   
   Doing that, one can check the input, the metadata (schema in this case) and 
the expected output at one place.
   (Input is actually already extracted to file but I guess we can leave that.)
   
   It also helps avoiding making errors/confusions like in this case: the full 
schema is created with an "account" field and a "balance" field - even though 
_there is no_ balance field in the data at the root level.
   
   I would suggest this method done like this:
   ```java
       @Test
       void testStartsFromNestedObjectWithWholeJsonSchemaScope() throws 
IOException, MalformedRecordException {
           String jsonPath = 
"src/test/resources/json/single-element-nested.json";
   
           RecordSchema accountSchema = new SimpleRecordSchema(Arrays.asList(
               new RecordField("id", RecordFieldType.INT.getDataType()),
               new RecordField("balance", RecordFieldType.DOUBLE.getDataType())
           ));
   
           RecordSchema recordSchema = new SimpleRecordSchema(Arrays.asList(
               new RecordField("account", 
RecordFieldType.RECORD.getRecordDataType(accountSchema))
           ));
   
           RecordSchema expectedRecordSchema = accountSchema;
   
           List<Object> expected = Collections.singletonList(
                   new MapRecord(expectedRecordSchema, new HashMap<String, 
Object>() {{
                       put("id", 42);
                       put("balance", 4750.89);
                   }})
           );
   
           testReadRecords(jsonPath, recordSchema, expected, 
StartingFieldStrategy.NESTED_FIELD,
                   "account", SchemaScope.WHOLE_JSON);
       }
   ```
   Also I would add a case where the nested field is an array:
   ```java
       @Test
       void testStartsFromNestedArrayWithWholeJsonSchemaScope() throws 
IOException, MalformedRecordException {
           String jsonPath = 
"src/test/resources/json/single-element-nested-array.json";
   
           RecordSchema accountSchema = new SimpleRecordSchema(Arrays.asList(
               new RecordField("id", RecordFieldType.INT.getDataType()),
               new RecordField("balance", RecordFieldType.DOUBLE.getDataType())
           ));
   
           RecordSchema recordSchema = new SimpleRecordSchema(Arrays.asList(
               new RecordField("accounts", 
RecordFieldType.ARRAY.getArrayDataType(RecordFieldType.RECORD.getRecordDataType(accountSchema)))
           ));
   
           RecordSchema expectedRecordSchema = accountSchema;
   
           List<Object> expected = Arrays.asList(
                   new MapRecord(expectedRecordSchema, new HashMap<String, 
Object>() {{
                       put("id", 42);
                       put("balance", 4750.89);
                   }}),
                   new MapRecord(expectedRecordSchema, new HashMap<String, 
Object>() {{
                       put("id", 43);
                       put("balance", 48212.38);
                   }})
           );
   
           testReadRecords(jsonPath, recordSchema, expected, 
StartingFieldStrategy.NESTED_FIELD,
                   "accounts", SchemaScope.WHOLE_JSON);
       }
   ```



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/SchemaApplicationStrategy.java:
##########
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.json;
+
+import org.apache.nifi.components.DescribedValue;
+
+public enum SchemaApplicationStrategy implements DescribedValue {
+    WHOLE_JSON {
+        @Override
+        public String getDisplayName() {
+            return "Whole JSON";
+        }
+
+        @Override
+        public String getDescription() {
+            return "Applies the schema for the whole JSON.";
+        }
+    },
+    SELECTED_PART {
+        @Override
+        public String getDisplayName() {
+            return "Selected Part";
+        }
+
+        @Override
+        public String getDescription() {
+            return "Applies the schema for the selected part starting from the 
\"Starting Field Name\".";
+        }
+    };
+
+    @Override
+    public String getValue() {
+        return name();
+    }
+}

Review Comment:
   ```suggestion
   public enum SchemaApplicationStrategy implements DescribedValue {
       WHOLE_JSON(
           "Whole JSON",
           "Applies the schema for the whole JSON."
       ),
       SELECTED_PART(
           "Selected Part",
           "Applies the schema for the selected part starting from the 
\"Starting Field Name\"."
       );
   
       private final String displayName;
       private final String description;
   
       SchemaApplicationStrategy(String displayName, String description) {
           this.displayName = displayName;
           this.description = description;
       }
   
       @Override
       public String getDisplayName() {
           return displayName;
       }
   
       @Override
       public String getDescription() {
           return description;
       }
   
       @Override
       public String getValue() {
           return name();
       }
   }
   ```



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/JsonTreeRowRecordReader.java:
##########
@@ -48,26 +48,48 @@ public class JsonTreeRowRecordReader extends 
AbstractJsonRowRecordReader {
 
 
     public JsonTreeRowRecordReader(final InputStream in, final ComponentLog 
logger, final RecordSchema schema,

Review Comment:
   I see with [NIFI-9862](https://issues.apache.org/jira/browse/NIFI-9862) we 
added independent constructors for both _JsonTreeReader_ and 
_AbstractJsonRowRecordReader_.
   
   That is ill-advised. A constructor hierarchy should have as little branching 
as possible, preferably none.
   I recommend the following changes:
   
   **JsonTreeRowRecordReader**
   ```java
       public JsonTreeRowRecordReader(final InputStream in, final ComponentLog 
logger, final RecordSchema schema,
                                      final String dateFormat, final String 
timeFormat, final String timestampFormat) throws IOException, 
MalformedRecordException {
           this(in, logger, schema, dateFormat, timeFormat, timestampFormat, 
null, null, null);
       }
   ```
   instead of
   ```java
       public JsonTreeRowRecordReader(final InputStream in, final ComponentLog 
logger, final RecordSchema schema,
                                      final String dateFormat, final String 
timeFormat, final String timestampFormat) throws IOException, 
MalformedRecordException {
           super(in, logger, dateFormat, timeFormat, timestampFormat);
           this.schema = schema;
       }
   ```
   
   **AbstractJsonRowRecordReader**
   ```java
       protected AbstractJsonRowRecordReader(final InputStream in, final 
ComponentLog logger, final String dateFormat, final String timeFormat, final 
String timestampFormat)
               throws IOException, MalformedRecordException {
   
           this(in, logger, dateFormat, timeFormat, timestampFormat, null, 
null);
       }
   ```
   instead of
   ```java
       protected AbstractJsonRowRecordReader(final InputStream in, final 
ComponentLog logger, final String dateFormat, final String timeFormat, final 
String timestampFormat)
               throws IOException, MalformedRecordException {
   
           this(logger, dateFormat, timeFormat, timestampFormat);
   
           try {
               jsonParser = jsonFactory.createParser(in);
               jsonParser.setCodec(codec);
   
               JsonToken token = jsonParser.nextToken();
               if (token == JsonToken.START_ARRAY) {
                   token = jsonParser.nextToken(); // advance to START_OBJECT 
token
               }
   
               if (token == JsonToken.START_OBJECT) { // could be END_ARRAY also
                   firstJsonNode = jsonParser.readValueAsTree();
               } else {
                   firstJsonNode = null;
               }
           } catch (final JsonParseException e) {
               throw new MalformedRecordException("Could not parse data as 
JSON", e);
           }
       }
   ```



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/json/StartingFieldStrategy.java:
##########
@@ -16,23 +16,33 @@
  */
 package org.apache.nifi.json;
 
-public enum StartingFieldStrategy {
-    ROOT_NODE("Root Node", "Begins processing from the root node."),
-    NESTED_FIELD("Nested Field", "Skips forward to the given nested JSON field 
(array or object) to begin processing.");
+import org.apache.nifi.components.DescribedValue;
 
-    private final String displayName;
-    private final String description;
+public enum StartingFieldStrategy implements DescribedValue {

Review Comment:
   ```suggestion
   public enum StartingFieldStrategy implements DescribedValue {
       ROOT_NODE("Root Node", "Begins processing from the root node."),
       NESTED_FIELD("Nested Field", "Skips forward to the given nested JSON 
field (array or object) to begin processing.");
   
       private final String displayName;
       private final String description;
   
       StartingFieldStrategy(final String displayName, final String 
description) {
           this.displayName = displayName;
           this.description = description;
       }
   
       public String getDisplayName() {
           return displayName;
       }
   
       public String getDescription() {
           return description;
       }
   
       @Override
       public String getValue() {
           return name();
       }
   }
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to