[ 
https://issues.apache.org/jira/browse/NIFI-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886331#comment-17886331
 ] 

Sandro Berger commented on NIFI-7697:
-------------------------------------

We are running NiFi 1.23.3 with XMLReader 1.23.2, and the problem with empty 
XML structures is still present.

We are also trying to convert XML to JSON, but an empty XML tag is lost 
unless a sibling tag contains data.
Example:
{code:xml}
<Event>
   <System>
      <EventRecordID>3818553431</EventRecordID>
      <Correlation/>
      <Event>
        <ID>4664</ID>
        <Level/>
      </Event>
   </System>
   <System>
      <EventRecordID>3818553432</EventRecordID>
      <Correlation/>
      <Event>
        <ID/>
        <Level/>
      </Event>
   </System>
</Event>{code}
ConvertRecord with XMLReader and JsonRecordSetWriter results in:
{code:json}
[ {
  "System" : [ {
    "EventRecordID" : 3818553431,
    "Correlation" : null,
    "Execution" : null,
    "Security" : null,
    "Event" : {
      "ID" : 4664,
      "Level" : null
    }
  }, {
    "EventRecordID" : 3818553432,
    "Correlation" : null,
    "Execution" : null,
    "Security" : null,
    "Event" : null
  } ]
} ]{code}
In the second record, {{Event}} collapses to null, so its {{ID}} and {{Level}} tags are lost.

We would like to use JOLT to modify the data and then convert back to XML. 
Therefore we need all these tags in JSON as null values to get the empty XML 
tags again when converting from JSON to XML.
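
To make the expected behaviour concrete, here is a minimal sketch that uses 
only the JDK's DOM parser. It is not how XMLReader works internally, and the 
names {{EmptyElementSketch}}, {{parse}} and {{convert}} are made up for this 
illustration. It assumes elements without mixed content and ignores 
attributes. An empty element becomes a null value under its own key instead 
of causing the parent to collapse:
{code:java}
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustration only: hypothetical helper, not part of NiFi.
public class EmptyElementSketch {

    // Parses an XML string and converts the root element's content.
    static Object parse(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        return convert(root);
    }

    // Returns null for an empty element, a String for a text-only element,
    // and a Map (child name -> value) for an element with child elements.
    // Repeated child names are collected into a List.
    @SuppressWarnings("unchecked")
    static Object convert(Element element) {
        Map<String, Object> children = new LinkedHashMap<>();
        boolean hasChildElements = false;
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and text between child elements
            }
            hasChildElements = true;
            String name = n.getNodeName();
            Object value = convert((Element) n);
            if (!children.containsKey(name)) {
                // First occurrence: keep the key even when the value is null.
                children.put(name, value);
            } else if (children.get(name) instanceof List) {
                ((List<Object>) children.get(name)).add(value);
            } else {
                // Second occurrence: turn the single value into a list.
                List<Object> list = new ArrayList<>();
                list.add(children.get(name));
                list.add(value);
                children.put(name, list);
            }
        }
        if (hasChildElements) {
            return children;
        }
        String text = element.getTextContent().trim();
        return text.isEmpty() ? null : text;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<System><EventRecordID>3818553432</EventRecordID>"
                + "<Correlation/><Event><ID/><Level/></Event></System>";
        // Event stays a map with null ID and Level instead of becoming null.
        System.out.println(parse(xml));
    }
}
{code}
With output shaped like this, a JSON-to-XML step can emit {{<ID/>}} and 
{{<Level/>}} again because both keys are still present.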

> NiFi XMLReader Record Component sometimes ignores empty XML Elements
> --------------------------------------------------------------------
>
>                 Key: NIFI-7697
>                 URL: https://issues.apache.org/jira/browse/NIFI-7697
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.11.4
>         Environment: Windows 10
>            Reporter: Andrew Chafos
>            Priority: Major
>              Labels: ControllerService, Processor, Record
>
> I am currently developing a processor for Apache NiFi that is contingent upon 
> being configured with an implementation of RecordReaderFactory that produces 
> well-formed NiFi Records based on input data.
> The JsonTreeReader component produced accurate results for all of my test 
> cases.  However, I noticed that, at least with the default configuration, the 
> XMLReader component sometimes seems to mishandle data; namely, empty XML 
> elements that are sub-children of XML elements that are represented as Arrays 
> in NiFi Records.
> This occurs when I test using the standard ConvertRecord NiFi Processor and 
> set the Reader to XMLReader and the Writer to JsonRecordSetWriter.
> These first 2 test cases work as expected:
> *Test Case 1:*
> Input XML:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Root>
>    <DataArr>SomeData</DataArr>
>    <DataArr>
>       <Field>
>          <NonEmptyField>2</NonEmptyField>
>       </Field>
>    </DataArr>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>    {
>       "DataArr":[
>          "SomeData",
>          "MapRecord[{Field=MapRecord[{NonEmptyField=2}]}]"
>       ]
>    }
> ]
> {code}
> *Test Case 2:*
> Input XML:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Root>
>    <SomeData />
>    <MoreData>2</MoreData>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>    {
>       "SomeData":null,
>       "MoreData":2
>    }
> ]
> {code}
> However, the following does *not* work as expected:
> *Test Case 3:*
> Input XML:
> {code:xml}
> <Root>
>    <DataArr>SomeData</DataArr>
>    <DataArr>
>       <Field>
>          <EmptyField/>
>       </Field>
>    </DataArr>
> </Root>
> {code}
> Output Json:
> {code:json}
> [
>    {
>       "DataArr":[
>          "SomeData"
>       ]
>    }
> ]
> {code}
> It is critical for the functioning of my Processor that Field and EmptyField 
> appear in this Json output for Test Case 3, and for all other inputs 
> analogous to this case.
> I have tried to supply a custom NiFi RecordSchema to the components and 
> verified it was being used, but I got the same results.
> Is there a way to configure these controllers such that this empty field is 
> not ignored, or is this a bug in the XMLReader component?
> You can get these results from running this processor as described on NiFi, 
> but you can also run this JUnit test with testXml swapped out with the 
> particular test case:
> {code:java}
> import org.apache.nifi.controller.ControllerService;
> import org.apache.nifi.json.JsonRecordSetWriter;
> import org.apache.nifi.processor.Relationship;
> import org.apache.nifi.processors.standard.ConvertRecord;
> import org.apache.nifi.reporting.InitializationException;
> import org.apache.nifi.util.MockFlowFile;
> import org.apache.nifi.util.TestRunner;
> import org.apache.nifi.util.TestRunners;
> import org.apache.nifi.xml.XMLReader;
> import org.junit.Test;
> public class TestNiFiMinimal {
>     @Test
>     public void testEmptyXMLGetsProcessed() throws InitializationException {
>         ConvertRecord convertRecord = new ConvertRecord();
>         TestRunner testRunner = TestRunners.newTestRunner(convertRecord);
>         ControllerService xmlReader = new XMLReader();
>         testRunner.addControllerService("xmlReader", xmlReader);
>         testRunner.enableControllerService(xmlReader);
>         testRunner.setProperty("record-reader", "xmlReader");
>         ControllerService jsonWriter = new JsonRecordSetWriter();
>         testRunner.addControllerService("jsonWriter", jsonWriter);
>         testRunner.enableControllerService(jsonWriter);
>         testRunner.setProperty("record-writer", "jsonWriter");
>         String testXml = "<?xml version='1.0' encoding='UTF-8'?><Root><DataArr>SomeData</DataArr><DataArr><Field><EmptyField/></Field></DataArr></Root>";
>         testRunner.enqueue(testXml);
>         testRunner.run();
>         Relationship success = convertRecord.getRelationships().stream()
>                 .filter(relationship -> relationship.getName().equals("success"))
>                 .findAny().get();
>         testRunner.assertAllFlowFilesTransferred(success);
>         final MockFlowFile original = testRunner.getFlowFilesForRelationship(success).get(0);
>         original.assertContentEquals("");
>     }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
