Stephen Jeffrey Hindmarch created NIFI-13334:
------------------------------------------------

             Summary: XMLReader drops name-value content tags from record 
arrays if one record has only one tag
                 Key: NIFI-13334
                 URL: https://issues.apache.org/jira/browse/NIFI-13334
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core UI
    Affects Versions: 1.24.0
         Environment: Docker
            Reporter: Stephen Jeffrey Hindmarch


Create an XMLReader controller service and set the following properties:
 * Parse XML Attributes = true
 * Expect Records as Arrays = true
 * Field Name for Content = Value

Then use the reader in a ConvertRecord processor with a JsonRecordSetWriter.

When parsing a flow file such as
{noformat}
<Events>
  <Event Type="foo">
    <UserData>
      <Data Name="Param1">String1</Data>
      <Data Name="Param2">String2</Data>
    </UserData>
  </Event>
  <Event Type="bar">
    <UserData>
      <Data Name="Param1">String</Data>
      <Data Name="Param2">String2</Data>
      <Data Name="Param3">String3</Data>
    </UserData>
  </Event>
</Events>{noformat}
Then, as expected, the content tags are parsed into arrays:
{noformat}
[ {
  "Type" : "foo",
  "UserData" : {
    "Data" : [ {
      "Name" : "Param1",
      "Value" : "String1"
    }, {
      "Name" : "Param2",
      "Value" : "String2"
    } ]
  }
}, {
  "Type" : "bar",
  "UserData" : {
    "Data" : [ {
      "Name" : "Param1",
      "Value" : "String"
    }, {
      "Name" : "Param2",
      "Value" : "String2"
    }, {
      "Name" : "Param3",
      "Value" : "String3"
    } ]
  }
} ]{noformat}
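
To make the expected shape concrete, here is a minimal Python sketch of the conversion I would expect. It is not NiFi code and does not reflect how XMLReader is implemented; the function names are hypothetical, and it assumes a tag should become an array everywhere if it repeats under any single parent anywhere in the flow file (matched by tag name only, which is a simplification).

```python
import json
import xml.etree.ElementTree as ET
from collections import Counter

CONTENT_FIELD = "Value"  # mirrors "Field Name for Content = Value"

SAMPLE_XML = """<Events>
  <Event Type="foo">
    <UserData>
      <Data Name="Param1">String1</Data>
    </UserData>
  </Event>
  <Event Type="bar">
    <UserData>
      <Data Name="Param1">String</Data>
      <Data Name="Param2">String2</Data>
      <Data Name="Param3">String3</Data>
    </UserData>
  </Event>
</Events>"""


def repeated_tags(root):
    """First pass: collect tag names that occur more than once under any one parent."""
    repeated = set()
    for parent in root.iter():
        counts = Counter(child.tag for child in parent)
        repeated.update(tag for tag, n in counts.items() if n > 1)
    return repeated


def to_record(elem, repeated):
    """Second pass: attributes become fields, text content goes to CONTENT_FIELD."""
    record = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        record[CONTENT_FIELD] = text
    for child in elem:
        value = to_record(child, repeated)
        if child.tag in repeated:
            # Always an array, even when this particular parent has a single child.
            record.setdefault(child.tag, []).append(value)
        else:
            record[child.tag] = value
    return record


def convert(xml_text):
    """Treat each child of the root as one record (as with Expect Records as Arrays)."""
    root = ET.fromstring(xml_text)
    repeated = repeated_tags(root)
    return [to_record(event, repeated) for event in root]


if __name__ == "__main__":
    print(json.dumps(convert(SAMPLE_XML), indent=2))
```

With this approach an event that has a single Data tag still gets a one-element array, and no Data tags are lost from the other events.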
But if one of the records has only one Data tag, that tag is not presented as 
an array and, more importantly, neither are the Data tags of the other record. 
Instead, all but the last Data tag of the other record are dropped.

For example:
{noformat}
<Events>
  <Event Type="foo">
    <UserData>
      <Data Name="Param1">String1</Data>
    </UserData>
  </Event>
  <Event Type="bar">
    <UserData>
      <Data Name="Param1">String</Data>
      <Data Name="Param2">String2</Data>
      <Data Name="Param3">String3</Data>
    </UserData>
  </Event>
</Events>{noformat}
parses to
{noformat}
[ {
  "Type" : "foo",
  "UserData" : {
    "Data" : {
      "Name" : "Param1",
      "Value" : "String1"
    }
  }
}, {
  "Type" : "bar",
  "UserData" : {
    "Data" : {
      "Name" : "Param3",
      "Value" : "String3"
    }
  }
} ]{noformat}
Note that the second event has lost all but the last of its data content tags.

It does not matter which event (first or second) has only one tag; the other 
event loses content.
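
For anyone who wants to check whether their own flow files are affected, a rough Python sketch like the following compares the number of Data tags per Event in the input XML with the number of Data entries per record in the JSON output. The function names are hypothetical, and the element paths (Event, UserData, Data) are taken from the example above, so they would need adjusting for other documents.

```python
import json
import xml.etree.ElementTree as ET


def data_counts_from_xml(xml_text):
    """Number of Data tags under UserData for each Event in the input."""
    root = ET.fromstring(xml_text)
    return [len(event.findall("./UserData/Data")) for event in root.findall("Event")]


def data_counts_from_json(json_text):
    """Number of Data entries for each record in the converted output."""
    counts = []
    for record in json.loads(json_text):
        data = (record.get("UserData") or {}).get("Data")
        if data is None:
            counts.append(0)
        elif isinstance(data, list):
            counts.append(len(data))
        else:
            counts.append(1)  # a single object instead of an array
    return counts


def find_lossy_records(xml_text, json_text):
    """Return (record index, tags in XML, entries in JSON) wherever the counts differ."""
    expected = data_counts_from_xml(xml_text)
    actual = data_counts_from_json(json_text)
    return [(i, e, a) for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
```

Run against the input and output shown above, this would flag the second record, which has three Data tags in the XML but only one Data entry in the JSON.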


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
