The problem is that the XMLReader does not convert a tag's attributes and its
content at the same time.
I tried the "Tags with Attributes" example from the XMLReader additional
details help page:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.7.0/org.apache.nifi.xml.XMLReader/additionalDetails.html
In this example I don't get the data from the XML tag attribute, only the
content value. In my own dataset I do get the XML tag attribute data but not
the content value.

Flow: GenerateFlowFile -> ConvertRecord (XMLReader -> JsonRecordSetWriter)

A GenerateFlowFile processor with this text:
    <records>
        <record>
            <field_with_attribute attr="attr_content">content of field</field_with_attribute>
        </record>
    </records>

XMLReader Settings:
Schema Access Strategy: Use 'Schema Text' Property
Schema Text:
    {   "name": "test",
        "namespace": "nifi",
        "type": "record",
        "fields": [
            {   "name": "field_with_attribute",
                "type":
                {   "name": "RecordForTag",
                    "type": "record",
                    "fields" : [
                        { "name": "attr", "type": "string"},
                        {"name": "field_name_for_content", "type": "string"}
                      ]
                  }
              }
          ]
      }
Expect Records as Array: true
Attribute Prefix:        prefix_
Field Name for Content:  field_name_for_content

The JSON Output is the following:
{
  "field_with_attribute" : {
    "attr" : null,
    "field_name_for_content" : "content of field"
  }
}
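
One possible reading of the null, sketched in Python rather than taken from NiFi itself: this assumes the reader prepends the Attribute Prefix to each attribute name before matching it against the schema, so with the prefix `prefix_` the attribute would be looked up as `prefix_attr`, not `attr`. The helper names `element_to_raw_fields` and `apply_schema` are hypothetical and only mimic that assumed behavior.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<records>
    <record>
        <field_with_attribute attr="attr_content">content of field</field_with_attribute>
    </record>
</records>"""

# Values copied from the XMLReader settings above.
ATTRIBUTE_PREFIX = "prefix_"
CONTENT_FIELD = "field_name_for_content"


def element_to_raw_fields(element, attribute_prefix, content_field):
    """Map one tag to field names: prefixed attributes plus the text content.

    Assumption: the prefix is prepended to every attribute name.
    """
    fields = {attribute_prefix + name: value for name, value in element.attrib.items()}
    text = (element.text or "").strip()
    if text:
        fields[content_field] = text
    return fields


def apply_schema(raw_fields, schema_field_names):
    """Keep only the schema's fields; a field the data lacks becomes None."""
    return {name: raw_fields.get(name) for name in schema_field_names}


root = ET.fromstring(XML_TEXT)
tag = root.find("./record/field_with_attribute")
raw = element_to_raw_fields(tag, ATTRIBUTE_PREFIX, CONTENT_FIELD)

# Schema as posted: the inner record names the field "attr".
as_posted = apply_schema(raw, ["attr", "field_name_for_content"])
# Variant: the inner record names the field "prefix_attr".
with_prefix = apply_schema(raw, ["prefix_attr", "field_name_for_content"])

print(as_posted)
print(with_prefix)
```

Under that assumption, the schema as posted reproduces the null shown above, while naming the field `prefix_attr` (or clearing the Attribute Prefix) would pick up the attribute value. Whether the real XMLReader behaves this way would need to be checked against the help page.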

Kind Regards
Jens M. Kofoed

On Wed, 28 Jul 2021 at 19:06, Etienne Jouvin <lapinoujou...@gmail.com> wrote:

> Hello.
>
> What you can do is write the expected JSON.
> Then run it through a ConvertRecord processor with a JSON tree reader and a
> writer. On the writer, specify that you want the schema written out in a
> property.
> That way, you will have the schema you want.
>
> Etienne
>
>
> On Wed, 28 Jul 2021 at 14:51, Jens M. Kofoed <jmkofoed....@gmail.com> wrote:
>
>> If I use the following schema:
>> { "name":"object","type":["null",{
>> "name":"objectRecord","type":"record","fields":[{ "name":
>> "objectDetails","type": {"name": "objectDetails","type": "record","fields":
>> [{ "name": "additionalInfo","type": {"type": "array","items": {"name":
>> "additionalInfo","type": "record","fields": [{"name": "name","type":
>> "string"},{"name":
>> "value","type":["null","string","int","boolean"]}]}}}]}},{
>> "name":"objectIdentification","type":["null",{
>> "name":"objectIdentificationRecord","type":"record","fields":[{"name":"objectId","type":["null","int"]},{"name":"objectType","type":["null","string"]}]}]}]}]}
>>
>> It is almost there. All I'm missing is the value for the tagX fields.
>>
>> Please help
>>
>> regards
>> Jens M. Kofoed
>>
>
