The problem is that the XML Reader does not convert XML tag attributes and
field contents at the same time.
I tried the "Tags with Attributes" example from the XML Reader additional
details help page:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.7.0/org.apache.nifi.xml.XMLReader/additionalDetails.html
In this example I only get the content value, not the data from the XML tag
attribute. In my own dataset I do get the attribute data but not the
content value.
The flow is: GenerateFlowFile -> ConvertRecord (XMLReader -> JsonRecordSetWriter)
The GenerateFlowFile processor produces this text:
<records>
  <record>
    <field_with_attribute attr="attr_content">content of field</field_with_attribute>
  </record>
</records>
XMLReader Settings:
Schema Access Strategy: Use 'Schema Text' Property
Schema Text:
{ "name": "test",
  "namespace": "nifi",
  "type": "record",
  "fields": [
    { "name": "field_with_attribute",
      "type":
        { "name": "RecordForTag",
          "type": "record",
          "fields": [
            { "name": "attr", "type": "string" },
            { "name": "field_name_for_content", "type": "string" }
          ]
        }
    }
  ]
}
Expect Records as Array: true
Attribute Prefix: prefix_
Field Name for Content: field_name_for_content
The JSON Output is the following:
{
  "field_with_attribute" : {
    "attr" : null,
    "field_name_for_content" : "content of field"
  }
}
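One thing that might explain the null: if the reader prepends the Attribute Prefix to each attribute name before matching names against the schema, then with the prefix set to prefix_ the attribute would arrive as prefix_attr and the schema field attr would find nothing. That is an assumption on my part, not something I have confirmed in the NiFi source. The sketch below is plain Python that only imitates that behaviour; read_element, project and convert are hypothetical helpers, not NiFi APIs.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = """<records>
  <record>
    <field_with_attribute attr="attr_content">content of field</field_with_attribute>
  </record>
</records>"""


def read_element(elem, attribute_prefix, content_field):
    """Hypothetical stand-in for mapping one XML element to a record.

    Assumption: every attribute name gets the prefix prepended, and the
    element text is stored under the configured content field name.
    """
    record = {attribute_prefix + name: value for name, value in elem.attrib.items()}
    record[content_field] = (elem.text or "").strip()
    return record


def project(record, schema_fields):
    """Keep only the names the schema declares; unmatched names become None."""
    return {name: record.get(name) for name in schema_fields}


def convert(schema_fields, attribute_prefix="prefix_",
            content_field="field_name_for_content"):
    """Read the sample XML and project the tag onto the given schema fields."""
    root = ET.fromstring(XML_TEXT)
    elem = root.find("./record/field_with_attribute")
    raw = read_element(elem, attribute_prefix, content_field)
    return {"field_with_attribute": project(raw, schema_fields)}


if __name__ == "__main__":
    # Schema as posted: the field is called "attr".
    print(json.dumps(convert(["attr", "field_name_for_content"]), indent=2))
    # Schema field renamed to include the prefix.
    print(json.dumps(convert(["prefix_attr", "field_name_for_content"]), indent=2))
```

Under that assumption, the first call gives the same shape as my output above (attr is null), while the second call fills in the attribute value. Clearing the Attribute Prefix, or renaming the schema field to prefix_attr, would be the two things to try in the real flow.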
Kind Regards
Jens M. Kofoed
On Wed, 28 Jul 2021 at 19:06, Etienne Jouvin <
[email protected]> wrote:
> Hello.
>
> What you can do is write the expected JSON by hand.
> Then run it through a ConvertRecord processor with a JSON tree reader and a
> writer. On the writer, specify in its properties that you want the schema
> written out.
> That way, you will have the schema you want.
>
> Etienne
>
>
> On Wed, 28 Jul 2021 at 14:51, Jens M. Kofoed <[email protected]> wrote:
>
>> If I use the following schema:
>> {
>>   "name": "object",
>>   "type": ["null", {
>>     "name": "objectRecord",
>>     "type": "record",
>>     "fields": [
>>       { "name": "objectDetails",
>>         "type": {
>>           "name": "objectDetails",
>>           "type": "record",
>>           "fields": [
>>             { "name": "additionalInfo",
>>               "type": {
>>                 "type": "array",
>>                 "items": {
>>                   "name": "additionalInfo",
>>                   "type": "record",
>>                   "fields": [
>>                     { "name": "name", "type": "string" },
>>                     { "name": "value", "type": ["null", "string", "int", "boolean"] }
>>                   ]
>>                 }
>>               }
>>             }
>>           ]
>>         }
>>       },
>>       { "name": "objectIdentification",
>>         "type": ["null", {
>>           "name": "objectIdentificationRecord",
>>           "type": "record",
>>           "fields": [
>>             { "name": "objectId", "type": ["null", "int"] },
>>             { "name": "objectType", "type": ["null", "string"] }
>>           ]
>>         }]
>>       }
>>     ]
>>   }]
>> }
>>
>> It is almost there. All I'm missing is the value for the tagX fields.
>>
>> Please help
>>
>> regards
>> Jens M. Kofoed
>>
>