Avro supplies the "aliases" keyword.
So for the example below, the following schema works for the namespaced attributes:
{
  "type":"record",
  "name":"example",
  "fields": [
    {"name":"attr1","type":"int" },
    {"name":"attr2","type":"int" },
    {"name":"attr3","type":"int" },
    {
      "name":"third",
      "type": {"type":"record","name":"thirdType","fields": [
        {"name":"att1","type":"string","aliases": ["{urn:us:gov:ic:ism:v2}att1" ] },
        {"name":"att2","type":"string","aliases": ["{urn:us:gov:ic:ism:v2}att2" ] },
        {"name":"att3","type":"string","aliases": ["{urn:us:gov:ic:ism:v2}att3" ] }
      ] }
    }
  ]
}
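The aliases above use the {namespace-uri}local-name form for the attribute names. As a rough illustration of that notation only, here is a small sketch using Python's standard xml.etree.ElementTree, which reports namespaced attributes under the same expanded names. This is not NiFi code, and I am not claiming the XMLReader resolves names the same way internally; the helper name expanded_attribute_names and the self-closed <third/> tag are my own assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Sample document modelled on the one in this thread. The <third> element is
# self-closed here (an assumption) so that the XML is well-formed.
SAMPLE = """<root xmlns:ICISM="urn:us:gov:ic:ism:v2">
  <data attr1="val" attr2="val" attr3="val">
    <third ICISM:att1="cannot_get_val" ICISM:att2="cannot_get_val"
           ICISM:att3="cannot_get_val"/>
  </data>
</root>"""


def expanded_attribute_names(xml_text, tag):
    """Return the sorted attribute names of the first <tag> element.

    ElementTree expands a prefixed attribute such as ICISM:att1 into
    "{namespace-uri}att1", replacing the prefix with the URI it is bound to.
    """
    root = ET.fromstring(xml_text)
    element = root.find(f".//{tag}")
    return sorted(element.attrib)


if __name__ == "__main__":
    # Attributes without a prefix keep their plain names.
    print(expanded_attribute_names(SAMPLE, "data"))
    # Prefixed attributes appear under their expanded names.
    print(expanded_attribute_names(SAMPLE, "third"))
```

The point is just that the prefix (ICISM) is not part of the name; the URI is, so the alias has to carry the URI rather than the prefix.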
On 9/19/22 17:07, Andrew McDonald wrote:
Somehow the formatting got squished for my `third` level
<root xmlns:ICISM="urn:us:gov:ic:ism:v2">
  <data attr1="val" attr2="val" attr3="val">
    <third ICISM:att1="cannot_get_val" ICISM:att2="cannot_get_val"
           ICISM:att3="cannot_get_val"/>
  </data>
</root>
And sorry D. Palmatier, I see you wrote 3rd level but meant 4th level
by the example you provided. And I don't know if 4th level is possible.
Regards, Andrew
On 9/19/22 16:56, Andrew McDonald wrote:
Yes, you can get the 3rd level fields, at least with 1.12.1 I have
been able to.
The TestXMLReader uses:
https://github.com/apache/nifi/blob/rel/nifi-1.12.1/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/resources/xml/people.xml
With the schema,
https://github.com/apache/nifi/blob/rel/nifi-1.12.1/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/resources/xml/testschema
What I've just found out, like minutes ago, and would like some help
with, is how to deal with namespaced attributes on the 3rd level.
For my situation
<root xmlns:ICISM="urn:us:gov:ic:ism:v2">
  <data attr1="val" attr2="val" attr3="val">
    <third ICISM:att1="cannot_get_val"
           ICISM:att2="cannot_get_val" ICISM:att3="cannot_get_val"/>
  </data>
</root>
The namespace-decorated attributes in the third tag are not being
populated. In my test XML, if I remove the namespacing from
att{1,2,3} then the JSON data is populated.
I do see a people_namespace.xml that is used in the
TestXMLRecordReader but that is only for tags.
I'm hoping there is a patch I could apply to 1.12.1 b/c we are bound
to this version for a while.
Regards, Andrew
On 8/31/22 12:33, D. Palmatier wrote:
Hello.
I'm trying to query the records within a large, ~15GB, XML file. The
format of the file is:
<xmlfeed version="1" generated="2022-08-11 13:00:00">
  <records>
    <record>
      <field1></field1>
      <field2></field2>
    </record>
    <record>
      <field1></field1>
      <field2></field2>
    </record>
  </records>
</xmlfeed>
Unfortunately the records I want to query are at the third level and
the XMLReader expects records at the second level.
I don't have any control over the format of the source file. Is
there a way I can get to these inner records for my queries without
having to load the entire file?
Thank you for your time.
David