[ 
https://issues.apache.org/jira/browse/NIFI-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826255#comment-16826255
 ] 

Matt Burgess commented on NIFI-6241:
------------------------------------

I believe a root tag is required because the record-based processors are 
meant to work on flow files containing multiple records. The XMLReader 
currently expects a root tag even if the flow file contains only one record. 
It may be possible to relax this requirement in the single-record case.
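
To illustrate the multi-record layout described above, here is a minimal 
sketch in Python using only the standard library. It is not NiFi's XMLReader 
implementation; the element names (records, record, id, name) and the 
read_records helper are hypothetical and chosen only for illustration. It 
shows a root tag acting as a wrapper, with each child element read as one 
record.

```python
import xml.etree.ElementTree as ET

# Hypothetical multi-record document: the root tag <records> is only a wrapper,
# and each direct child element is treated as one record.
MULTI = """<records>
  <record><id>1</id><name>a</name></record>
  <record><id>2</id><name>b</name></record>
</records>"""


def read_records(xml_text):
    """Return one dict per child of the root element (child tag -> text)."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in rec} for rec in root]


records = read_records(MULTI)
print(records)
```

With this layout the reader never has to guess where one record ends and the 
next begins, which is one plausible reason for requiring the wrapper.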

As for the "missing" fields, it looks to me like no fields were inferred 
because those elements contain no explicit values, only self-closing tags with 
attributes. I think that's expected behavior until we revamp the schema system 
to support formats that carry metadata about the fields themselves (XML tag 
attributes, for example). What fields/values were you expecting? Perhaps we 
could add a property to extract attributes as fields.
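
As a rough sketch of the behavior described above (again plain Python, not 
NiFi code), the snippet below converts an element to a record in two modes. 
The sample XML, its element and attribute names, and the include_attributes 
flag are all assumptions made up for illustration; they are not the 
reporter's data or an existing XMLReader property. When only element text is 
considered, a self-closing tag with attributes yields a null value; when 
attributes are extracted as fields, the same tag yields a nested record.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: <point> carries its data only in attributes.
SAMPLE = """<track id="42">
  <point latitude="10.5" longitude="-20.25"/>
  <speed>450</speed>
</track>"""


def to_record(elem, include_attributes=False):
    """Convert an element to a dict of fields, or to its text if it has no fields.

    Sketch only: text on an element that also has fields is dropped.
    """
    fields = {}
    if include_attributes:
        # Treat each XML attribute as a field of the record.
        fields.update(elem.attrib)
    for child in elem:
        fields[child.tag] = to_record(child, include_attributes)
    if not fields:
        # Leaf element: its value is its text (None for a self-closing tag).
        return elem.text
    return fields


root = ET.fromstring(SAMPLE)
text_only = to_record(root)
with_attrs = to_record(root, include_attributes=True)
print(text_only)
print(with_attrs)
```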

> ConvertRecord Schema Inference fails to infer complete schema, or simply fails
> ------------------------------------------------------------------------------
>
>                 Key: NIFI-6241
>                 URL: https://issues.apache.org/jira/browse/NIFI-6241
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.9.2
>            Reporter: David Sargrad
>            Priority: Major
>         Attachments: Reproduce_ConvertRecord_Shortcoming.xml, 
> image-2019-04-24-13-38-16-605.png, image-2019-04-24-13-39-36-327.png, 
> image-2019-04-24-13-41-00-704.png, image-2019-04-24-13-41-26-860.png, 
> image-2019-04-24-13-43-28-531.png, image-2019-04-24-13-43-59-706.png, 
> image-2019-04-24-17-03-10-728.png, image-2019-04-25-09-13-52-416.png, 
> image-2019-04-25-09-19-15-406.png, image-2019-04-25-09-30-08-297.png
>
>
> I've got a simple test flow as depicted below:
>  
>  
> !image-2019-04-24-13-38-16-605.png!
>  
> The input XML is:
> !image-2019-04-24-13-41-26-860.png!
>  
> The output JSON is almost correct, yet it is missing two critical fields 
> (they both show up as "null"). The null fields are 
> {color:#ff0000}position{color} and {color:#ff0000}ncsmTrackData{color}. It is 
> also missing all of the attributes on fltdMessage.
>  
> !image-2019-04-24-13-41-00-704.png!
>  
> The configuration of my ConvertRecord is:
> !image-2019-04-24-13-43-28-531.png!
>  
> My XMLReader configuration is:
> !image-2019-04-24-13-43-59-706.png!
>  
>  Questions:
>  # Why are these two fields null? 
>  # Why are all the fltdMessage attributes being ignored?
> It would seem that this is a bug, or at least a major shortcoming, in the 
> schema inference capability. If there were a way for me to view the inferred 
> schema, then I could use that as a starting point. However, it's not clear 
> from the documentation how to view that schema.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
