[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448686#comment-16448686 ] ASF GitHub Bot commented on NIFI-4185: -- Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2587 > Add XML record reader & writer services > --- > > Key: NIFI-4185 > URL: https://issues.apache.org/jira/browse/NIFI-4185 > Project: Apache NiFi > Issue Type: New Feature > Components: Extensions >Affects Versions: 1.3.0 >Reporter: Andy LoPresto >Assignee: Johannes Peter >Priority: Major > Labels: json, records, xml > Fix For: 1.7.0 > > Attachments: 0001-NIFI-4185-Minor-tweaks.patch > > > With the addition of the {{RecordReader}} and {{RecordSetWriter}} paradigm, > XML conversion has not yet been targeted. This will replace the previous > ticket for XML to JSON conversion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448684#comment-16448684 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2587 This is now merged to master. Thanks again for the contribution!
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448682#comment-16448682 ] ASF subversion and git services commented on NIFI-4185: --- Commit d21bd3870b1edd107bc6ef12bc5f12820a10bebb in nifi's branch refs/heads/master from [~jope] [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=d21bd38 ] NIFI-4185: Add XML Record Reader
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448683#comment-16448683 ] ASF subversion and git services commented on NIFI-4185: --- Commit 0e736f59fdc29db471cecb4c7a885bf378e9ae71 in nifi's branch refs/heads/master from [~markap14] [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=0e736f5 ] NIFI-4185: Minor tweaks to XML Record Reader, around documentation and error handling This closes #2587.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448663#comment-16448663 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2587 @JohannesDaniel that would be great! I think the Record attribute stuff will significantly improve how we are able to handle XML-based records. But I think the approach that you've taken here is a great first step and will be very beneficial to the community. I'd recommend going ahead with the XML Writer and then we can work in the attribute-related stuff later.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448620#comment-16448620 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r183489279
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java ---
@@ -84,6 +84,10 @@ public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, Str
         try {
             final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
+
+            // Avoid namespace replacements
+            xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
--- End diff --
@tballison Thank you for the advice! I refactored this in a way that only the local part is considered. :)
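The "only the local part is considered" approach mentioned in the comment above could look roughly like the sketch below. It is an illustration only, not the code from the pull request: the class `LocalNameDemo` and the method `localNames` are hypothetical names, and only the StAX calls (`XMLInputFactory`, `XMLEventReader`, `StartElement.getName().getLocalPart()`) are standard JDK API. The factory is left namespace-aware and prefixes are simply ignored when element names are compared.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of NiFi.
final class LocalNameDemo {

    // Collects the local part of every start tag, ignoring prefixes and namespace URIs.
    static List<String> localNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            final XMLInputFactory factory = XMLInputFactory.newInstance();
            final XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                final XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    // getName() returns a QName; getLocalPart() drops the prefix.
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(localNames("<ns:root xmlns:ns=\"urn:example\"><ns:record/></ns:root>"));
    }
}
```

With this style of matching, a schema field named `record` would match both `<record>` and `<ns:record>`, which is presumably the intent of the refactoring.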
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448612#comment-16448612 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587 @markap14 thanks for the help with the patch file. I am fine with it :) If you have not planned to do that yourself, I would start implementing an XMLWriter as soon as this has been merged. Or should we first discuss your plan for the Record attributes you mentioned above?
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448494#comment-16448494 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2587 @JohannesDaniel this is great! I've been doing a good bit of testing to ensure that everything works as expected. I had just a few more comments, mostly around the descriptions in the property descriptor and the handling of invalid values for the "Expect Records as Array" property. To make it a little easier to understand, I'm just going to attach a PATCH file to the JIRA. Can you please look over the patch file? If you're okay with the patch file, then I'm ready to merge. Thanks again, you've put in a lot of work here to make this consistent with the current approaches and to provide a huge new capability for many users!
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448129#comment-16448129 ] ASF GitHub Bot commented on NIFI-4185: -- Github user tballison commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r183390424
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java ---
@@ -84,6 +84,10 @@ public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, Str
         try {
             final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
+
+            // Avoid namespace replacements
+            xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
--- End diff --
Might want to avoid the XXE vulnerability via improved configuration of XMLInputFactory? https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Prevention_Cheat_Sheet#XMLInputFactory_.28a_StAX_parser.29
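As a rough illustration of the hardening suggested above, the sketch below switches off DTD support and external entity resolution on a StAX factory, which is the configuration the linked OWASP cheat sheet recommends for `XMLInputFactory`. The class `HardenedStaxDemo` and its methods are hypothetical names made up for this example; whether the merged reader uses exactly these settings is not stated in this thread. Only `XMLInputFactory.SUPPORT_DTD` and `XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES` are standard JDK property names.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of NiFi.
final class HardenedStaxDemo {

    // Builds a factory with DTD processing and external entity resolution switched off.
    static XMLInputFactory hardenedFactory() {
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    // Concatenates all character data that the parser hands out.
    static String readText(XMLInputFactory factory, String xml) {
        final StringBuilder text = new StringBuilder();
        try {
            final XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } catch (XMLStreamException e) {
            // A parser may reject the now-undeclared entity instead of skipping it;
            // either way nothing from the external file has been read.
        }
        return text.toString();
    }

    // Writes a marker to a temp file, references it through an external entity, and parses the document.
    static String probe(String marker) {
        try {
            final Path secret = Files.createTempFile("xxe-demo", ".txt");
            try {
                Files.write(secret, marker.getBytes(StandardCharsets.UTF_8));
                final String xml = "<!DOCTYPE r [<!ENTITY xxe SYSTEM \"" + secret.toUri() + "\">]><r>&xxe;</r>";
                return readText(hardenedFactory(), xml);
            } finally {
                Files.deleteIfExists(secret);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints whether the file content leaked into the parsed text.
        System.out.println(probe("TOP-SECRET").contains("TOP-SECRET"));
    }
}
```

The probe deliberately does not distinguish between "entity silently ignored" and "document rejected", since that detail depends on the StAX implementation in use.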
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447249#comment-16447249 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587
@markap14
- Added EL for record format property
- Removed record tag validation
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446364#comment-16446364 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587
@markap14 thank you for the response. I will simply remove the record tag validation, as there are indeed many ways to do that before the data is processed by this reader.
There is one small corner case I need to discuss. Assume we have the following data:
```
value1...
```
If the reader is used with (coerce==true), the field "map_field" can be parsed by defining a map in the schema. The embedded key fields do not have to be defined; their values only have to be of the type defined for the map.
If the reader is used with (coerce==false && dropUnknown==true), the reader will parse all fields that exist in the schema, ignoring their types. However, the data above will not be parsable even if the map exists in the schema. In this case, the reader identifies "map_field" as a field that exists in the schema, but it is not aware that the field is of type map. Therefore, the reader will not parse the embedded key fields, as they don't exist in the schema. The field "map_field" will be classified as an empty field and not added to the record.
Furthermore, even if the reader is used with (coerce==false && dropUnknown==true), it will be type-aware to some extent. The reader first checks whether fields exist in the schema. If that is the case, the reader additionally checks whether they are of type record (or of type array embedding records, respectively). If that is also the case, the reader retrieves the subschema so that it can check whether subtags of the current tag are known.
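The corner case described above can be reproduced with a toy model. The sketch below is not the reader's code: `CoerceDropModel`, `Field`, `Element`, and `read` are hypothetical stand-ins, unknown fields are always dropped (that is, dropUnknown is fixed to true), and arrays are left out. It only mirrors the rules as stated in the comment: record-typed fields are always followed into their subschema, map-typed fields are only recognised when coercion is on, and an element whose children are all unknown counts as empty and is not added.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the described behaviour; all names are hypothetical.
final class CoerceDropModel {

    enum Kind { STRING, MAP, RECORD }

    // Stand-in for a schema field: a kind plus a sub-schema (used for RECORD only).
    static final class Field {
        final Kind kind;
        final Map<String, Field> subSchema;

        Field(Kind kind, Map<String, Field> subSchema) {
            this.kind = kind;
            this.subSchema = subSchema;
        }
    }

    // Stand-in for a parsed XML element with either text or child elements.
    static final class Element {
        final String name;
        final String text;
        final List<Element> children;

        Element(String name, String text, Element... children) {
            this.name = name;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Reads one record element against a schema; fields missing from the schema are dropped.
    static Map<String, Object> read(Element record, Map<String, Field> schema, boolean coerce) {
        final Map<String, Object> result = new LinkedHashMap<>();
        for (Element child : record.children) {
            final Field field = schema.get(child.name);
            if (field == null) {
                continue; // unknown field: dropped
            }
            Object value = null;
            if (field.kind == Kind.RECORD) {
                // Record types are honoured in both modes: descend with the sub-schema.
                value = read(child, field.subSchema, coerce);
            } else if (coerce && field.kind == Kind.MAP) {
                // With coercion the map type is known, so embedded keys need no schema entry.
                final Map<String, Object> map = new LinkedHashMap<>();
                for (Element entry : child.children) {
                    map.put(entry.name, entry.text);
                }
                value = map;
            } else if (child.children.isEmpty()) {
                value = child.text; // simple value, type ignored without coercion
            }
            // Otherwise the sub-tags have no schema to match against, so the field stays empty.
            if (value != null) {
                result.put(child.name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        final Map<String, Field> schema = Collections.singletonMap("map_field", new Field(Kind.MAP, null));
        final Element record = new Element("record", null,
                new Element("map_field", null, new Element("key1", "value1")));
        System.out.println(read(record, schema, true));
        System.out.println(read(record, schema, false));
    }
}
```

In this model the first call keeps `map_field` with its embedded key, while the second call returns an empty record, which is the asymmetry the comment points out.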
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446291#comment-16446291 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2587
@JohannesDaniel thanks for the update! I commented above re: the use of Expression Language in the property descriptor. I do still feel like the check for 'record tag names' is unnecessary, as the reader should not be responsible for filtering the data but rather just for reading it. There already exist mechanisms for filtering the data (just off the top of my head, you could use PartitionRecord + RouteOnAttribute, ValidateRecord, or QueryRecord to achieve this).
Additionally, we have the Schema for the Record Reader. So if the element name matches the top-level Schema name (or one of them, if the top-level field is a UNION/CHOICE element), then we could use that. So, with your example above, if you only want to read the `<record>` part, your schema should indicate that the top-level field name is `record`. In that case, it should filter out the `other` record. Does that make sense?
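The schema-name idea from the comment above can be sketched with plain StAX. This is an assumption-laden illustration, not the reader's implementation: `RecordTagFilterDemo` and `readRecords` are hypothetical names, the "top-level schema name" is passed in as a plain string, and records are assumed to be flat (each field holds text only). Second-level elements whose local name differs from the expected record name are skipped together with their children.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of NiFi.
final class RecordTagFilterDemo {

    // Returns one map per second-level element named recordName; flat records only.
    static List<Map<String, String>> readRecords(String xml, String recordName) {
        final List<Map<String, String>> records = new ArrayList<>();
        try {
            final XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int depth = 0;
            Map<String, String> current = null; // non-null only inside a matching record
            while (reader.hasNext()) {
                final int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2 && recordName.equals(reader.getLocalName())) {
                        current = new LinkedHashMap<>();
                    } else if (depth == 3 && current != null) {
                        final String field = reader.getLocalName();
                        // getElementText() consumes the field up to its end tag.
                        current.put(field, reader.getElementText());
                        depth--;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth == 2 && current != null) {
                        records.add(current);
                        current = null;
                    }
                    depth--;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return records;
    }

    public static void main(String[] args) {
        final String xml = "<root><record><id>1</id></record><other><id>2</id></other>"
                + "<record><id>3</id></record></root>";
        System.out.println(readRecords(xml, "record"));
    }
}
```

In the sample document only the two `record` elements are returned; the `other` element never produces a record, so no separate tag-validation property is needed for this case.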
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446275#comment-16446275 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r183152900
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
--- End diff --
Yes, exactly. I would use a property to convey this. I would be okay with allowing Expression Language to be used, or just allowing for a 'true'/'false' without Expression Language (I think in most cases, you'll want one or the other, not dependent upon each individual FlowFile). But if you think EL is important then I won't argue that point :)
One other option, which we use in a few different processors, would be to offer a third option that looks at a well-known attribute. So you could choose 'true' (treat outer element as a wrapper), 'false' (treat each flowfile as a single record), or 'use xml.stream attribute'. When the last one is selected, the 'xml.stream' attribute would be looked at to determine how to handle the FlowFile: a value of 'true' would mean it's a stream of multiple records, 'false' would mean it's only 1 record, and a missing or any other value would throw an Exception. I don't have a strong preference one way or another how this should be handled, but wanted to present the options that we typically use.
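The three-way option described above could be resolved along the following lines. This is a sketch in plain Java with no NiFi dependencies: `RecordFormatResolver`, `Format`, and `resolve` are hypothetical names, the FlowFile attributes are modelled as a simple `Map`, and the choice of `IllegalArgumentException` is an assumption (the comment only says that an Exception would be thrown).

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper; not part of NiFi.
final class RecordFormatResolver {

    enum Format { ARRAY, SINGLE }

    static final String ATTRIBUTE_NAME = "xml.stream";
    static final String USE_ATTRIBUTE = "use xml.stream attribute";

    // Resolves the configured property value, consulting the attribute only when asked to.
    static Format resolve(String propertyValue, Map<String, String> attributes) {
        if (USE_ATTRIBUTE.equals(propertyValue)) {
            return parse(attributes.get(ATTRIBUTE_NAME), "attribute '" + ATTRIBUTE_NAME + "'");
        }
        return parse(propertyValue, "property value");
    }

    // 'true' means the outer element wraps many records, 'false' means a single record.
    private static Format parse(String value, String source) {
        if ("true".equals(value)) {
            return Format.ARRAY;
        }
        if ("false".equals(value)) {
            return Format.SINGLE;
        }
        // Missing or any other value is rejected, as described in the comment above.
        throw new IllegalArgumentException("Expected 'true' or 'false' for " + source + " but got: " + value);
    }

    public static void main(String[] args) {
        System.out.println(resolve("true", Collections.<String, String>emptyMap()));
        System.out.println(resolve(USE_ATTRIBUTE, Collections.singletonMap(ATTRIBUTE_NAME, "false")));
    }
}
```

Keeping the attribute lookup behind an explicit third option means the common case stays a fixed setting, and per-FlowFile behaviour is opt-in.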
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444908#comment-16444908 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587
@markap14 @pvillard31
- I refactored some code, as the cases (coerce==true && drop==false) and (coerce==false && drop==true) showed unexpected behavior in some situations
- Data like contentcontentcontent now can be parsed
- Maps (e. g. value1value2) are now supported
- The reader is now able to parse single records (e. g. ) as well as arrays of records (e. g. ). I added a property to make it configurable whether the reader shall expect a single record or an array. One question: as there are only two options for this, I defined AllowableValues for this property. Despite that, I think it would be reasonable to enable EL for this property. But how can this be realized?
- I removed the root validation, but kept the check for record tag names in order to support processing data like this (tag "other" will be ignored if the check for the record tag name is activated)
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439880#comment-16439880 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181851731
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+    private final ComponentLog logger;
+    private final RecordSchema schema;
+    private final String recordName;
+    private final String attributePrefix;
+    private final String contentFieldName;
+
+    // thread safety required?
+    private StartElement currentRecordStartTag;
+
+    private final XMLEventReader xmlEventReader;
+
+    private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+    private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+    private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+    public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName,
+                           final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException {
+        this.schema = schema;
+        this.recordName = recordName;
+        this.attributePrefix = attributePrefix;
+        this.contentFieldName = contentFieldName;
+        this.logger = logger;
+
+        final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat);
+        final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat);
+        final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat);
+
+        LAZY_DATE_FORMAT = () -> df;
+        LAZY_TIME_FORMAT = () -> tf;
+        LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+        try {
+            final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
+
+            // Avoid namespace replacements
+            xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+            xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+            final StartElement rootTag =
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439106#comment-16439106 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181653767
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+    private final ComponentLog logger;
+    private final RecordSchema schema;
+    private final String recordName;
+    private final String attributePrefix;
+    private final String contentFieldName;
+
+    // thread safety required?
+    private StartElement currentRecordStartTag;
+
+    private final XMLEventReader xmlEventReader;
+
+    private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+    private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+    private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+    public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName,
+                           final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException {
+        this.schema = schema;
+        this.recordName = recordName;
+        this.attributePrefix = attributePrefix;
+        this.contentFieldName = contentFieldName;
+        this.logger = logger;
+
+        final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat);
+        final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat);
+        final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat);
+
+        LAZY_DATE_FORMAT = () -> df;
+        LAZY_TIME_FORMAT = () -> tf;
+        LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+        try {
+            final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
+
+            // Avoid namespace replacements
+            xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+            xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+            final StartElement rootTag =
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436305#comment-16436305 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181221789 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + +"XML data, embedded in an enclosing root tag.") +public class XMLReader extends SchemaRegistryService implements RecordReaderFactory { + +public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder() +.name("validate_root_tag") +.displayName("Validate Root Tag") +.description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " + +"of respective Processors.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder() --- End diff -- (non-record shall be skipped)
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436293#comment-16436293 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587 @markap14 Thank you for the comprehensive review. I will start refactoring the implementations, beginning with the suggested improvements that are clear.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436291#comment-16436291 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181218609 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag =
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436286#comment-16436286 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181218182 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag =
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436285#comment-16436285 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181217494 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag =
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436281#comment-16436281 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181215801 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); --- End diff -- ok, I will activate namespaces and implement some tests for this.
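To illustrate what the IS_NAMESPACE_AWARE flag discussed above changes, here is a minimal sketch, not taken from the pull request. The class name NamespaceAwareSketch, the helper describeFirstElement and the sample XML are hypothetical; only the javax.xml.stream calls are standard StAX. With namespace processing enabled, the prefix, local part and namespace URI of an element are reported separately; how names are reported with the flag disabled depends on the StAX implementation, so the sketch only prints that case.

```java
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the NiFi code base.
final class NamespaceAwareSketch {

    // Returns "prefix|localPart|namespaceURI" for the first start element of the document.
    static String describeFirstElement(String xml, boolean namespaceAware) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, namespaceAware);
            XMLEventReader reader = factory.createXMLEventReader(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        QName name = event.asStartElement().getName();
                        return name.getPrefix() + "|" + name.getLocalPart() + "|" + name.getNamespaceURI();
                    }
                }
            } finally {
                reader.close();
            }
            throw new IllegalStateException("no start element found");
        } catch (XMLStreamException e) {
            // Wrapped so that callers of this sketch need no checked exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ns:record xmlns:ns=\"urn:example\"><ns:field>1</ns:field></ns:record>";
        System.out.println("namespace aware:   " + describeFirstElement(xml, true));
        // Output for this case is implementation dependent, so it is only printed.
        System.out.println("namespace unaware: " + describeFirstElement(xml, false));
    }
}
```

A test for namespace handling could compare the local part of a prefixed tag against the configured record name, which only matches reliably when namespace processing is enabled.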
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436279#comment-16436279 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181215337 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + --- End diff -- When I started implementing this reader, I wondered how the reader would know whether to parse wrapped records or a single record. Unfortunately, XML has no unambiguous indicator like the one JSON has: [ vs. { I considered making it configurable, with EL support, whether the reader shall expect a single record or an array of records. What do you think?
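The single-record versus wrapped-records question raised above could be handled with an explicit setting, roughly as in the following sketch. This is only an illustration under assumptions: the class RecordLayoutSketch, the method recordStartTags and the boolean expectArray are hypothetical names, not the API of the pull request. If an array is expected, every level 2 start tag is treated as the beginning of a record; otherwise the root tag itself is the record.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the NiFi code base.
final class RecordLayoutSketch {

    // Returns the names of the tags that would be treated as record start tags.
    static List<String> recordStartTags(String xml, boolean expectArray) {
        // Wrapped records live one level below the root; a single record is the root itself.
        final int recordDepth = expectArray ? 2 : 1;
        final List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            try {
                int depth = 0;
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        depth++;
                        if (depth == recordDepth) {
                            names.add(event.asStartElement().getName().getLocalPart());
                        }
                    } else if (event.isEndElement()) {
                        depth--;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<people><person><name>a</name></person><person><name>b</name></person></people>";
        System.out.println(recordStartTags(xml, true));   // records are the children of the root
        System.out.println(recordStartTags(xml, false));  // the root itself is the record
    }
}
```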
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436269#comment-16436269 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181213403 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + +"XML data, embedded in an enclosing root tag.") +public class XMLReader extends SchemaRegistryService implements RecordReaderFactory { + +public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder() +.name("validate_root_tag") +.displayName("Validate Root Tag") +.description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " + +"of respective Processors.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder() --- End diff -- My original intention actually was to enable users to parse recordsets like this ``` ... ... ... ```
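The idea of skipping level 2 tags that do not carry the configured record name can be sketched as follows. The sample document and the names SkipNonRecordSketch and readRecords are assumptions made for illustration only; the XML of the original comment was lost and is not reconstructed here. The sketch collects the text content of every matching level 2 element and ignores all others.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the NiFi code base.
final class SkipNonRecordSketch {

    // Returns the text content of each level 2 element named recordName; other level 2 elements are skipped.
    static List<String> readRecords(String xml, String recordName) {
        final List<String> records = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            try {
                int depth = 0;
                StringBuilder current = null; // non-null only while inside a matching record
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        depth++;
                        String name = event.asStartElement().getName().getLocalPart();
                        if (depth == 2 && name.equals(recordName)) {
                            current = new StringBuilder();
                        }
                    } else if (event.isCharacters()) {
                        if (current != null) {
                            current.append(event.asCharacters().getData());
                        }
                    } else if (event.isEndElement()) {
                        if (depth == 2 && current != null) {
                            records.add(current.toString());
                            current = null;
                        }
                        depth--;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed sample input: two records with an unrelated level 2 tag between them.
        String xml = "<root><record>a</record><other>x</other><record>b</record></root>";
        System.out.println(readRecords(xml, "record"));
    }
}
```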
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436271#comment-16436271 ] ASF GitHub Bot commented on NIFI-4185: -- Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r181213473 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + +"XML data, embedded in an enclosing root tag.") +public class XMLReader extends SchemaRegistryService implements RecordReaderFactory { + +public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder() +.name("validate_root_tag") +.displayName("Validate Root Tag") +.description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " + +"of respective Processors.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder() +.name("validate_record_tag") +.displayName("Validate Record Tag") +.description("If this property is set, the name of record tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, the respective record will be skipped. If this property is not set, each level 2 starting tag will be treated " + +"as the beginning of a record.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor ATTRIBUTE_PREFIX = new PropertyDescriptor.Builder() +.name("attribute_prefix") +.displayName("Attribute Prefix") +.description("If this property is set, the name of attributes will be appended by a prefix when they are added to a record.") --- End diff -- ok
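What the Attribute Prefix property describes, putting a prefix in front of attribute names before they become record fields, might look roughly like the sketch below. The class AttributePrefixSketch, the method prefixedAttributes and the prefix value "attr_" are hypothetical; the reader in the pull request may map attributes differently.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of the NiFi code base.
final class AttributePrefixSketch {

    // Maps the attributes of the first start element to field names, prepending the prefix if one is given.
    static Map<String, String> prefixedAttributes(String xml, String attributePrefix) {
        final Map<String, String> fields = new TreeMap<>(); // sorted for stable output
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        StartElement start = event.asStartElement();
                        // Iterator<?> plus a cast compiles on both Java 8 (raw Iterator) and newer releases.
                        Iterator<?> attributes = start.getAttributes();
                        while (attributes.hasNext()) {
                            Attribute attribute = (Attribute) attributes.next();
                            String name = attribute.getName().getLocalPart();
                            String fieldName = attributePrefix == null ? name : attributePrefix + name;
                            fields.put(fieldName, attribute.getValue());
                        }
                        break; // only the first element is inspected in this sketch
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<record id=\"1\" lang=\"en\">text</record>";
        System.out.println(prefixedAttributes(xml, "attr_"));
        System.out.println(prefixedAttributes(xml, null));
    }
}
```

A prefix keeps attribute-derived fields from colliding with child elements of the same name, which is presumably why the property exists.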
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428741#comment-16428741 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179838657 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428731#comment-16428731 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179831387 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428732#comment-16428732 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179830847 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428726#comment-16428726 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179819928 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + +"XML data, embedded in an enclosing root tag.") +public class XMLReader extends SchemaRegistryService implements RecordReaderFactory { + +public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder() +.name("validate_root_tag") +.displayName("Validate Root Tag") +.description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " + +"of respective Processors.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder() --- End diff -- Likewise, I think we should remove this property and this sort of validation as well. If the user wants to validate some specific XML element names, the ValidateRecord processor is a great solution for that, and provides far more flexible validation via schema. > Add XML record reader & writer services > --- > > Key: NIFI-4185 > URL: https://issues.apache.org/jira/browse/NIFI-4185 > Project: Apache NiFi > Issue Type: New Feature > Components: Extensions >Affects Versions: 1.3.0 >Reporter: Andy LoPresto >Assignee: Johannes Peter >Priority: Major > Labels: json, records, xml > > With the addition of the {{RecordReader}} and {{RecordSetWriter}} paradigm, > XML conversion has not yet been targeted. This will replace the previous > ticket for XML to JSON conversion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
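For readers following the thread, the "records are expected in the second level" behaviour named in the capability description can be sketched with plain StAX. This is only an illustration under stated assumptions, not the code from the pull request: the class name `SecondLevelRecordSketch` and the method `recordTagNames` are hypothetical, and the sketch merely lists the names of level 2 start tags, treating each one as the beginning of a record without any tag-name validation.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of NiFi: collects the names of all
// second-level start tags, i.e. the tags that would begin a record.
public class SecondLevelRecordSketch {

    public static List<String> recordTagNames(InputStream in) throws XMLStreamException {
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        // Same setting as in the quoted diff: disable namespace processing.
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);

        final XMLEventReader reader = factory.createXMLEventReader(in);
        final List<String> names = new ArrayList<>();
        int depth = 0; // 1 = root tag, 2 = record tag
        try {
            while (reader.hasNext()) {
                final XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 2) {
                        names.add(event.asStartElement().getName().getLocalPart());
                    }
                } else if (event.isEndElement()) {
                    depth--;
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        final String xml = "<people><person><name>A</name></person><person><name>B</name></person></people>";
        final InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(recordTagNames(in)); // prints [person, person]
    }
}
```

With no record-tag check in the reader, any filtering by element name would be left to a downstream step such as the ValidateRecord processor mentioned in the comment above.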
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428745#comment-16428745 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179834679 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428735#comment-16428735 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179827839 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428736#comment-16428736 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179820052 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + +"XML data, embedded in an enclosing root tag.") +public class XMLReader extends SchemaRegistryService implements RecordReaderFactory { + +public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder() +.name("validate_root_tag") +.displayName("Validate Root Tag") +.description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " + +"of respective Processors.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder() +.name("validate_record_tag") +.displayName("Validate Record Tag") +.description("If this property is set, the name of record tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " + +"In the case of a mismatch, the respective record will be skipped. If this property is not set, each level 2 starting tag will be treated " + +"as the beginning of a record.") +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +.expressionLanguageSupported(true) +.required(false) +.build(); + +public static final PropertyDescriptor ATTRIBUTE_PREFIX = new PropertyDescriptor.Builder() +.name("attribute_prefix") +.displayName("Attribute Prefix") +.description("If this property is set, the name of attributes will be appended by a prefix when they are added to a record.") --- End diff -- I think this is supposed to say "prepended with a prefix" > Add XML record reader & writer services > --- > > Key: NIFI-4185 > URL:
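To make the wording point concrete: "prepended with a prefix" means the configured prefix goes in front of the attribute name when the attribute becomes a record field. The following is a minimal sketch under assumptions, not the reader from the pull request. The class `AttributePrefixSketch`, the method `prefixedAttributes`, and the prefix value `attr_` are all hypothetical, and only the attributes of the first start tag are read.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of NiFi: maps the attributes of the first
// start tag to field names, putting the prefix in FRONT of each name.
public class AttributePrefixSketch {

    public static Map<String, String> prefixedAttributes(String xml, String attributePrefix)
            throws XMLStreamException {
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);

        final XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
        final Map<String, String> fields = new LinkedHashMap<>();
        try {
            while (reader.hasNext()) {
                final XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    // Wildcard iterator keeps this compiling on both old and new JDKs.
                    final Iterator<?> attributes = event.asStartElement().getAttributes();
                    while (attributes.hasNext()) {
                        final Attribute attribute = (Attribute) attributes.next();
                        final String name = attribute.getName().getLocalPart();
                        // Prepend (not append) the prefix; no prefix means the plain name.
                        final String fieldName = attributePrefix == null ? name : attributePrefix + name;
                        fields.put(fieldName, attribute.getValue());
                    }
                    break; // only the first start tag is inspected in this sketch
                }
            }
        } finally {
            reader.close();
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(prefixedAttributes("<person id=\"7\"/>", "attr_"));
    }
}
```

A prefix of this kind presumably helps keep an attribute apart from a child element of the same name, which is why the direction (before the name) matters in the property description.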
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428730#comment-16428730 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179829486 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428737#comment-16428737 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179833060 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428729#comment-16428729 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179829341 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428725#comment-16428725 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179820952 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +@Tags({"xml", "record", "reader", "parser"}) +@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " + --- End diff -- I think the requirement that XML data must be wrapped in some sort of wrapper is going to be problematic. While this will be a fairly common case, so that multiple XML elements can be combined into a single FlowFile, it is also going to be common (probably more common) that each XML element will be its own standalone Record. This is especially important if this Reader is used for something like ListenTCPRecord or ConsumeKafkaRecord, where the data is received from elsewhere so no processor has a chance to wrap the content prior to using the XML Reader. I think we need to support both ignoring the outer-most element as well as incorporating the outer-most element. 
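The two modes requested in the comment above (ignoring the outermost element versus treating it as the record) can be illustrated with a minimal StAX sketch. This is not the code from the pull request: the class name `XmlRecordModes`, the method `read`, and the flag `recordsWrapped` are hypothetical names chosen for illustration, and records are assumed to be flat (each field is a simple child element with text content).

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the XMLRecordReader from the PR.
public class XmlRecordModes {

    /**
     * Reads flat records from an XML string.
     * recordsWrapped == true:  the outermost element is a wrapper and each of its children is a record.
     * recordsWrapped == false: the outermost element itself is the single record.
     */
    public static List<Map<String, String>> read(String xml, boolean recordsWrapped) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));

        List<Map<String, String>> records = new ArrayList<>();
        int recordDepth = recordsWrapped ? 2 : 1; // element depth at which a record starts
        int depth = 0;
        Map<String, String> current = null;
        String fieldName = null;
        StringBuilder text = new StringBuilder();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == recordDepth) {
                    current = new LinkedHashMap<>();          // a new record begins
                } else if (depth == recordDepth + 1) {
                    fieldName = event.asStartElement().getName().getLocalPart();
                    text.setLength(0);                        // a new field begins
                }
            } else if (event.isCharacters() && depth == recordDepth + 1) {
                text.append(event.asCharacters().getData());  // text directly inside a field
            } else if (event.isEndElement()) {
                if (depth == recordDepth + 1) {
                    current.put(fieldName, text.toString().trim());
                } else if (depth == recordDepth) {
                    records.add(current);                     // the record is complete
                    current = null;
                }
                depth--;
            }
        }
        reader.close();
        return records;
    }

    public static void main(String[] args) throws XMLStreamException {
        String wrapped = "<people><person><name>Ann</name><age>30</age></person>"
                + "<person><name>Bob</name><age>41</age></person></people>";
        String standalone = "<person><name>Ann</name><age>30</age></person>";

        System.out.println(read(wrapped, true));     // wrapper ignored, two records
        System.out.println(read(standalone, false)); // outermost element is the record
    }
}
```

With a single depth threshold, one parsing loop serves both cases, which is one way a reader could accept standalone XML elements (for example from ListenTCPRecord or ConsumeKafkaRecord) without requiring a wrapper element.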
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428733#comment-16428733 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179830412 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428738#comment-16428738 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179833900 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428742#comment-16428742 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179835305 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428740#comment-16428740 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179829981 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier<DateFormat> LAZY_DATE_FORMAT; +private final Supplier<DateFormat> LAZY_TIME_FORMAT; +private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428728#comment-16428728 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179827156 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428727#comment-16428727 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179824910 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+    private StartElement currentRecordStartTag;
+
+    private final XMLEventReader xmlEventReader;
+
+    private final Supplier LAZY_DATE_FORMAT;
+    private final Supplier LAZY_TIME_FORMAT;
+    private final Supplier LAZY_TIMESTAMP_FORMAT;
+
+    public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName,
+                           final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException {
+        this.schema = schema;
+        this.recordName = recordName;
+        this.attributePrefix = attributePrefix;
+        this.contentFieldName = contentFieldName;
+        this.logger = logger;
+
+        final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat);
+        final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat);
+        final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat);
+
+        LAZY_DATE_FORMAT = () -> df;
+        LAZY_TIME_FORMAT = () -> tf;
+        LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+        try {
+            final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
+
+            // Avoid namespace replacements
+            xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
--- End diff --

I think this approach may lead to some odd behaviors if the incoming XML is actually namespace aware. For example, if
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428743#comment-16428743 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179834908 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428734#comment-16428734 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179829864 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428739#comment-16428739 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179831931 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.nifi.xml; + +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ArrayDataType; +import org.apache.nifi.serialization.record.type.RecordDataType; +import org.apache.nifi.serialization.record.util.DataTypeUtils; + +import javax.xml.stream.XMLEventReader; +import javax.xml.stream.XMLInputFactory; +import javax.xml.stream.XMLStreamException; +import javax.xml.stream.events.Attribute; +import javax.xml.stream.events.Characters; +import javax.xml.stream.events.StartElement; +import javax.xml.stream.events.XMLEvent; +import java.io.IOException; +import java.io.InputStream; +import java.text.DateFormat; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.function.Supplier; + +public class XMLRecordReader implements RecordReader { + +private final ComponentLog logger; +private final RecordSchema schema; +private final String recordName; +private final String attributePrefix; +private final String contentFieldName; + +// thread safety required? 
+private StartElement currentRecordStartTag; + +private final XMLEventReader xmlEventReader; + +private final Supplier LAZY_DATE_FORMAT; +private final Supplier LAZY_TIME_FORMAT; +private final Supplier LAZY_TIMESTAMP_FORMAT; + +public XMLRecordReader(InputStream in, RecordSchema schema, String rootName, String recordName, String attributePrefix, String contentFieldName, + final String dateFormat, final String timeFormat, final String timestampFormat, final ComponentLog logger) throws MalformedRecordException { +this.schema = schema; +this.recordName = recordName; +this.attributePrefix = attributePrefix; +this.contentFieldName = contentFieldName; +this.logger = logger; + +final DateFormat df = dateFormat == null ? null : DataTypeUtils.getDateFormat(dateFormat); +final DateFormat tf = timeFormat == null ? null : DataTypeUtils.getDateFormat(timeFormat); +final DateFormat tsf = timestampFormat == null ? null : DataTypeUtils.getDateFormat(timestampFormat); + +LAZY_DATE_FORMAT = () -> df; +LAZY_TIME_FORMAT = () -> tf; +LAZY_TIMESTAMP_FORMAT = () -> tsf; + +try { +final XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance(); + +// Avoid namespace replacements + xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); + +xmlEventReader = xmlInputFactory.createXMLEventReader(in); +final StartElement rootTag = getNextStartTag();
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428744#comment-16428744 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179840389 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
--- End diff --

More specifically, I think that if the following were the content of a FlowFile:

```
<person>
  <name>John Doe</name>
  <id>123</id>
  <dob>01/01/2017</dob>
</person>
```

Then I would expect this to parse as a single Record that would match this schema:

```
{
  "name": "person",
  "namespace": "nifi",
  "type": "record",
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "id", "type": "int" },
    { "name": "dob", "type": "date" }
  ]
}
```

Additionally, I would expect to be able to set a property that indicates that the outer-most XML element is simply a wrapper. If that property were set to "true", then I would expect to use that exact same schema to parse the following XML:

```
<people>
  <person>
    <name>John Doe</name>
    <id>123</id>
    <dob>01/01/2017</dob>
  </person>
  <person>
    <name>Jane Doe</name>
    <id>124</id>
    <dob>01/01/2016</dob>
  </person>
  <person>
    <name>Jake Doe</name>
    <id>125</id>
    <dob>01/01/2015</dob>
  </person>
</people>
```

In this case, the 'people' element is just a wrapper and could just as easily be an element named 'root' or 'foo' or 'bar'.
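The behaviour requested in this comment can be sketched with plain StAX. This is only an illustration under stated assumptions, not the code from the pull request: the class `WrapperSketch`, the method `parse`, and the flag `outerElementIsWrapper` are hypothetical names, records are flat `Map<String, String>` values rather than NiFi `Record` objects, and no schema or type coercion is applied.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: treat the outer-most element either as the record itself
// or as a wrapper whose direct children are the records.
public class WrapperSketch {

    public static List<Map<String, String>> parse(String xml, boolean outerElementIsWrapper) {
        // Records live at depth 1 without a wrapper, at depth 2 with one.
        final int recordDepth = outerElementIsWrapper ? 2 : 1;
        final List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));

            int depth = 0;
            Map<String, String> current = null; // record being assembled
            String field = null;                // name of the field being read
            StringBuilder text = new StringBuilder();

            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == recordDepth) {
                        current = new LinkedHashMap<>();
                    } else if (depth == recordDepth + 1) {
                        field = event.asStartElement().getName().getLocalPart();
                        text.setLength(0);
                    }
                } else if (event.isCharacters() && field != null) {
                    // Text may arrive in several chunks, so accumulate it.
                    text.append(event.asCharacters().getData());
                } else if (event.isEndElement()) {
                    if (depth == recordDepth + 1 && current != null) {
                        current.put(field, text.toString().trim());
                        field = null;
                    } else if (depth == recordDepth) {
                        records.add(current);
                        current = null;
                    }
                    depth--;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return records;
    }
}
```

With `outerElementIsWrapper` set to `false`, the single `person` document yields one map. With it set to `true`, the `people` document yields one map per `person` child, and the name of the wrapper element is never inspected.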
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428724#comment-16428724 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179822292 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java --- @@ -0,0 +1,502 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+    private final ComponentLog logger;
+    private final RecordSchema schema;
+    private final String recordName;
+    private final String attributePrefix;
+    private final String contentFieldName;
+
+    // thread safety required?
--- End diff --

Record Readers don't need to be thread-safe, only the RecordReaderFactory does.
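The division of responsibility in this reply can be illustrated with a small standalone sketch. It assumes nothing about NiFi's actual interfaces: `ReaderFactorySketch`, `LineReader`, and `LineReaderFactory` are hypothetical names chosen for the example. The shared factory holds only immutable configuration and hands out a fresh reader per input stream, so the reader itself can keep unsynchronized mutable state.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical sketch, not NiFi's API.
public class ReaderFactorySketch {

    /** Stateful and NOT thread-safe: one instance per stream, used by one thread. */
    public static final class LineReader {
        private final BufferedReader in;
        private int linesRead; // per-reader mutable state, no synchronization

        LineReader(InputStream stream, Charset charset) {
            this.in = new BufferedReader(new InputStreamReader(stream, charset));
        }

        /** Returns the next line, or null at end of input. */
        public String nextLine() {
            try {
                String line = in.readLine();
                if (line != null) {
                    linesRead++;
                }
                return line;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public int getLinesRead() {
            return linesRead;
        }
    }

    /** Safe to share between threads: it has only final, immutable configuration. */
    public static final class LineReaderFactory {
        private final Charset charset;

        public LineReaderFactory(Charset charset) {
            this.charset = charset;
        }

        /** Every call returns a new reader that shares no state with earlier ones. */
        public LineReader createReader(InputStream stream) {
            return new LineReader(stream, charset);
        }
    }
}
```

Two threads can call `createReader` on the same factory at the same time; each then works on its own `LineReader`, so the mutable `linesRead` counter is never shared.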
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428723#comment-16428723 ] ASF GitHub Bot commented on NIFI-4185: -- Github user markap14 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r179819209 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java --- @@ -0,0 +1,133 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
+        "XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements RecordReaderFactory {
+
+    public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder()
--- End diff --

I am actually in favor of removing this property altogether. In order to properly read the records, the Record Readers will need to validate the syntax of the data, but I don't believe that they should be validating arbitrary semantic meanings. I.e., I don't think that we should be checking the name of the outer-most element for any specific name.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425771#comment-16425771 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2587

Hey @JohannesDaniel - I just ran new tests and can't reproduce what I saw previously... so let's forget about it :) (thanks for the unit tests though!)

I tested the Expression Language (EL) support for tag validation; it works well, thanks for the addition. I'd just make one suggestion: right now for the record tag, "in the case of a mismatch, the respective record will be skipped". I'd suggest making this behavior configurable through a parameter in the Controller Service (CS): skip the record (current behavior) or throw an exception for the flow file (as you're doing for the root tag validation).

Regarding the performance test, it was a GenerateFlowFile (GFF) => ConvertRecord (XML to JSON) => UpdateAttribute flow. I don't remember the exact numbers, but IIRC it was around 1300 FlowFiles per second with an XML payload of 6 KB (a single record in my case).

Overall, it's really cool! I'll try to have a look over the code during the week.
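The suggestion above (skip a mismatching record, or fail the whole flow file) could look roughly like the following. This is only a minimal sketch using plain StAX from the JDK, not the code in the pull request: the class name, the `MismatchStrategy` enum and the `acceptedRecords` method are hypothetical, and `IllegalStateException` stands in for whatever exception the reader would really throw.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class RecordTagValidationSketch {

    /** Hypothetical setting: what to do when a second-level tag has an unexpected name. */
    enum MismatchStrategy { SKIP_RECORD, FAIL }

    /**
     * Returns the 1-based positions (among all second-level elements) of the
     * elements whose name matches expectedRecordTag.
     */
    static List<Integer> acceptedRecords(String xml, String expectedRecordTag,
                                         MismatchStrategy strategy) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<Integer> accepted = new ArrayList<>();
        int depth = 0;      // 1 = root tag, 2 = record level
        int position = 0;   // counts second-level elements
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
                if (depth == 2) {
                    position++;
                    if (reader.getLocalName().equals(expectedRecordTag)) {
                        accepted.add(position);
                    } else if (strategy == MismatchStrategy.FAIL) {
                        // Stand-in for a reader-specific exception type.
                        throw new IllegalStateException(
                                "Unexpected record tag: " + reader.getLocalName());
                    }
                    // SKIP_RECORD: nothing is collected for this element.
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        reader.close();
        return accepted;
    }
}
```

With `SKIP_RECORD`, the input `<root><record/><other/><record/></root>` yields positions 1 and 3; with `FAIL`, the same input raises the exception at the `other` element.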
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421843#comment-16421843 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587

Can you maybe post the XML that led to the empty record?
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421842#comment-16421842 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587

Hi @pvillard31, thank you for your comments! I implemented all your suggestions. I like your news regarding the performance :-) Which kind of transformation did you test? XML => Record or XML => JSON (e.g. with ConvertRecord)?

For some reason, some tests disappeared in a certain commit in my local git (I probably wanted to reorder the tests but deleted them instead, omg ...). However, I inserted them again (this is why there are many more tests now). In addition, I adjusted the definition of how namespaces shall be treated.

I implemented several tests for XMLReader to verify that the usage of Expression Language works as expected. However, I was not able to reproduce your observation regarding the empty record for the header
```
```
I implemented the following tests:
```
testSimpleRecordWithHeader()
testSimpleRecordWithHeaderNoValidation()
```
They work as expected.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421837#comment-16421837 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178470638

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
+        "XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements RecordReaderFactory {
+
+    public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder()
--- End diff --

Done. Additionally, I added some tests for class XMLReader.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421838#comment-16421838 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178470648

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html ---
@@ -0,0 +1,378 @@
+XMLReader
+
+The XMLReader Controller Service reads XML content and creates Record objects. The Controller Service
+must be configured with a schema that describes the structure of the XML data. Fields in the XML data
+that are not defined in the schema will be skipped.
+
+Records are expected in the second level of the XML data, embedded within an enclosing root tag:
+
+<root>
+  <record>
+    <field1>content</field1>
+    <field2>content</field2>
+  </record>
+  <record>
+    <field1>content</field1>
+    <field2>content</field2>
+  </record>
+</root>
+
+For the following examples, it is assumed that the exemplary records are enclosed by a root tag.
+
+Example 1: Simple Fields
+
+The simplest kind of data within XML data are tags / fields only containing content (no attributes, no embedded tags).
+They can be described in the schema by simple types (e.g. INT, STRING, ...).
+
+<record>
+  <simple_field>content</simple_field>
+</record>
+
+This record can be described by a schema containing one field (e.g. of type string). By providing this schema,
+the reader expects zero or one occurrences of "simple_field" in the record.
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+    { "name": "simple_field", "type": "string" }
+  ]
+}
+
+Example 2: Arrays with Simple Fields
+
+Arrays are treated as repeated tags / fields in XML data. For the following XML data, "array_field" is considered
+to be an array enclosing simple fields, whereas "simple_field" is considered to be a simple field not enclosed in
+an array.
+
+<record>
+  <array_field>content</array_field>
+  <array_field>content</array_field>
+  <simple_field>content</simple_field>
+</record>
+
+This record can be described by the following schema:
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+    { "name": "array_field", "type":
+      { "type": "array", "items": "string" }
+    },
+    { "name": "simple_field", "type": "string" }
+  ]
+}
+
+If a field in a schema is embedded in an array, the reader expects zero, one or more occurrences of the field
+in a record. The field "array_field" could in principle also be defined as a simple field, but then the second occurrence
+of this field would replace the first in the record object. Moreover, the field "simple_field" could also be defined
+as an array. In this case, the reader would put it into the record object as an array with one element.
+
+Example 3: Tags with Attributes
+
+XML fields frequently not only contain content, but also attributes. The following record contains a field with
+an attribute "attr" and content:
+
+<record>
+  <field_with_attribute attr="attr_content">content of field</field_with_attribute>
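The array rule quoted from the XMLReader documentation (repeated tags accumulate when the schema declares an array, while for a simple field a later occurrence replaces an earlier one) can be illustrated without any XML parsing. The sketch below is an assumption-laden illustration, not NiFi's implementation: the class, the `toRecord` method, and the representation of tags as name/content pairs and of the schema as two sets of field names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ArrayFieldSketch {

    /**
     * Builds a record from (name, content) pairs in document order.
     * arrayFields and simpleFields play the role of the schema; tags whose
     * name appears in neither set are skipped.
     */
    static Map<String, Object> toRecord(List<String[]> tags,
                                        Set<String> arrayFields,
                                        Set<String> simpleFields) {
        Map<String, Object> record = new LinkedHashMap<>();
        for (String[] tag : tags) {
            String name = tag[0];
            String content = tag[1];
            if (arrayFields.contains(name)) {
                // Array field: every occurrence is appended, so zero, one or many all work.
                @SuppressWarnings("unchecked")
                List<String> values = (List<String>) record.computeIfAbsent(
                        name, key -> new ArrayList<String>());
                values.add(content);
            } else if (simpleFields.contains(name)) {
                // Simple field: a later occurrence replaces an earlier one.
                record.put(name, content);
            }
        }
        return record;
    }
}
```

For the tags `array_field=a`, `array_field=b`, `simple_field=c`, declaring `array_field` as an array gives `{array_field=[a, b], simple_field=c}`, while declaring it as a simple field leaves only `b` for that field.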
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421836#comment-16421836 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178470625

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html ---
@@ -0,0 +1,378 @@
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421441#comment-16421441 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178437056

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html ---
@@ -0,0 +1,378 @@
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421442#comment-16421442 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178437600

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
+        "XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements RecordReaderFactory {
+
+    public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder()
+            .name("validate_root_tag")
+            .displayName("Validate Root Tag")
+            .description("If this property is set, the name of root tags (e. g. ...) of incoming FlowFiles will be evaluated against this value. " +
+                    "In the case of a mismatch, an exception is thrown. The treatment of such FlowFiles depends on the implementation " +
+                    "of respective Processors.")
+            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+            .required(false)
+            .build();
+
+    public static final PropertyDescriptor VALIDATE_RECORD_TAG = new PropertyDescriptor.Builder()
--- End diff --

Same here (and for the next properties).
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421440#comment-16421440 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178437593

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. Records are expected in the second level of " +
+        "XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements RecordReaderFactory {
+
+    public static final PropertyDescriptor VALIDATE_ROOT_TAG = new PropertyDescriptor.Builder()
--- End diff --

Could this property support Expression Language evaluated against incoming flow files? I don't think that's an easy change (and it could introduce a performance hit), but it would allow using the same reader for completely different XML inputs.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421443#comment-16421443 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2587#discussion_r178437093

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html ---
@@ -0,0 +1,378 @@
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419507#comment-16419507 ]

ASF GitHub Bot commented on NIFI-4185:

Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2587

Thanks for pinging me @JohannesDaniel - I'll also try to have a look over the weekend or next week. It'd be a great addition!
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418054#comment-16418054 ]

ASF GitHub Bot commented on NIFI-4185:

Github user markap14 commented on the issue: https://github.com/apache/nifi/pull/2587

@JohannesDaniel thanks for the contribution! This is definitely something that I've been wanting to implement for a while but haven't had a chance yet. Will be happy to review.
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416274#comment-16416274 ]

ASF GitHub Bot commented on NIFI-4185:

Github user JohannesDaniel commented on the issue: https://github.com/apache/nifi/pull/2587

@pvillard31 here we go :)
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416226#comment-16416226 ]

ASF GitHub Bot commented on NIFI-4185:

GitHub user JohannesDaniel opened a pull request: https://github.com/apache/nifi/pull/2587

NIFI-4185 Add XML Record Reader

Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JohannesDaniel/nifi NIFI-4185

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2587.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2587

commit 9b4bd0dd8f1d30bfe1597d4cd069df414eb968a0
Author: JohannesDaniel
Date: 2018-03-06T23:02:43Z

    Add XML Record Reader
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403399#comment-16403399 ]
Johannes Peter commented on NIFI-4185:
Hi [~pvillard], I do not plan to require an XSD schema for this reader. My intention is that it can be configured in the same way as the readers for other formats. I therefore translate Avro definitions into the XML structures that the reader expects. Generally, the reader expects an array containing zero, one or more records. I use StAX because its pull-based parsing suits the record-lookup requirement well.
BTW: Do you have an idea which XML structure the reader could expect when users define a map in their schema? Maybe something like this?
{code} content content or object ... {code}
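The pull-based lookup described in the comment above can be illustrated with a minimal sketch. It is not the code from the pull request: the class name `StaxRecordSketch`, the method `readRecords`, and the element names used in the usage note are hypothetical. It assumes a single wrapper element whose direct children are the records, and that every field of a record is a simple text element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch of a StAX-based record lookup; not the NiFi implementation. */
class StaxRecordSketch {

    /** Treats each direct child of the root (wrapper) element as one flat record. */
    static List<Map<String, String>> readRecords(String xml) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            // Pull events until the wrapper element has been entered.
            while (reader.hasNext() && reader.next() != XMLStreamConstants.START_ELEMENT) {
                // skip prolog, comments and whitespace
            }
            // Every START_ELEMENT directly below the wrapper begins a record.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    records.add(readRecord(reader));
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    break; // closing tag of the wrapper: zero or more records were read
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
        return records;
    }

    /** Reads the simple child elements of the current record into field name -> text. */
    private static Map<String, String> readRecord(XMLStreamReader reader) throws XMLStreamException {
        Map<String, String> record = new LinkedHashMap<>();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                // getElementText() consumes everything up to the field's closing tag.
                record.put(name, reader.getElementText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return record; // closing tag of the record
            }
        }
        return record;
    }
}
```

Under these assumptions, calling `StaxRecordSketch.readRecords` on a wrapper with no children yields an empty list, and a wrapper with one or several children yields one map per child, which mirrors the "zero, one or more records" expectation.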
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402528#comment-16402528 ]
Pierre Villard commented on NIFI-4185:
Hi [~jope], I'm really interested in this work and will be happy to have a look when you submit a PR. A quick question: are you going to require an XSD schema in your reader, or will you infer the schema by reading the input XML data? (XSD would be cool because it allows far more detailed specifications than an Avro schema.)
I'm asking because one of the issues, if you don't have an XSD, is distinguishing between an array of one record (which could be translated into a single record) and an array of multiple records. Example: I want the two inputs below
content
and
content content
to be converted with the same output schema having an array of records. Does it make sense?
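One way to get the same output shape for one and for several repeated elements is to let the schema, rather than the data, decide which fields are arrays. The sketch below only illustrates that idea: the class `ArrayFieldSketch`, the method `toRecord`, the `arrayFields` parameter, and the element names in the usage note are all hypothetical, and a real reader would take this information from the configured record schema.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: schema-declared array fields always become lists. */
class ArrayFieldSketch {

    /**
     * Converts the children of the root element into a record. Fields named in
     * arrayFields are collected into a List even when they occur only once.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> toRecord(String xml, Set<String> arrayFields) {
        Map<String, Object> record = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // ignore whitespace and comments
                }
                String name = child.getNodeName();
                String text = child.getTextContent();
                if (arrayFields.contains(name)) {
                    // Declared as array: append, creating the list on first sight.
                    ((List<String>) record.computeIfAbsent(name, k -> new ArrayList<String>()))
                            .add(text);
                } else {
                    record.put(name, text); // scalar field
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return record;
    }
}
```

With `arrayFields` containing a hypothetical field name `item`, a parent holding one `item` element and a parent holding two both produce a list under the key `item`, so downstream consumers see a single schema.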
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394487#comment-16394487 ]
Johannes Peter commented on NIFI-4185:
[~alopresto]: I have started implementing an XML Record Reader. Shall I create a separate ticket for this?
Similar to the JSON readers, the XML reader will expect either a single record (e.g. content ... ) or an array of records (e.g. content ... ... ). The reader will be aligned with common transformers.
"Normal" fields (e.g. String, Integer) can be described by simple key-value pairs:
XML definition
content 123
Schema definition
{
  "name": "testschema",
  "namespace": "nifi",
  "type": "record",
  "fields": [
    { "name": "field1", "type": "string" },
    { "name": "field2", "type": "int" }
  ]
}
Parsing attributes or nested fields requires the definition of nested records and a field name for the content (optionally, a prefix for attributes can be defined):
Property: CONTENT_FIELD=content_field
Property: ATTRIBUTE_PREFIX=attr.
XML definition
some text some nested text some other nested text
Schema definition
{
  "name": "testschema",
  "namespace": "nifi",
  "type": "record",
  "fields": [
    {
      "name": "field1",
      "type": {
        "name": "NestedRecord",
        "type": "record",
        "fields": [
          { "name": "attr.attribute", "type": "string" },
          { "name": "content_field", "type": "string" }
        ]
      }
    },
    {
      "name": "field2",
      "type": {
        "name": "NestedRecord",
        "type": "record",
        "fields": [
          { "name": "attr.attribute", "type": "string" },
          { "name": "nested1", "type": "string" },
          { "name": "nested2", "type": "string" }
        ]
      }
    }
  ]
}
What do you say?
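The attribute and content-field mapping proposed in the comment above can be sketched as follows. This is an illustration under assumptions, not the reader from the ticket: the class `AttributeMappingSketch` and its methods are hypothetical, and the XML in the usage note is a guess at the kind of input the comment describes. Only the two values `content_field` and `attr.` are taken from the comment.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch of mapping attributes and text content to record fields. */
class AttributeMappingSketch {

    /** Parses the document and converts its root element into a record. */
    static Map<String, Object> parse(String xml, String attributePrefix, String contentField) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return toRecord(root, attributePrefix, contentField);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Attributes become prefixed fields; text of a childless element goes to contentField. */
    static Map<String, Object> toRecord(Element element, String attributePrefix, String contentField) {
        Map<String, Object> record = new LinkedHashMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            record.put(attributePrefix + attribute.getNodeName(), attribute.getNodeValue());
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                Element child = (Element) children.item(i);
                record.put(child.getNodeName(), toValue(child, attributePrefix, contentField));
            }
        }
        if (!hasChildElements(element)) {
            String text = element.getTextContent().trim();
            if (!text.isEmpty()) {
                record.put(contentField, text);
            }
        }
        return record;
    }

    /** A leaf element without attributes is a plain string; anything else is a nested record. */
    private static Object toValue(Element element, String attributePrefix, String contentField) {
        boolean simple = element.getAttributes().getLength() == 0 && !hasChildElements(element);
        return simple ? element.getTextContent() : toRecord(element, attributePrefix, contentField);
    }

    private static boolean hasChildElements(Element element) {
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }
}
```

For an assumed input such as `<record><field1 attribute="a1">some text</field1><field2 attribute="a2"><nested1>some nested text</nested1><nested2>some other nested text</nested2></field2></record>`, calling `parse(xml, "attr.", "content_field")` produces nested maps whose keys line up with the field names in the second schema definition.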
[jira] [Commented] (NIFI-4185) Add XML record reader & writer services
[ https://issues.apache.org/jira/browse/NIFI-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088201#comment-16088201 ]
Andy LoPresto commented on NIFI-4185:
With the new record model, the prior approach is OBE.