[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510183#comment-16510183 ]
ASF GitHub Bot commented on NIFI-5213:
--------------------------------------
Github user mattyb149 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2718#discussion_r194883172
--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.avro;
+
+import org.apache.avro.Schema;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+
+public class TestAvroReaderWithExplicitSchema {
+
+ @Test
+ public void testAvroExplicitReaderWithSchemalessFile() throws Exception {
+ File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro");
+ FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+ Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+ RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+ AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+ avroReader.nextAvroRecord();
+ }
+
+ @Test
+ public void testAvroExplicitReaderWithEmbeddedSchemaFile() throws Exception {
+ File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_embed_schema.avro");
+ FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+ Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+ RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+ AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+ avroReader.nextAvroRecord();
+ }
+
+ @Test(expected = IOException.class)
--- End diff --
No, from the constructor. I updated the test to make that clearer, thanks!
> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
> Key: NIFI-5213
> URL: https://issues.apache.org/jira/browse/NIFI-5213
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Minor
>
> AvroReader offers several schema access strategies, such as Use Embedded
> Schema, Use Schema Name, and Use Schema Text. If the incoming Avro files have
> embedded schemas, then Use Embedded Schema is the best practice for
> AvroReader. However, it is not intuitive that errors can occur when the same
> schema that is embedded in the file is instead specified by name (using a
> schema registry) or explicitly via Schema Text. This has been noticed in
> QueryRecord, for example, and the error is neither intuitive nor descriptive
> (it is often an ArrayIndexOutOfBoundsException).
> To provide a better user experience, AvroReader should be able to
> successfully process Avro files with embedded schemas, even when the Schema
> Access Strategy is not "Use Embedded Schema". Of course, the explicit schema
> would have to match the embedded schema, or an error would be reported (and
> rightfully so).
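
One way a reader could tell the two kinds of input apart is sketched below. This is a minimal illustration only, not the approach taken in the pull request. It relies on one fact from the Avro specification: object container files (the ones that carry an embedded schema) begin with the four bytes 'O', 'b', 'j', 1. The class name AvroContainerSniffer and its methods are hypothetical and exist in neither NiFi nor Avro. The check is a heuristic, since a schemaless datum could in principle start with the same bytes.

```java
import java.io.BufferedInputStream;
import java.io.IOException;

/** Hypothetical helper (not part of NiFi or Avro) that peeks at the start of a stream. */
final class AvroContainerSniffer {

    // Avro object container files start with ASCII 'O', 'b', 'j' followed by the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    private AvroContainerSniffer() {
    }

    /** Returns true if the first {@code length} bytes of {@code header} begin with the magic. */
    static boolean startsWithMagic(byte[] header, int length) {
        if (length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Peeks at the first four bytes and then rewinds the stream, so the caller
     * can still hand the untouched stream to whichever decoder it chooses.
     */
    static boolean hasEmbeddedSchema(BufferedInputStream in) throws IOException {
        in.mark(MAGIC.length);
        try {
            byte[] header = new byte[MAGIC.length];
            int total = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return startsWithMagic(header, total);
        } finally {
            in.reset();
        }
    }
}
```

A caller would wrap the incoming FileInputStream in a BufferedInputStream, call hasEmbeddedSchema, and then pick a container-file reader or a raw datum decoder accordingly. Comparing the embedded schema against the explicit one would still be a separate step.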