[ 
https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508944#comment-16508944
 ] 

ASF GitHub Bot commented on NIFI-5213:
--------------------------------------

Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194575493
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java ---
    @@ -0,0 +1,62 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.avro;
    +
    +import org.apache.avro.Schema;
    +import org.apache.nifi.serialization.SimpleRecordSchema;
    +import org.apache.nifi.serialization.record.RecordSchema;
    +import org.junit.Test;
    +
    +import java.io.File;
    +import java.io.FileInputStream;
    +import java.io.IOException;
    +
    +public class TestAvroReaderWithExplicitSchema {
    +
    +    @Test
    +    public void testAvroExplicitReaderWithSchemalessFile() throws Exception {
    +        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro");
    +        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
    +        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
    +        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
    +
    +        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
    +        avroReader.nextAvroRecord();
    --- End diff --
    
    As is, this test doesn't verify anything as far as I can tell, because all
    of the errors are caught inside that method. It should have some assertions
    around the call, IMO.
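
To illustrate the concern, here is a minimal sketch in plain Java. It is not NiFi code: `SwallowedErrorSketch` and `nextRecordOrNull` are hypothetical stand-ins, and the sketch assumes, as the comment above suggests, that the method under test catches its own errors and reports failure only through its return value. A bare call then passes no matter what happens, while an assertion on the result can actually fail.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SwallowedErrorSketch {

    // Hypothetical stand-in for a reader method that catches its own errors
    // and signals failure only by returning null.
    static String nextRecordOrNull(InputStream in) {
        try {
            int first = in.read();
            if (first < 0) {
                throw new IOException("no data to read");
            }
            return "record:" + first;
        } catch (IOException e) {
            return null; // the error never reaches the caller
        }
    }

    public static void main(String[] args) {
        // A bare call: nothing is checked, so this "passes" even though
        // the stream is empty and no record was read.
        nextRecordOrNull(new ByteArrayInputStream(new byte[0]));

        // With checks on the returned value, a failed read would be noticed.
        String record = nextRecordOrNull(new ByteArrayInputStream(new byte[] {7}));
        if (record == null) {
            throw new AssertionError("expected a record, got null");
        }
        if (!"record:7".equals(record)) {
            throw new AssertionError("unexpected record: " + record);
        }
        System.out.println(record);
    }
}
```

In the JUnit test from the diff, the equivalent would be to capture the value returned by the reader call and assert on it (for example, that it is not null and that its fields match the test file), rather than only invoking the method.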


> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
>                 Key: NIFI-5213
>                 URL: https://issues.apache.org/jira/browse/NIFI-5213
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>            Priority: Minor
>
> AvroReader allows the choice of schema access strategy from options such as 
> Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming 
> Avro files have embedded schemas, then Use Embedded Schema is the best 
> practice for the AvroReader. However, it is not intuitive that errors can 
> occur when the same schema that is embedded in the file is specified by name 
> (using a schema registry) or explicitly via Schema Text. This has been 
> noticed in QueryRecord, for example, and the error is also not intuitive or 
> descriptive (it is often an ArrayIndexOutOfBoundsException).
> To provide a better user experience, it would be an improvement for 
> AvroReader to be able to successfully process Avro files with embedded 
> schemas, even when the Schema Access Strategy is not "Use Embedded Schema". 
> Of course, the explicit schema would have to match the embedded schema, or an 
> error would be reported (and rightfully so).
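
One way a reader could tell the two kinds of input apart is sketched below. This is an illustration only, not necessarily how the pull request does it. It relies on the Avro specification's rule that an object container file (the format that carries an embedded schema) begins with the four bytes 'O', 'b', 'j', 1. The class and method names are hypothetical.

```java
public class AvroMagicSketch {

    // Per the Avro specification, object container files start with
    // the ASCII bytes 'O', 'b', 'j' followed by the byte 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Hypothetical helper: returns true if the given bytes begin with the
    // container-file magic, i.e. the data likely carries an embedded schema.
    static boolean looksLikeContainerFile(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] container = {'O', 'b', 'j', 1, 42};
        byte[] rawDatum = {2, 10, 'h', 'e', 'l', 'l', 'o'};
        System.out.println(looksLikeContainerFile(container));
        System.out.println(looksLikeContainerFile(rawDatum));
    }
}
```

A check like this is only a heuristic, since a schemaless datum could in principle begin with the same bytes; comparing the explicit schema against the embedded one would still be needed to report a mismatch.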



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
