Josh Highley created AVRO-3202:
----------------------------------

             Summary: Avro Reader exception when data field not present, and 
schema has alias
                 Key: AVRO-3202
                 URL: https://issues.apache.org/jira/browse/AVRO-3202
             Project: Apache Avro
          Issue Type: Bug
         Environment: Nifi 1.12.2   Avro 1.10.2
            Reporter: Josh Highley


I believe a change in Avro 1.10 has caused a bug in Nifi's Avro reader when a 
field in the schema is not in the data, and the schema field has an alias. When 
the reader is searching for data fields in the FlowFile, it first tries the 
schema's field name and if not found then tries any aliases. The issue occurs 
when the field is not in the data. Prior to Avro 1.10.0, 
org/apache/avro/generic/GenericData$Record.get(String key) would just return 
null if the field name wasn't found in the schema. In 1.10, this was changed to 
throw an exception. It occurs when trying to get data using the alias name 
after first trying the field name (data does not contain a field with either 
name).

In my example, this is the schema field: 
{noformat}
{"name":"phone_number","type":["null","string"],"default":null,"aliases":["PhoneNumber"]}{noformat}
stack trace (note field name in error message is the alias name 'PhoneNumber', 
not field name 'phone_number'

 
{noformat}
org.apache.nifi.serialization.MalformedRecordException: Error while getting 
next record. Root cause: org.apache.avro.AvroRuntimeException: Not a valid 
schema field: PhoneNumber
         at 
org.apache.nifi.avro.AvroRecordReader.nextRecord(AvroRecordReader.java:52)
         at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
         at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at 
org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:254)
         at 
org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.access$100(StandardControllerServiceInvocationHandler.java:38)
         at 
org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler$ProxiedReturnObjectInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:240)
         at com.sun.proxy.$Proxy231.nextRecord(Unknown Source)
         at org.apache.nifi.serialization.RecordReader$nextRecord.call(Unknown 
Source)
         at Script4c2db96c.run(Script4c2db96c.groovy:19)
         at 
org.apache.nifi.processors.groovyx.ExecuteGroovyScript.onTrigger(ExecuteGroovyScript.java:472)
         at 
org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
         at 
org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
         at 
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
         at 
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
         at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
         at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
         at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
         at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.avro.AvroRuntimeException: Not a valid schema field: 
PhoneNumber
         at org.apache.avro.generic.GenericData$Record.get(GenericData.java:256)
         at 
org.apache.nifi.avro.AvroTypeUtil.convertAvroRecordToMap(AvroTypeUtil.java:846)
         at 
org.apache.nifi.avro.AvroTypeUtil.convertAvroRecordToMap(AvroTypeUtil.java:835)
         at 
org.apache.nifi.avro.AvroRecordReader.nextRecord(AvroRecordReader.java:45)
         ... 22 common frames omitted
 
{noformat}
 

Pertinent code:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java

 
{noformat}
public static Map<String, Object> convertAvroRecordToMap(final GenericRecord 
avroRecord, final RecordSchema recordSchema, final Charset charset) {
final Map<String, Object> values = new HashMap<>(recordSchema.getFieldCount());
for (final RecordField recordField : recordSchema.getFields()) {
  Object value = avroRecord.get(recordField.getFieldName());
  if (value == null) {
     for (final String alias : recordField.getAliases()) {
        value = avroRecord.get(alias);  <<<<<< exception occurs <<<<<<
        if (value != null) { break; }
   }
 }
{noformat}
 

 

*GenericData$Record.get(String key):*

*1.10.0*

https://github.com/apache/avro/blob/release-1.10.0/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java

 public Object get(String key) {
{noformat}
 Field field = schema.getField(key);
 if (field == null) {
 throw new AvroRuntimeException("Not a valid schema field: " + key);
 }
 return values[field.pos()];
 }{noformat}

  

 

*Avro 1.9.2*

https://github.com/apache/avro/blob/release-1.9.2/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java

 
{noformat}
public Object get(String key) {
 Field field = schema.getField(key);
 if (field == null)
 return null;
 return values[field.pos()];
 }{noformat}
 

 

My GroovyScriptProcessor:

 
{noformat}
import org.apache.nifi.serialization.record.Record
import org.apache.commons.io.IOUtils 

def flowFile = session.get()
if (!flowFile) {   log.warn("No flowFile")} 

InputStream inputStream
def reader 

try {   
  inputStream = session.read(flowFile)   
  reader = RecordReader.reader.createRecordReader(flowFile, inputStream, 
log){noformat}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to