Hi Mandeep,

Thanks for reporting this issue! Koji filed the JIRA [1] and submitted a PR
for it [2]. I just merged it into master and it will be released with NiFi
1.9.0. You can also build the standard processors NAR from the master
branch if you need the fix quickly.

[1] https://issues.apache.org/jira/browse/NIFI-5802
[2] https://github.com/apache/nifi/pull/3158

Pierre

On Wed, 7 Nov 2018 at 12:54, Mandeep Gill <mand...@nstack.com> wrote:

> Hi,
>
> We're hitting a couple of issues with nulls when using QueryRecord on
> both NiFi 1.7.1 and 1.8.0.
>
> Things work as expected for strings. However, for other primitive types
> defined by the Avro schema, such as boolean, long, and double, null values
> in the input data aren't converted to NULLs within the SQL engine
> (Calcite). Instead, they appear to remain Java null values and throw NPEs
> when we attempt to use them within a query or simply return them as
> output.
>
> To give some examples, given the following record data and schema (tested
> using both JSON and Avro record reader/writers)
>
> [ {  "str_test" : "hello1",  "bool_test" : true }, {  "str_test" : null,  
> "bool_test" : null } ]
>
> {
>   "type": "record",
>   "name": "schema",
>   "fields": [
>     {
>       "name": "str_test",
>       "type": [ "string", "null" ],
>       "default": null
>     },
>     {
>       "name": "bool_test",
>       "type": [ "boolean", "null" ],
>       "default": null
>     }
>   ]
> }
>
> The following queries return the empty resultset,
>
> select 'res' as res from FLOWFILE where bool_test IS NULL
> select 'res' as res from FLOWFILE where bool_test IS UNKNOWN
>
> and the query below returns a resultset of count 2,
>
> select 'res' from FLOWFILE where bool_test IS NOT NULL
>
> The query below works as expected, suggesting things work fine for strings
>
> select 'res' as res from FLOWFILE where str_test IS NULL
>
> Finally, the following query throws a NullPointerException (see [1]) when
> the output writer tries to convert the null to a boolean:
>
> select * from FLOWFILE where bool_test IS NOT NULL
>
> The null values for these types seem to be treated as distinct from the
> NULLs within the SQL engine, as the following query returns an empty
> result set:
>
> select 'res' as res from FLOWFILE where CAST(NULL as boolean) IS DISTINCT 
> FROM bool_test
>
> and the following query gives a RuntimeException (see [2]):
>
> select (COALESCE(bool_test, TRUE)) as res from flowfile
>
> Given all this, we're unable to make use of datasets with nulls. Are nulls
> only supported for strings, or is there perhaps something we're doing wrong
> in our setup/config? One thing we've noticed is that a simple
> "SELECT * from FLOWFILE" returns a nullable type for strings in the output
> Avro schema but not for other primitives, even if they were nullable in the
> input schema, which could be related.
>
> Cheers,
> Mandeep
>
> [1] org.apache.nifi.processor.exception.ProcessException: IOException
> thrown from QueryRecord[id=43ee29ff-0166-1000-28bd-06dd07c1425d]:
> java.io.IOException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> java.lang.NullPointerException: null of boolean in field bool_test of
> org.apache.nifi.nifiRecord
> at
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2667)
> at
> org.apache.nifi.processors.standard.QueryRecord.onTrigger(QueryRecord.java:309)
> at
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
> at
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
> at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> java.lang.NullPointerException: null of boolean in field bool_test of
> org.apache.nifi.nifiRecord
> at
> org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:327)
> at
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2648)
> ... 12 common frames omitted
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> java.lang.NullPointerException: null of boolean in field bool_test of
> org.apache.nifi.nifiRecord
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
> at
> org.apache.nifi.avro.WriteAvroResultWithSchema.writeRecord(WriteAvroResultWithSchema.java:61)
> at
> org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:59)
> at
> org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:52)
> at
> org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:324)
> ... 13 common frames omitted
> Caused by: java.lang.NullPointerException: null of boolean in field
> bool_test of org.apache.nifi.nifiRecord
> at
> org.apache.avro.generic.GenericDatumWriter.npe(GenericDatumWriter.java:132)
> at
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:126)
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
> ... 17 common frames omitted
> Caused by: java.lang.NullPointerException: null
> at
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:121)
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at
> org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> at
> org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> ... 20 common frames omitted
>
>
> [2] org.apache.nifi.processor.exception.ProcessException: IOException
> thrown from QueryRecord[id=43ee29ff-0166-1000-28bd-06dd07c1425d]:
> java.io.IOException: java.lang.RuntimeException: Cannot convert null to
> boolean
> at
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2667)
> at
> org.apache.nifi.processors.standard.QueryRecord.onTrigger(QueryRecord.java:309)
> at
> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at
> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
> at
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
> at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cannot convert
> null to boolean
> at
> org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:327)
> at
> org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2648)
> ... 12 common frames omitted
> Caused by: java.lang.RuntimeException: Cannot convert null to boolean
> at
> org.apache.calcite.runtime.SqlFunctions.cannotConvert(SqlFunctions.java:1460)
> at
> org.apache.calcite.runtime.SqlFunctions.toBoolean(SqlFunctions.java:1483)
> at Baz$1$1.current(Unknown Source)
> at
> org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:684)
> at
> org.apache.calcite.avatica.util.IteratorCursor.next(IteratorCursor.java:46)
> at
> org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217)
> at
> org.apache.nifi.serialization.record.ResultSetRecordSet.next(ResultSetRecordSet.java:84)
> at
> org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:51)
> at
> org.apache.nifi.processors.standard.QueryRecord$1.process(QueryRecord.java:324)
> ... 13 common frames omitted
>
> --
>
> Mandeep Gill
>
> nstack.com <http://www.nstack.com/> / +44 7961822575
>
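
For reference, the standard SQL NULL behaviour that the queries in the
report were expected to show can be sketched outside NiFi. This is only an
illustration under stated assumptions: it uses Python's built-in sqlite3
module as a stand-in engine (QueryRecord itself runs on Calcite), and the
table layout, the count_where helper, and the variable names are made up
for the example.

```python
import sqlite3

# Assumption: SQLite stands in for the SQL engine; NiFi's QueryRecord uses Calcite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flowfile (str_test TEXT, bool_test BOOLEAN)")

# The same two records as in the report; Python None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO flowfile VALUES (?, ?)",
    [("hello1", True), (None, None)],
)


def count_where(predicate):
    """Hypothetical helper: count the rows matching a WHERE predicate."""
    sql = f"SELECT COUNT(*) FROM flowfile WHERE {predicate}"
    return conn.execute(sql).fetchone()[0]


# With proper NULL handling, each predicate matches exactly one record.
null_rows = count_where("bool_test IS NULL")
not_null_rows = count_where("bool_test IS NOT NULL")

# COALESCE substitutes the default for the NULL row instead of raising.
# SQLite has no separate boolean storage class, so TRUE is written as 1.
coalesced = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(bool_test, 1) FROM flowfile ORDER BY rowid"
    )
]

print(null_rows, not_null_rows, coalesced)
```

In this sketch, IS NULL and IS NOT NULL each match one of the two records,
whereas the report describes IS NULL matching none and IS NOT NULL matching
both.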
