[ https://issues.apache.org/jira/browse/NIFI-15548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18056919#comment-18056919 ]

ASF subversion and git services commented on NIFI-15548:
--------------------------------------------------------

Commit 9652fd2e3699b772644adc09b2cdde1006f457fd in nifi's branch refs/heads/main from Pierre Villard
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=9652fd2e36 ]

NIFI-15548 Fixed ParquetReader ClassCastException for java.time Logical Types (#10850)

Signed-off-by: David Handermann <[email protected]>
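
For context, the stack trace quoted below shows the ParquetReader handing a java.time.Instant for a timestamp_millis column to AvroTypeUtil.normalizeValue, which still expects the raw epoch-millis Long it used to receive. The snippet below is only a minimal sketch of that mismatch and of the kind of explicit conversion the fix implies; it is not the actual patch, and the class, method, and variable names are invented for illustration.

{code:java}
import java.time.Instant;

public class InstantToLongSketch {

    // Hypothetical stand-in for the normalization step that fails:
    // a timestamp_millis value now arrives as java.time.Instant, but the
    // downstream code expects the epoch-millis Long it received before 2.7.
    static long toEpochMillis(Object rawValue) {
        if (rawValue instanceof Instant instant) {
            return instant.toEpochMilli();   // convert explicitly instead of casting
        }
        return (Long) rawValue;              // pre-2.7 case: the value was already a Long
    }

    public static void main(String[] args) {
        Object fromReader = Instant.parse("2026-02-04T12:58:19.172Z");
        // A plain (Long) cast on fromReader reproduces the ClassCastException quoted below.
        System.out.println(toEpochMillis(fromReader));
    }
}
{code}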

> ParquetReader fails on timestamps
> ---------------------------------
>
>                 Key: NIFI-15548
>                 URL: https://issues.apache.org/jira/browse/NIFI-15548
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 2.7.0, 2.7.1, 2.7.2
>            Reporter: Sven Van Kerrebroeck
>            Assignee: Pierre Villard
>            Priority: Blocker
>         Attachments: image-2026-02-04-12-58-19-172.png, parquet_test.parquet
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> It seems that since version 2.7 the ParquetReader has had problems with timestamps in Parquet data.
> When reading a Parquet file with a timestamp column (timestamp_millis), an error is thrown:
> !image-2026-02-04-12-58-19-172.png!
>  
> {noformat}
> ConvertRecord[id=59968c39-4fb7-324a-b1b7-bd2e4b93ccb7] Failed to process FlowFile[filename=parquet_test.parquet]; will route to failure: java.lang.ClassCastException: class java.time.Instant cannot be cast to class java.lang.Long (java.time.Instant and java.lang.Long are in module java.base of loader 'bootstrap')
> java.lang.ClassCastException: class java.time.Instant cannot be cast to class java.lang.Long (java.time.Instant and java.lang.Long are in module java.base of loader 'bootstrap')
>     at org.apache.nifi.avro.AvroTypeUtil.normalizeValue(AvroTypeUtil.java:1175)
>     at org.apache.nifi.avro.AvroTypeUtil.lambda$normalizeValue$3(AvroTypeUtil.java:1186)
>     at org.apache.nifi.avro.AvroTypeUtil.convertUnionFieldValue(AvroTypeUtil.java:1016)
>     at org.apache.nifi.avro.AvroTypeUtil.normalizeValue(AvroTypeUtil.java:1186)
>     at org.apache.nifi.avro.AvroTypeUtil.convertAvroRecordToMap(AvroTypeUtil.java:979)
>     at org.apache.nifi.avro.AvroTypeUtil.convertAvroRecordToMap(AvroTypeUtil.java:943)
>     at org.apache.nifi.parquet.record.ParquetRecordReader.nextRecord(ParquetRecordReader.java:111)
>     at org.apache.nifi.serialization.RecordReader.nextRecord(RecordReader.java:50)
>     at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>     at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:251)
>     at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler$ProxiedReturnObjectInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:237)
>     at jdk.proxy196/jdk.proxy196.$Proxy501.nextRecord(Unknown Source)
>     at org.apache.nifi.processors.standard.AbstractRecordProcessor.lambda$onTrigger$0(AbstractRecordProcessor.java:132)
>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3410)
>     at org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:125)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1274)
>     at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:229)
>     at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
>     at org.apache.nifi.engine.FlowEngine.lambda$wrap$1(FlowEngine.java:105)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>     at java.base/java.lang.Thread.run(Thread.java:1583)
> {noformat}
> Steps to reproduce:
>  # Take the attached Parquet test file ( [^parquet_test.parquet] ) and let NiFi ingest it (via GetFile, GetHDFS, ...). This test file was actually produced by an ExecuteSQLRecord processor with the standard Parquet writer; see the sketch after this list for generating an equivalent file.
>  # After that, add for example a ConvertRecord processor with a +standard ParquetReader+ and any writer (for example a JSON writer; it doesn't really matter). Any other processor that uses a Parquet reader hits the same issue.
>  # Run the flow. An error like the one above will be shown.
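> A roughly equivalent test file can also be generated directly with the parquet-avro library if the attachment is not at hand. The sketch below is only an illustration: it assumes a single timestamp_millis column named "ts" (the actual column name in the attachment may differ), the AvroParquetWriter builder API, and hadoop-common on the classpath.
> {code:java}
> import org.apache.avro.LogicalTypes;
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaBuilder;
> import org.apache.avro.generic.GenericData;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.hadoop.fs.Path;
> import org.apache.parquet.avro.AvroParquetWriter;
> import org.apache.parquet.hadoop.ParquetWriter;
>
> public class WriteTimestampParquet {
>     public static void main(String[] args) throws Exception {
>         // One column annotated with the timestamp-millis logical type (column name "ts" is assumed).
>         Schema tsType = LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
>         Schema schema = SchemaBuilder.record("parquet_test").fields()
>                 .name("ts").type(tsType).noDefault()
>                 .endRecord();
>
>         GenericRecord record = new GenericData.Record(schema);
>         record.put("ts", System.currentTimeMillis()); // stored as epoch millis
>
>         try (ParquetWriter<GenericRecord> writer = AvroParquetWriter
>                 .<GenericRecord>builder(new Path("parquet_test.parquet"))
>                 .withSchema(schema)
>                 .build()) {
>             writer.write(record);
>         }
>     }
> }
> {code}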
> We still have an older NiFi 1.x environment (v1.23.2), and when running the exact same flow with the same file there, it works perfectly fine.
> I also found a similar mention of this problem on the Cloudera community:
> [https://community.cloudera.com/t5/Support-Questions/Nifi-2-7-x-seems-to-break-ParquetRecordReader-for-timestamps/m-p/413341]
> The person there confirms that it still worked as expected in v2.6.0, but not in 2.7.
> Reading Parquet data is pretty important, so this is a rather blocking issue.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
