Hi, I've migrated a flow from HDFS to MinIO as the storage layer. Basically, I removed the PutHDFS processors and now rely solely on PutS3Object. After writing to MinIO, I got stuck at PutHiveQL: to make it work, I had to set up a HiveConnectionPool, using the provided hive-site.xml so that NiFi is aware of our Hive metastore.
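For reference, a minimal sketch of what that hive-site.xml contains as far as NiFi is concerned. The metastore host and port below are placeholders, not our real values, and /tmp/conf just stands in for the location I copied the file to:

```shell
# Sketch only: write a minimal hive-site.xml like the one the
# HiveConnectionPool reads. metastore-host:9083 is a placeholder,
# and /tmp/conf stands in for the real deployment path.
mkdir -p /tmp/conf
cat > /tmp/conf/hive-site.xml <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
EOF
# Confirm the property landed in the file:
grep 'hive.metastore.uris' /tmp/conf/hive-site.xml
```

The HiveConnectionPool's "Hive Configuration Resources" property then points at that file.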
Just so you know, I started a NiFi cluster somewhere else and configured Git as the flow provider, so I edited the "old" flow inside the "new" NiFi deployment. When deploying NiFi, I manually copied hive-site.xml to a known location, and that part apparently works fine. Even though there is no PutHDFS in the "new" NiFi flow, I'm getting this error from the PutHiveQL processor (the exception message is printed several times in the log; I've kept a single copy and broken the stack trace into lines):

2020-01-23 16:46:09,402 ERROR [Timer-Driven Process Thread-7] o.apache.nifi.processors.hive.PutHiveQL PutHiveQL[id=bff6add0-cdbc-3c2f-b79f-b3a2438139c1] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: Failed to process StandardFlowFileRecord[uuid=3a089972-3b10-46bc-a880-bbf9cb322a6b,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1579621741830-937, container=default, section=937], offset=852885, length=176],offset=0,name=3441009295635454,size=176] due to java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.lambda$createOnGroupError$2(ExceptionHandler.java:226)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.lambda$createOnError$1(ExceptionHandler.java:179)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:148)
    at org.apache.nifi.processors.hive.PutHiveQL.lambda$new$4(PutHiveQL.java:223)
    at org.apache.nifi.processor.util.pattern.Put.putFlowFiles(Put.java:59)
    at org.apache.nifi.processor.util.pattern.Put.onTrigger(Put.java:102)
    at org.apache.nifi.processors.hive.PutHiveQL.lambda$onTrigger$6(PutHiveQL.java:289)
    at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
    at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
    at org.apache.nifi.processors.hive.PutHiveQL.onTrigger(PutHiveQL.java:289)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found)
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
    at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:98)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
    at org.apache.nifi.processors.hive.PutHiveQL.lambda$null$3(PutHiveQL.java:251)
    at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
    ... 17 common frames omitted

Somewhere on the Internet I read about some jars, so I copied the required ones to /usr/lib/hdinsight-datalake. Since none of my processors reference these classes directly, I suspect that step wasn't actually necessary; in any case, the flow is still not working. The jars I copied are:

- adls2-oauth2-token-provider-1.0.jar
- hadoop-azure-datalake-2.7.3.2.6.5.10-2.jar
- okhttp-2.7.5.jar
- azure-data-lake-store-sdk-2.2.5.jar
- jackson-core-2.7.8.jar
- okio-1.6.0.jar

Any help or insight is really appreciated; I'm just starting my NiFi journey.

Best,
Juan A. Fabián Simón
Data Engineer
Alstom
Calle Martínez Villergas 49, ed. V - 28027 Madrid - Spain
Office: +34 91 384 89 00
Email: [email protected]
www.alstom.com
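PS: In case it's useful, this is how I've been checking which of the copied jars, if any, actually contains the class the error complains about (/usr/lib/hdinsight-datalake is just where I happened to copy them; adjust the path to your setup):

```shell
# Scan the copied jars for the class named in the ClassNotFoundException.
# /usr/lib/hdinsight-datalake is the directory I copied the jars into.
for j in /usr/lib/hdinsight-datalake/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/fs/adl/AdlFileSystem.class'; then
    echo "found in $j"
  fi
done
```

On my machine this reports the hadoop-azure-datalake jar, which suggests the jar itself is fine and the problem is that whatever resolves the table location never sees it on its classpath.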
