Thanks a lot for your help!

So I'll dig a bit on the Hive side, and I'll take a look at how to compile that 
NAR. I also wasn't aware that the 1.11.0 release includes all this stuff, which 
is great.


________________________________
From: Shawn Weeks <[email protected]>
Sent: Thursday, January 23, 2020 23:46
To: [email protected] <[email protected]>
Subject: Re: ClassNotFound exception


PutHiveStreaming is really the only thing that should cause this, because 
you're bypassing Hive and writing directly to the file system. The JDBC driver 
itself isn't supposed to have external dependencies beyond the basic Hive ones. 
I've used the default Hive processors to connect to AWS EMR Hive, EMR Spark, 
and AWS Athena without any additional JARs on 1.10.0.



Thanks

Shawn



From: Matt Burgess <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, January 23, 2020 at 4:44 PM
To: "[email protected]" <[email protected]>
Subject: Re: ClassNotFound exception



That's a good point, Shawn. I'd seen similar issues (which is where the Jira 
came from) for PutHive3Streaming, which doesn't use the JDBC driver. Juan, you 
might want to send this to the Hive users list as well; perhaps they have more 
insight into why filesystem-specific work is happening on the Hive JDBC 
client side.



Thanks,

Matt



On Thu, Jan 23, 2020 at 5:39 PM Shawn Weeks 
<[email protected]<mailto:[email protected]>> wrote:

I'm pretty sure that exception is coming from Hive and not NiFi. I'm really 
struggling to see why the Hive JDBC driver needs to understand storage when 
it's just sending Thrift messages to HiveServer2. Are you able to run these 
queries through Beeline?
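
For example, a quick Beeline check might look like this (the host, port, database, and statement below are placeholders; substitute your HiveServer2 connection details and the exact HiveQL that the PutHiveQL processor is sending):

```shell
# Run the same statement PutHiveQL executes, but directly through Beeline.
# If this also fails with the AdlFileSystem ClassNotFoundException, the
# problem is on the Hive/cluster side rather than in NiFi.
beeline -u "jdbc:hive2://hiveserver2-host:10000/default" \
        -e "INSERT INTO target_table SELECT * FROM staging_table"
```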



Thanks



From: Matt Burgess <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Thursday, January 23, 2020 at 2:33 PM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: Re: ClassNotFound exception



Juan,



I'm not sure whether NIFI-6912 [1] will help, but the imminent 1.11.0 release 
will include extra JARs (ADLS, Azure, AWS, etc.) in the Hive 3 bundle. Since 
you're using the Hive 1 bundle, to include such dependencies you'd have to 
add them to nifi-hive-nar/pom.xml and build a custom NAR manually.
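
For illustration, the kind of addition to nifi-hive-nar/pom.xml would look roughly like this (the artifact and version here are assumptions; pick the ones matching the Hadoop libraries on your cluster):

```xml
<!-- Hypothetical extra dependency for the Hive 1 NAR; version must match
     the Hadoop distribution the cluster runs. -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-azure-datalake</artifactId>
    <version>2.7.3</version>
</dependency>
```

followed by a Maven build of the bundle (e.g. `mvn clean install` from the Hive bundle directory) and dropping the rebuilt NAR into NiFi's lib directory in place of the stock one.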



Regards,

Matt



[1] https://issues.apache.org/jira/browse/NIFI-6912





On Thu, Jan 23, 2020 at 11:58 AM FABIAN Juan-antonio 
<[email protected]<mailto:[email protected]>>
 wrote:

Hi,



I've migrated a flow from HDFS to MinIO as the storage layer. Basically, I 
deleted the PutHDFS processors, and now I rely only on PutS3Object. After 
writing to MinIO, I got stuck with PutHiveQL. To make it work, I had to set up 
a HiveConnectionPool. I'm using the provided hive-site.xml so that NiFi is 
aware of our Hive metastore.



Just so you know, I started a NiFi cluster somewhere else and configured Git 
as the flow provider, so I edited the "old" flow inside the "new" NiFi 
deployment. When deploying NiFi, I manually copied hive-site.xml to a known 
location, and it's apparently working fine.



Even though there's no PutHDFS in the "new" NiFi flow, I'm getting this error 
from the PutHiveQL processor:



2020-01-23 16:46:09,402 ERROR [Timer-Driven Process Thread-7] o.apache.nifi.processors.hive.PutHiveQL PutHiveQL[id=bff6add0-cdbc-3c2f-b79f-b3a2438139c1] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: Failed to process StandardFlowFileRecord[uuid=3a089972-3b10-46bc-a880-bbf9cb322a6b,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1579621741830-937, container=default, section=937], offset=852885, length=176],offset=0,name=3441009295635454,size=176] due to java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found);: org.apache.nifi.processor.exception.ProcessException: Failed to process StandardFlowFileRecord[uuid=3a089972-3b10-46bc-a880-bbf9cb322a6b,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1579621741830-937, container=default, section=937], offset=852885, length=176],offset=0,name=3441009295635454,size=176] due to java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found);
org.apache.nifi.processor.exception.ProcessException: Failed to process StandardFlowFileRecord[uuid=3a089972-3b10-46bc-a880-bbf9cb322a6b,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1579621741830-937, container=default, section=937], offset=852885, length=176],offset=0,name=3441009295635454,size=176] due to java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found);
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.lambda$createOnGroupError$2(ExceptionHandler.java:226)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.lambda$createOnError$1(ExceptionHandler.java:179)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler$OnError.lambda$andThen$0(ExceptionHandler.java:54)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:148)
	at org.apache.nifi.processors.hive.PutHiveQL.lambda$new$4(PutHiveQL.java:223)
	at org.apache.nifi.processor.util.pattern.Put.putFlowFiles(Put.java:59)
	at org.apache.nifi.processor.util.pattern.Put.onTrigger(Put.java:102)
	at org.apache.nifi.processors.hive.PutHiveQL.lambda$onTrigger$6(PutHiveQL.java:289)
	at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
	at org.apache.nifi.processor.util.pattern.RollbackOnFailure.onTrigger(RollbackOnFailure.java:184)
	at org.apache.nifi.processors.hive.PutHiveQL.onTrigger(PutHiveQL.java:289)
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
	at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.AdlFileSystem not found);
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
	at org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:98)
	at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
	at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
	at org.apache.nifi.processors.hive.PutHiveQL.lambda$null$3(PutHiveQL.java:251)
	at org.apache.nifi.processor.util.pattern.ExceptionHandler.execute(ExceptionHandler.java:127)
	... 17 common frames omitted



Somewhere on the Internet I read about some JARs, so I copied the required 
JARs to /usr/lib/hdinsight-datalake. Since none of my processors reference 
them directly, I don't think that was necessary. In any case, the flow is 
still not working. The JARs I copied are:



adls2-oauth2-token-provider-1.0.jar
hadoop-azure-datalake-2.7.3.2.6.5.10-2.jar
okhttp-2.7.5.jar
azure-data-lake-store-sdk-2.2.5.jar
jackson-core-2.7.8.jar
okio-1.6.0.jar
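
For context, the missing class (org.apache.hadoop.fs.adl.AdlFileSystem) ships in hadoop-azure-datalake, and Hadoop typically maps the adl:// scheme to it via a core-site.xml entry along these lines (a sketch; the exact configuration depends on your distribution, and the JAR must be on the classpath of whichever JVM resolves the ADL path, which here appears to be the Hive/Spark side rather than NiFi):

```xml
<!-- Hypothetical core-site.xml mapping for the adl:// filesystem scheme. -->
<property>
  <name>fs.adl.impl</name>
  <value>org.apache.hadoop.fs.adl.AdlFileSystem</value>
</property>
```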



Any help or insight is really appreciated; I'm just starting my NiFi journey.



Best,





Juan A. Fabián Simón

Data Engineer

Alstom

Calle Martínez Villergas 49, ed. V - 28027 Madrid - Spain

Office: +34 91 384 89 00

Email: 
[email protected]<mailto:[email protected]>

www.alstom.com<http://www.alstom.com>






________________________________

CONFIDENTIALITY : This e-mail and any attachments are confidential and may be 
privileged. If you are not a named recipient, please notify the sender 
immediately and do not disclose the contents to another person, use it for any 
purpose or store or copy the information in any medium.


