Hi Matt,

The approach looks good. I would suggest applying a similar approach to all the Hadoop-related processors too, though I am not sure how hard it would be to get all the Hadoop-related connectors to use common LIBPATH and CONF_PATH variables.

Thanks,
Ram

-----Original Message-----
From: Matt Burgess [mailto:[email protected]]
Sent: Friday, October 07, 2016 10:56 PM
To: [email protected]
Subject: Re: SelectHiveQL Error

Another approach might be to allow the Hive processors to use a DBCPConnectionPool instead of requiring a HiveConnectionPool (which is a subclass). That would either involve moving the extra method (getConnectionUrl) to the DBCPService interface, or doing a check from the processors before calling getConnectionUrl() (which is used for provenance). With the former, HiveDBCPService would effectively just be a marker interface, the implementation HiveConnectionPool would remain as-is (to include a certain version of Hive, hardcode the driver name, etc. for ease of use).

Then if you wanted to Bring Your Own Hive, you could set up a normal DBCPConnectionPool, add the directory containing the Hive JARs, set the driver name to org.apache.hive.jdbc.HiveDriver, and then use that in the HiveQL processors. I'll give that a try shortly to see if it's a viable option (not sure if the Hive NAR would pollute the classloader for a DBCPConnectionPool instantiated from a HiveQL processor).

Being able to add additional driver JARs was added in NiFi 1.0.0 [1], and was done to support this kind of thing. However it can't be used out of the box for Hive because the SQL processors make JDBC API calls that the Hive JDBC driver doesn't support, and the HiveQL processors require a HiveConnectionPool. If we can kind of merge the two concepts (using HiveQL processors with DBCPConnectionPool services), we might be in good shape.

Thoughts?

Thanks,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-2604

On Fri, Oct 7, 2016 at 10:28 PM, Andy LoPresto <[email protected]> wrote:
> I don’t have all the background on this issue, but it might be
> something where the solution moving forward (until the Extension
> Registry is introduced) is to follow a similar path as the Kafka
> connectors, i.e.
> separate processors tied to each (incompatible) version of the library.
> Thoughts?
>
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>
> On Oct 7, 2016, at 6:26 PM, Nathamuni, Ramanujam <[email protected]>
> wrote:
>
> Andrew,
>
> I agree, but the reality is that moving an enterprise from one version
> to another is hard, and we also need provisions to support multiple
> versions of drivers; otherwise it will be a challenge. If we need to
> connect to multiple Hadoop clusters, they might be running different
> versions, and asking all those cluster owners to be on the same
> version is going to be a challenge.
>
> Just my opinion :-)
>
> Thanks,
> Ram
>
> ________________________________
> From: Andrew Grande
> Sent: Friday, October 07, 2016 6:14:07 PM
> To: [email protected]
> Subject: Re: SelectHiveQL Error
>
> I remember this error; it basically means your Hive is too old.
> There's no way to make a generic Hive client; a line has to be drawn
> somewhere. It's the same as, e.g., a car built for premium gas not
> working with regular.
>
> You need at least Hive 1.2.
>
> Andrew
>
> On Fri, Oct 7, 2016, 10:20 AM Nathamuni, Ramanujam
> <[email protected]> wrote:
>>
>> I have a similar client protocol issue. How can we make the Hive*
>> processors generic, so that users can point to a LIB directory
>> containing the JAR files for their Hadoop cluster?
>>
>> The SAS Hadoop Access connector uses the approach below, from their
>> Enterprise Guide:
>>
>> - Download the JAR files from the Hadoop cluster
>> - Download the config files from the Hadoop cluster
>>
>> Export two configuration variables:
>>
>> export HADOOP_LIB_PATH=/opt/cdh/5.7.1/lib/ (which will have all the jar files)
>> export HADOOP_CONFIG_PATH=/opt/cdh/5.7.1/conf/
>>
>> Can we have similar options on all the Hadoop-related processors?
>> That would make things work with all the different versions of Hadoop.
>>
>> Thanks,
>> Ram
>>
>> From: Dan Giannone [mailto:[email protected]]
>> Sent: Friday, October 07, 2016 9:49 AM
>> To: [email protected]
>> Subject: RE: SelectHiveQL Error
>>
>> It turns out the port needed to be changed for hive server2 as well.
>> That seemed to fix the below issue. However, now I get:
>>
>> > org.apache.thrift.TApplicationException: Required field
>> > 'client_protocol' is unset!
>>
>> Which according to this indicates my hive and hive-jdbc versions are
>> mismatching. “hive --version” gives me 1.1.0. If I were to download
>> the hive-jdbc 1.1.0 jar, is there a way I could specify that it use that?
>>
>> -Dan
>>
>> From: Dan Giannone [mailto:[email protected]]
>> Sent: Friday, October 07, 2016 9:25 AM
>> To: [email protected]
>> Subject: RE: SelectHiveQL Error
>>
>> Hi Matt,
>>
>> When I try to change to jdbc:hive2://, I get a different set of
>> errors.
>>
>> >Error getting Hive connection
>> >org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: jdbc:hive2://…)
>> >Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://…
>> >Caused by: org.apache.thrift.transport.TTransportException: null
>>
>> I am thinking you are right in that it is an issue with my connection URL.
>> Is there some command I can run that will generate this for me? Or a
>> specific place I should look? The only mention of a url in
>> hive-site.xml that I see is:
>>
>> <property>
>>   <name>hive.metastore.uris</name>
>>   <value>thrift://server:port</value>
>> </property>
>>
>> -Dan
>>
>> From: Matt Burgess [mailto:[email protected]]
>> Sent: Thursday, October 06, 2016 5:17 PM
>> To: [email protected]
>> Subject: Re: SelectHiveQL Error
>>
>> Andrew is correct.
>> Although the HiveServer 1 driver is included with the NAR, the
>> HiveConnectionPool is hardcoded to use the HiveServer 2 driver (since
>> the former doesn't allow for simultaneous connections and we are
>> using a connection pool :) the scheme should be jdbc:hive2:// not hive.
>>
>> If that was a typo and you are using the correct scheme, could you
>> provide your configuration details/properties?
>>
>> Thanks,
>> Matt
>>
>> On Oct 6, 2016, at 4:07 PM, Andrew Grande <[email protected]> wrote:
>>
>> Are you sure the jdbc url is correct? Iirc, it was jdbc:hive2://
>>
>> Andrew
>>
>> On Thu, Oct 6, 2016, 3:46 PM Dan Giannone <[email protected]> wrote:
>>
>> Hi Matt,
>>
>> Here is the whole error trace, starting from when I turned on the
>> SelectHiveQL processor:
>>
>> INFO [StandardProcessScheduler Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled SelectHiveQL[id=0157102a-94da-11ec-0f7e-17fd3119aa00] to run with 1 threads
>> 2016-10-06 15:37:06,554 INFO [Timer-Driven Process Thread-7] o.a.nifi.dbcp.hive.HiveConnectionPool HiveConnectionPool[id=0157102d-94da-11ec-4d91-5a8952e888bd] Simple Authentication
>> 2016-10-06 15:37:06,556 ERROR [Timer-Driven Process Thread-7] o.a.nifi.dbcp.hive.HiveConnectionPool HiveConnectionPool[id=0157102d-94da-11ec-4d91-5a8952e888bd] Error getting Hive connection
>> 2016-10-06 15:37:06,557 ERROR [Timer-Driven Process Thread-7] o.a.nifi.dbcp.hive.HiveConnectionPool
>> org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL 'jdbc:hive://server:port/default'
>>     at org.apache.commons.dbcp.BasicDataSource.createConnectionFactory(BasicDataSource.java:1452) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1371) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:269) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_45]
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_45]
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_45]
>>     at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_45]
>>     at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at com.sun.proxy.$Proxy81.getConnection(Unknown Source) [na:na]
>>     at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:158) [nifi-hive-processors-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>> Caused by: java.sql.SQLException: No suitable driver
>>     at java.sql.DriverManager.getDriver(DriverManager.java:315) ~[na:1.8.0_45]
>>     at org.apache.commons.dbcp.BasicDataSource.createConnectionFactory(BasicDataSource.java:1437) ~[commons-dbcp-1.4.jar:1.4]
>>     ... 22 common frames omitted
>> 2016-10-06 15:37:06,557 ERROR [Timer-Driven Process Thread-7] o.a.nifi.processors.hive.SelectHiveQL SelectHiveQL[id=0157102a-94da-11ec-0f7e-17fd3119aa00] Unable to execute HiveQL select query select * from dan_test due to org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL 'jdbc:hive://server:port/default'. No FlowFile to route to failure: org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL 'jdbc:hive://server:port/default'
>> 2016-10-06 15:37:06,558 ERROR [Timer-Driven Process Thread-7] o.a.nifi.processors.hive.SelectHiveQL
>> org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL 'jdbc:hive://server:port/default'
>>     at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:273) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_45]
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_45]
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_45]
>>     at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_45]
>>     at org.apache.nifi.controller.service.StandardControllerServiceProvider$1.invoke(StandardControllerServiceProvider.java:177) ~[na:na]
>>     at com.sun.proxy.$Proxy81.getConnection(Unknown Source) ~[na:na]
>>     at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:158) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.0.0.jar:1.0.0]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>> Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL 'jdbc:hive://server:port/default'
>>     at org.apache.commons.dbcp.BasicDataSource.createConnectionFactory(BasicDataSource.java:1452) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1371) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044) ~[commons-dbcp-1.4.jar:1.4]
>>     at org.apache.nifi.dbcp.hive.HiveConnectionPool.getConnection(HiveConnectionPool.java:269) ~[nifi-hive-processors-1.0.0.jar:1.0.0]
>>     ... 19 common frames omitted
>> Caused by: java.sql.SQLException: No suitable driver
>>     at java.sql.DriverManager.getDriver(DriverManager.java:315) ~[na:1.8.0_45]
>>     at org.apache.commons.dbcp.BasicDataSource.createConnectionFactory(BasicDataSource.java:1437) ~[commons-dbcp-1.4.jar:1.4]
>>     ... 22 common frames omitted
>>
>> Sorry about the formatting. Any ideas? Is there some way I can edit
>> the Hive Nar file?
>>
>> -Dan
>>
>> -----Original Message-----
>> From: Matt Burgess [mailto:[email protected]]
>> Sent: Thursday, October 06, 2016 3:24 PM
>> To: [email protected]
>> Subject: Re: SelectHiveQL Error
>>
>> Dan,
>>
>> That is a catch-all error returned when (in case probably) something
>> is misconfigured. Are there more error lines below that in the log?
>> The driver class and all its dependencies are present in the Hive
>> NAR, so there is likely an underlying error that, while being
>> propagated up, returns the generic (and misleading) error message you
>> describe.
>>
>> Regards,
>> Matt
>>
>> On Thu, Oct 6, 2016 at 3:14 PM, Dan Giannone <[email protected]> wrote:
>> > Hello,
>> >
>> > I am trying to use a SelectHiveQL processor using the following
>> > controller configuration:
>> >
>> > Database Connection URL - jdbc:hive://server:port/default
>> >
>> > Hive Configuration Resources - /path/to/hive/hive-site.xml
>> >
>> > When my processor goes to run the simple query, I get the following
>> > error (copying the important parts to save space):
>> >
>> >>Unable to execute HiveQL select query select * from table
>> >
>> >>Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create JDBC driver of class 'org.apache.hive.jdbc.HiveDriver' for connect URL ‘<see above>’
>> >
>> >>Caused by: java.sql.SQLException: No suitable driver
>> >
>> > I assume I am missing something in hive-site.xml, or a jar file?
>> > Any insight would be appreciated.
>> >
>> > Thanks,
>> >
>> > Dan
>> >
>> > The information transmitted is intended only for the person or
>> > entity to which it is addressed and may contain CONFIDENTIAL
>> > material. If you receive this material/information in error, please
>> > contact the sender and delete or destroy the material/information.
>>
>> *************************************************************************
>> This e-mail may contain confidential or privileged information.
>> If you are not the intended recipient, please notify the sender
>> immediately and then delete it.
>>
>> TIAA
>> *************************************************************************
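
Note for readers who arrive here from the "No suitable driver" error: the diagnosis in the thread came down to the URL scheme (jdbc:hive2:// rather than jdbc:hive://). The sketch below is only an illustration of that check, not NiFi or Hive code. The function name check_hive_url and its messages are hypothetical, and the example host and port are placeholders.

```python
from typing import Optional

# Scheme the thread says the HiveConnectionPool's HiveServer 2 driver expects.
HIVE2_PREFIX = "jdbc:hive2://"
# Older HiveServer 1 scheme that produced "No suitable driver" in the thread.
HIVE1_PREFIX = "jdbc:hive://"


def check_hive_url(url: str) -> Optional[str]:
    """Return a hint if the URL scheme looks wrong, or None if it looks fine.

    Hypothetical helper: it only inspects the prefix and never opens a
    connection, so a None result does not prove the URL is reachable.
    """
    candidate = url.strip()
    if candidate.startswith(HIVE2_PREFIX):
        return None
    if candidate.startswith(HIVE1_PREFIX):
        # Swap only the scheme and keep host, port and database untouched.
        fixed = HIVE2_PREFIX + candidate[len(HIVE1_PREFIX):]
        return f"HiveServer 1 scheme found; try {fixed}"
    return f"URL does not start with {HIVE2_PREFIX}"


if __name__ == "__main__":
    print(check_hive_url("jdbc:hive://server:port/default"))
```

A correct scheme is necessary but not sufficient: as Dan's later messages show, the port and the hive-jdbc version still have to match the server.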

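
Ram's suggestion of pointing processors at a library directory and a config directory can be pictured roughly as follows. This is a sketch under stated assumptions, not how NiFi loads classes: the variable names HADOOP_LIB_PATH and HADOOP_CONFIG_PATH are taken from his example, while the function collect_hadoop_resources and the choice to match *.jar and *.xml files are made up for illustration.

```python
import os
from pathlib import Path
from typing import List, Mapping, Tuple


def collect_hadoop_resources(
    env: Mapping[str, str] = os.environ,
) -> Tuple[List[Path], List[Path]]:
    """List the JARs and XML config files named by the two variables.

    Hypothetical helper. Raises KeyError if either variable is missing,
    which is intended here: a missing path should fail loudly.
    """
    lib_dir = Path(env["HADOOP_LIB_PATH"])
    conf_dir = Path(env["HADOOP_CONFIG_PATH"])
    # Sorted so the resulting order is stable across runs and platforms.
    jars = sorted(lib_dir.glob("*.jar"))
    configs = sorted(conf_dir.glob("*.xml"))
    return jars, configs
```

A processor or controller service could then hand the first list to a driver classloader and the second to its configuration loader. Whether that isolates versions cleanly is exactly the classloader question Matt raises above.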