Boris,

Historically the Hive JDBC drivers (at least around 1.2.x) did not
support some of the JDBC API methods called from ExecuteSQL/PutSQL,
namely setQueryTimeout(), executeBatch(), etc. Also the column names
are returned from the metadata with the table name prepended, so we'd
need special logic for the Hive case, and currently we don't have to
specify the database type in those processors. There have been some
Jiras/PRs around adding Hive support for the SQL processors, but they
haven't yet been incorporated. The Apache Hive driver still doesn't
support executeBatch(), so PutSQL and PutDatabaseRecord wouldn't be
able to call that. Again we'd need a Hive database adapter and would
ask the specified adapter if batching is supported, then go along some
other logic route if not. It should be doable, but just hasn't been
finished yet.

Regards,
Matt

On Thu, Aug 16, 2018 at 12:14 PM Boris Tyukin <[email protected]> wrote:
>
> awesome, passing property to impala connection string did the trick! thanks 
> Ed!
>
> Does it still make sense to open Jira to support multiple statements for 
> non-Hive processors?
>
> Totally separate subject, since you mentioned Hive - is there a reason why I 
> cannot connect to Hive and run basic SQL using ExecuteSQL/PutSQL? Not sure 
> why dedicated processors exist just for Hive. Does it not work over generic 
> DBCP connection pool? We've done some research yesterday when we were looking 
> for options on how to run multiple queries but because we are on CDH, Hive 
> jars included with NiFi do not work for us. Looks like we can compile CDH 
> specific NiFi but I am wondering why bother if we can just access Hive using 
> generic pool
>
> On Thu, Aug 16, 2018 at 11:42 AM Ed B <[email protected]> wrote:
>>
>> Hi Boris,
>> True, multi-statements support has been added to SelectHiveQL, but not to 
>> ExecuteSQL/PutSQL.
>> As a workaround you could try to specify config params on JDBC connection 
>> string when you define controller service:
>>
>> jdbc:impala://Host:Port[/Schema];Property1=Value;Property2=Value;...
>>
>> If you can use Hive driver to access Impala:
>>  
>> jdbc:hive2://host:port/;principal=principal_name;Property1=Value;Property2=Value;...
>>
>> Some params will be working after "?" in URL:
>>  
>> jdbc:hive2://host:port/;principal=principal_name?Property1=Value;Property2=Value;...
>>
>> But, if you really can use Hive drivers, then you can use PutHiveQL and 
>> ExecuteHiveQL. Both support multi-statements.
>>
>> Regards.
>> Ed.
>>
>> On Wed, Aug 15, 2018 at 11:11 AM Boris Tyukin <[email protected]> wrote:
>>>
>>> Hi guys,
>>>
>>> I need to issue a query like below on Impala. it works fine from 
>>> impala-shell but NiFi seems not to like multiple statements like that.
>>>
>>> set max_row_size=7mb;
>>>
>>> create table blabla as
>>> select blabla
>>> from blablabla;
>>>
>>>
>>> I thought it was addressed in 1.7 but I got it confused with Hive 
>>> processors and this Jira
>>> https://issues.apache.org/jira/browse/NIFI-5044
>>>
>>> Is there something in work already to address it or should I open a new 
>>> Jira?
>>>
>>> For now, I am going to use Groovy script but if anyone knows a better 
>>> workaround, please let me know.
>>>
>>> Boris

Reply via email to