[ https://issues.apache.org/jira/browse/NIFI-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Burgess updated NIFI-4545:
-------------------------------
Description:
Hive-related processors report NiFi provenance events with transit URLs in the
format 'jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName'. This URL format
can identify a Hive environment, but it is not descriptive enough to derive the
names of the tables that the query generating the provenance event reads from
or writes to.
That table information can only be obtained by parsing the query. This JIRA
improves the following Hive-related processors to write additional
'query.input.tables' and 'query.output.tables' FlowFile attributes by parsing
Hive queries with the Hive parser.
Target Processors:
* PutHiveQL
* SelectHiveQL
* PutHiveStreaming: This processor already knows the table name, so no query
parsing is needed.
was:
HBase related processors report NiFi provenance events with transit URLs in a
format as 'jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName'. The URL format
can identify a Hive environment, but not descriptive enough to derive actual
table names affecting or being affected by the query which generated the
provenance event.
Those table information can only be known by parsing query. This JIRA improves
following Hive related processors to write additional 'query.input.tables' and
'query.output.tables' FlowFile attributes by parsing Hive queries using Hive
parser.
Target Processors:
* PutHiveQL
* SelectHiveQL
* PutHiveStreaming: This processor knows a table name without the need of
parsing queries.
> Improve Hive processors provenance transit URL
> ----------------------------------------------
>
> Key: NIFI-4545
> URL: https://issues.apache.org/jira/browse/NIFI-4545
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Koji Kawamura
> Assignee: Koji Kawamura
> Priority: Major
>
> Hive-related processors report NiFi provenance events with transit URLs in
> the format 'jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName'. This URL
> format can identify a Hive environment, but it is not descriptive enough to
> derive the names of the tables that the query generating the provenance
> event reads from or writes to.
> That table information can only be obtained by parsing the query. This JIRA
> improves the following Hive-related processors to write additional
> 'query.input.tables' and 'query.output.tables' FlowFile attributes by parsing
> Hive queries with the Hive parser.
> Target Processors:
> * PutHiveQL
> * SelectHiveQL
> * PutHiveStreaming: This processor already knows the table name, so no
> query parsing is needed.
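
To illustrate the intent of the two attributes, here is a minimal sketch of
how input and output table names might be collected from a query and turned
into 'query.input.tables' and 'query.output.tables' values. It deliberately
does NOT use the Hive parser that this JIRA proposes; it swaps in naive regular
expressions so the example stays self-contained. The class name, the method
names, and the comma-separated value format are all assumptions made for
illustration and are not taken from the NiFi code base.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi or Hive.
final class HiveTableAttributesSketch {

    // Naive stand-in for the Hive parser: names after FROM or JOIN count as inputs.
    private static final Pattern INPUT = Pattern.compile(
            "\\b(?:FROM|JOIN)\\s+([\\w.]+)", Pattern.CASE_INSENSITIVE);

    // Names after INSERT INTO / INSERT OVERWRITE (optionally followed by TABLE) count as outputs.
    private static final Pattern OUTPUT = Pattern.compile(
            "\\bINSERT\\s+(?:INTO|OVERWRITE)\\s+(?:TABLE\\s+)?([\\w.]+)",
            Pattern.CASE_INSENSITIVE);

    // Collects every captured table name, de-duplicated, in order of appearance.
    static Set<String> find(Pattern pattern, String query) {
        Set<String> tables = new LinkedHashSet<>();
        Matcher matcher = pattern.matcher(query);
        while (matcher.find()) {
            tables.add(matcher.group(1));
        }
        return tables;
    }

    // Builds the attribute map; the comma separator is an assumption of this sketch.
    static Map<String, String> tableAttributes(String query) {
        Map<String, String> attributes = new LinkedHashMap<>();
        Set<String> inputs = find(INPUT, query);
        Set<String> outputs = find(OUTPUT, query);
        if (!inputs.isEmpty()) {
            attributes.put("query.input.tables", String.join(",", inputs));
        }
        if (!outputs.isEmpty()) {
            attributes.put("query.output.tables", String.join(",", outputs));
        }
        return attributes;
    }

    public static void main(String[] args) {
        String query = "INSERT INTO TABLE db1.target "
                + "SELECT a.id FROM db1.src_a a JOIN src_b b ON a.id = b.id";
        System.out.println(tableAttributes(query));
    }
}
```

A regex approach like this misses many cases (CTEs, quoted identifiers,
comments, CREATE TABLE AS), which is presumably why the description calls for
the Hive parser instead.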