[
https://issues.apache.org/jira/browse/NIFI-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404509#comment-15404509
]
ASF GitHub Bot commented on NIFI-1868:
--------------------------------------
Github user bbende commented on the issue:
https://github.com/apache/nifi/pull/706
Latest update is looking good... one thing I noticed: if you send in an
Avro file that does not have the partition columns of the table, an
IOException is thrown around line 435 when trying to extract the partition
fields from the Avro schema. It then gets wrapped in a ProcessException and
thrown out of onTrigger, so the flow file sits in the incoming queue but can
never be processed.
Could we look for a ProcessException with a cause of IOException and route to
failure (similar to the connection error handling)? Or maybe create a specific
exception type to look for, since there could be other IOExceptions that we
want to bounce out of onTrigger?
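To illustrate the first suggestion, here is a minimal, self-contained sketch of unwrapping the cause and routing to failure. The ProcessException below is a stand-in for NiFi's org.apache.nifi.processor.exception.ProcessException, and the returned string stands in for what would really be session.transfer(flowFile, REL_FAILURE) inside onTrigger; both are assumptions for the sake of a runnable example, not the actual PR code.

```java
import java.io.IOException;

public class CauseRoutingSketch {
    // Minimal stand-in for NiFi's ProcessException (a RuntimeException).
    static class ProcessException extends RuntimeException {
        ProcessException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Returns "failure" when the ProcessException wraps an IOException
    // (e.g. partition fields missing from the Avro schema); otherwise
    // rethrows so other errors still bounce out of onTrigger as before.
    static String routeOrRethrow(ProcessException e) {
        if (e.getCause() instanceof IOException) {
            // In the processor this would be:
            // session.transfer(flowFile, REL_FAILURE);
            return "failure";
        }
        throw e;
    }

    public static void main(String[] args) {
        ProcessException ioWrapped = new ProcessException(
                "could not extract partition fields from Avro schema",
                new IOException("partition columns not found"));
        System.out.println(routeOrRethrow(ioWrapped)); // prints "failure"
    }
}
```

The second suggestion (a dedicated exception type) would avoid accidentally catching unrelated IOExceptions; the instanceof check above is the simpler but coarser option.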
> Add support for Hive Streaming
> ------------------------------
>
> Key: NIFI-1868
> URL: https://issues.apache.org/jira/browse/NIFI-1868
> Project: Apache NiFi
> Issue Type: New Feature
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Fix For: 1.0.0
>
>
> Traditionally adding new data into Hive requires gathering a large amount of
> data onto HDFS and then periodically adding a new partition. This is
> essentially a “batch insertion”. Insertion of new data into an existing
> partition is not permitted. The Hive Streaming API allows data to be pumped
> continuously into Hive. The incoming data can be continuously committed in
> small batches of records into an existing Hive partition or table. Once data
> is committed it becomes immediately visible to all Hive queries initiated
> subsequently.
> This case is to add a PutHiveStreaming processor to NiFi, to leverage the
> Hive Streaming API to allow continuous streaming of data into a Hive
> partition/table.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)