[ https://issues.apache.org/jira/browse/NIFI-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389664#comment-15389664 ]

ASF GitHub Bot commented on NIFI-1868:
--------------------------------------

GitHub user mattyb149 opened a pull request:

    https://github.com/apache/nifi/pull/706

    NIFI-1868: Add PutHiveStreaming processor

    The second commit (temporarily) removes ConvertAvroToORC (and all ORC 
references). The Hive processors (for 1.0) must work with Hive 1.2.1, which 
predates the split of ORC into its own Apache project. In order for 
PutHiveStreaming (and the rest of the bundle) to compile against Hive 1.2.1, 
the Hive version was downgraded and all ORC references were removed.
    
    NIFI-1663 was reopened to refactor ConvertAvroToORC to use hive-orc in Hive 
1.2.1 rather than Apache ORC. That Jira will restore the ConvertAvroToORC 
processor.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattyb149/nifi NIFI-1868_1.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/706.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #706
    
----
commit 21ff63b29594e32361ad5ceb28885263f75957c8
Author: Matt Burgess <[email protected]>
Date:   2016-07-21T15:59:41Z

    NIFI-1868: Add PutHiveStreaming processor

commit 4328d4dea1a81f342fa6e521235621ba526c7301
Author: Matt Burgess <[email protected]>
Date:   2016-07-22T15:14:16Z

    NIFI-1868: Downgrade to Hive 1.2.1 and remove ConvertAvroToORC

----


> Add support for Hive Streaming
> ------------------------------
>
>                 Key: NIFI-1868
>                 URL: https://issues.apache.org/jira/browse/NIFI-1868
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>             Fix For: 1.0.0
>
>
> Traditionally, adding new data to Hive requires gathering a large amount of 
> data onto HDFS and then periodically adding a new partition. This is 
> essentially a “batch insertion”; inserting new data into an existing 
> partition is not permitted. The Hive Streaming API allows data to be pumped 
> continuously into Hive: incoming records are committed in small batches into 
> an existing Hive partition or table, and once committed they become 
> immediately visible to all subsequently initiated Hive queries.
> This case is to add a PutHiveStreaming processor to NiFi that leverages the 
> Hive Streaming API to allow continuous streaming of data into a Hive 
> partition/table.
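For reference, the Hive Streaming API (org.apache.hive.hcatalog.streaming in Hive 1.2.1) follows an endpoint → connection → transaction-batch pattern, which is the write path a processor like PutHiveStreaming would drive. A minimal sketch, assuming a reachable metastore; the URI, database, table, partition value, and field names below are illustrative placeholders, not taken from the PR:

```java
import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class HiveStreamingSketch {
    public static void main(String[] args) throws Exception {
        // Endpoint: metastore URI, database, table, and partition values
        // (all values here are hypothetical examples).
        HiveEndPoint endPoint = new HiveEndPoint(
                "thrift://metastore-host:9083", "default", "alerts",
                Arrays.asList("2016"));

        // Open a connection; 'true' creates the partition if it does not exist.
        StreamingConnection connection = endPoint.newConnection(true);
        try {
            // Writer that maps delimited records onto the table's columns
            // (column names here are hypothetical).
            DelimitedInputWriter writer = new DelimitedInputWriter(
                    new String[] {"id", "msg"}, ",", endPoint);

            // Fetch a batch of transactions; each committed batch of records
            // becomes immediately visible to subsequently initiated queries.
            TransactionBatch txnBatch = connection.fetchTransactionBatch(10, writer);
            txnBatch.beginNextTransaction();
            txnBatch.write("1,hello".getBytes());
            txnBatch.write("2,world".getBytes());
            txnBatch.commit();
            txnBatch.close();
        } finally {
            connection.close();
        }
    }
}
```

Note that Hive Streaming imposes requirements on the target table: it must be stored as ORC, bucketed, and have transactions (ACID) enabled on the cluster.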



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
