Matt Burgess created NIFI-4696:
----------------------------------

             Summary: Support concurrent tasks in PutHiveStreaming
                 Key: NIFI-4696
                 URL: https://issues.apache.org/jira/browse/NIFI-4696
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Matt Burgess


Currently PutHiveStreaming (PHS) supports only a single concurrent task. 
Before NIFI-4342, that meant each target table needed its own PHS instance, 
which is cumbersome with large numbers of tables. After NIFI-4342, 
Expression Language can be used for SDLC purposes (e.g., database/table names 
that differ between development and production).

However, it would be nice to support at least setting database/table names 
from flow file attributes, and also to support multiple tasks to handle them 
concurrently. Due to the nature of PHS and the Streaming Ingest APIs (and 
their implementation), it is likely not prudent to allow two tasks to write 
to the same table and partition at the same time.

I propose adding flow file attribute EL evaluation where prudent, and allowing 
per-table concurrency in PHS. A thread will attempt to get a lock on a table, 
and if it cannot, it will roll back and return.
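
As a rough illustration of the per-table locking idea, here is a minimal 
sketch in plain Java. It is not the PHS implementation: the class 
TableLockRegistry and its methods are hypothetical names, and it assumes a 
"database.table" string is a sufficient lock key. A task that fails to claim 
the table gets false back and would then roll back and return.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not part of NiFi): tracks tables currently being written. */
class TableLockRegistry {
    // Thread-safe set of "database.table" keys that some task currently owns.
    private final Set<String> busyTables = ConcurrentHashMap.newKeySet();

    private static String key(String database, String table) {
        return database + "." + table;
    }

    /** Non-blocking: returns true if this caller now owns the table, false if another task does. */
    boolean tryAcquire(String database, String table) {
        // Set.add is atomic here and returns false when the key is already present.
        return busyTables.add(key(database, table));
    }

    /** Releases the table so another task can claim it. */
    void release(String database, String table) {
        busyTables.remove(key(database, table));
    }

    /**
     * Runs the work only if the table is free. Returns false without running it
     * when the table is busy, so the caller can roll back and return.
     */
    boolean runExclusively(String database, String table, Runnable work) {
        if (!tryAcquire(database, table)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            release(database, table);
        }
    }
}
```

With this shape, tasks targeting different tables proceed in parallel, while a 
second task targeting a busy table backs off immediately instead of blocking.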



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
