[jira] [Assigned] (NIFI-7696) MultiQueryRecord Processor

Mahieddine Cherif (Jira) Thu, 13 Aug 2020 05:58:03 -0700


     [ 
https://issues.apache.org/jira/browse/NIFI-7696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mahieddine Cherif reassigned NIFI-7696:
---------------------------------------

    Assignee: Mahieddine Cherif

> MultiQueryRecord Processor
> --------------------------
>
>                 Key: NIFI-7696
>                 URL: https://issues.apache.org/jira/browse/NIFI-7696
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mahieddine Cherif
>            Assignee: Mahieddine Cherif
>            Priority: Minor
>              Labels: enhancement, extensions, sugestion
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> *Context:*
> QueryRecord is a very useful processor: it lets users run advanced SQL 
> queries on a wide range of data formats (CSV, JSON, ... thanks to the 
> Record API), which considerably increases NiFi's ETL capabilities. 
>  I want to push NiFi further by making the same thing possible with 
> +multiple FlowFiles as input+, so that something like a join query across 
> several FlowFiles becomes a reality. 
>  
> *Proposal:*
> Create a new processor called "MultiQueryRecord", which can be thought of 
> technically as a child of the QueryRecord and MergeRecord processors. 
> This processor will take FlowFiles from different sources, wait until all 
> of the FlowFiles it expects have arrived, and then execute all the SQL 
> queries provided as dynamic properties. 
>  
>  * Every FlowFile will have an attribute that contains the name of the 
> "virtual table" to be used in the SQL query. 
>  * The user configures how many FlowFiles the processor expects, the name 
> of the attribute that contains the table name, and the name of the 
> correlation attribute used to distinguish FlowFiles from different runs. 
>  * The user also defines all of their SQL queries as dynamic properties 
> (the same as is done today for the QueryRecord processor).
>  
> The processor will use the same MergeBin concept as the MergeRecord 
> processor to hold the pending FlowFiles until all of them have arrived, 
> and will then execute all the defined SQL queries.
>  
> *Implementation:*
> I've already implemented this processor and would like to contribute it to 
> this wonderful project. I'm about to finish the unit tests and will 
> update this issue with my PR if you are interested.
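
The behaviour proposed in the description quoted above (bin incoming inputs by a correlation attribute, wait until the expected number of virtual tables is present, then run every configured query) can be illustrated with a small sketch. This is not the reporter's implementation and not NiFi code: it is plain Python, it uses the standard-library sqlite3 module as a stand-in for the SQL engine, and the names MultiQueryBinner, offer, TABLE_ATTR and CORRELATION_ATTR, as well as the attribute names "table.name" and "correlation.id", are hypothetical. It also assumes every input carries at least one record.

```python
import sqlite3
from collections import defaultdict

# Hypothetical attribute names; the issue does not specify them.
TABLE_ATTR = "table.name"
CORRELATION_ATTR = "correlation.id"


class MultiQueryBinner:
    """Holds record sets per correlation id until all expected tables arrive."""

    def __init__(self, expected_count, queries):
        self.expected_count = expected_count
        self.queries = queries         # {output name: SQL text}
        self.bins = defaultdict(dict)  # correlation id -> {table name: rows}

    def offer(self, attributes, rows):
        """Add one input; return query results once the bin is full, else None."""
        key = attributes[CORRELATION_ATTR]
        self.bins[key][attributes[TABLE_ATTR]] = rows
        if len(self.bins[key]) < self.expected_count:
            return None  # still waiting for the remaining inputs
        return self._run(self.bins.pop(key))

    def _run(self, tables):
        # sqlite3 is only a stand-in engine for this sketch.
        conn = sqlite3.connect(":memory:")
        try:
            for name, rows in tables.items():
                columns = list(rows[0].keys())  # assumes at least one record
                col_sql = ", ".join(f'"{c}"' for c in columns)
                marks = ", ".join("?" for _ in columns)
                conn.execute(f'CREATE TABLE "{name}" ({col_sql})')
                conn.executemany(
                    f'INSERT INTO "{name}" VALUES ({marks})',
                    [tuple(r[c] for c in columns) for r in rows],
                )
            return {out: conn.execute(sql).fetchall()
                    for out, sql in self.queries.items()}
        finally:
            conn.close()


binner = MultiQueryBinner(
    expected_count=2,
    queries={
        "joined": "SELECT u.name, o.total FROM users u "
                  "JOIN orders o ON u.id = o.user_id ORDER BY o.total",
    },
)
first = binner.offer(
    {CORRELATION_ATTR: "run-1", TABLE_ATTR: "users"},
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}],
)
second = binner.offer(
    {CORRELATION_ATTR: "run-1", TABLE_ATTR: "orders"},
    [{"user_id": 1, "total": 30}, {"user_id": 2, "total": 12}],
)
```

Here the first call returns None because only one of the two expected inputs for "run-1" has arrived; the second call completes the bin and returns the rows of the join query under the key "joined". A real processor would additionally need bin expiry and limits on the number of open bins, which this sketch leaves out.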



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
