[ 
https://issues.apache.org/jira/browse/NIFI-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Liszli resolved NIFI-10556.
----------------------------------
    Resolution: Won't Do

> Create processor to support DeltaLake tables
> --------------------------------------------
>
>                 Key: NIFI-10556
>                 URL: https://issues.apache.org/jira/browse/NIFI-10556
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Robert Liszli
>            Assignee: Robert Liszli
>            Priority: Major
>         Attachments: processor_usages.png
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> *Plan for the new processor*
> The new processor will use the Delta Standalone library to generate the Delta 
> table for the Parquet data files. The processor can also take another 
> processor's output file and upload it to the data store.
> *Processor inputs:*
>  * The path to the Parquet files (a single directory), located on the local 
> filesystem or in cloud storage (S3, GCP or Azure).
>  * The structure of the Parquet files, in JSON format.
>  * If the processor should handle another processor's output file, the names 
> of the attributes that hold the output file's path and filename must be set.
>  * Partition columns, separated by commas.
> *Processor parameters:*
>  * Dropdown selector for storage type selection.
>  * Credentials for the selected storage type.
> *On Trigger:*
>  * If the processor should handle another processor's output file, it first 
> copies the new file to the desired data directory.
>  * The processor compares the files in the data directory with the files 
> already added to the Delta table. If a new data file exists, the processor 
> adds it to the Delta table.
>  * If no Delta table exists yet, the processor creates one.
> *Output of the processor:*
>  * An up-to-date Delta table in the chosen storage system.
>  
> Delta Standalone: [https://github.com/delta-io/connectors#delta-standalone]
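
The "On Trigger" steps quoted above amount to a set difference between the files in the data directory and the files already recorded in the table. Below is a minimal sketch of that reconciliation, assuming a local directory and an in-memory set standing in for the Delta log. The names find_new_data_files and on_trigger are hypothetical, and nothing here calls the Delta Standalone library; a real processor would commit the new files through that library instead of updating a set.

```python
from pathlib import Path
import tempfile


def find_new_data_files(data_dir, registered):
    """Return the Parquet file names in data_dir that are not yet registered.

    `registered` is an assumption of this sketch: a plain set of file names
    standing in for the files already recorded in the Delta table.
    """
    present = {p.name for p in Path(data_dir).glob("*.parquet") if p.is_file()}
    return sorted(present - set(registered))


def on_trigger(data_dir, registered):
    """Run one reconciliation pass and return the updated set of file names.

    Passing None for `registered` models the "no Delta table exists" case:
    the pass starts from an empty table and registers every data file found.
    """
    table = set() if registered is None else set(registered)
    for name in find_new_data_files(data_dir, table):
        # A real processor would commit an "add file" action to the table here.
        table.add(name)
    return table


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        for name in ("part-0.parquet", "part-1.parquet", "notes.txt"):
            (Path(data_dir) / name).touch()
        # Only part-1.parquet is new; notes.txt is ignored as a non-data file.
        print(find_new_data_files(data_dir, {"part-0.parquet"}))
        # With no existing table, both Parquet files get registered.
        print(sorted(on_trigger(data_dir, None)))
```

Running the pass a second time with the returned set adds nothing, so the processor can be triggered repeatedly without registering a file twice.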



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
