[ https://issues.apache.org/jira/browse/NIFI-10556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Liszli resolved NIFI-10556.
----------------------------------
    Resolution: Won't Do

> Create processor to support DeltaLake tables
> --------------------------------------------
>
>                 Key: NIFI-10556
>                 URL: https://issues.apache.org/jira/browse/NIFI-10556
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Robert Liszli
>            Assignee: Robert Liszli
>            Priority: Major
>         Attachments: processor_usages.png
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> *Plan for the new processor*
> The new processor will use the Delta Standalone library to generate the Delta
> table for the Parquet data files. The processor can also take another
> processor's output file and upload it to the data store.
> *Processor inputs:*
> * The path of the Parquet files (a single directory), located on the local
> filesystem or in cloud storage (S3, GCP or Azure).
> * Structure of the Parquet file in JSON format.
> * If the processor should process another processor's output file, the
> attribute names for the output file's path and filename must be set.
> * Partition columns, separated by commas.
> *Processor parameters:*
> * Dropdown selector for storage type selection.
> * Credentials for the selected storage type.
> *On Trigger:*
> * If the processor should process another processor's output file, it first
> copies the new file to the desired data directory.
> * The processor compares the files in the data directory with the files
> already added to the Delta table. If a new data file exists, the processor
> adds it to the Delta table.
> * If no Delta table exists yet, the processor creates one.
> *Output of the processor:*
> * Up-to-date Delta table in the chosen storage system.
>
> Delta Standalone: [https://github.com/delta-io/connectors#delta-standalone]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)