Matt Burgess created NIFI-11466:
-----------------------------------
Summary: Add a ModifyCompression processor
Key: NIFI-11466
URL: https://issues.apache.org/jira/browse/NIFI-11466
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Reporter: Matt Burgess
Fix For: 2.latest
If a user wants to convert from one compression format to another, they
currently have to use one CompressContent processor to decompress, then a
second CompressContent to compress into the target format. Two processors mean
extra disk I/O for the intermediate FlowFiles and their underlying content
claims, which can be I/O intensive.
Instead, a new ModifyCompression processor is proposed, to allow for both
decompression of the incoming FlowFile and compression for the outgoing
FlowFile, using appropriate memory buffers for the decompression/recompression.
Adding "no decompression" and "no compression" options for the respective
properties could allow this processor to function as CompressContent does now,
plus the ability to convert from one compression format (e.g. gzip) to another
(e.g. snappy-hadoop). One use case where this would be helpful is an I/O-bound
flow that moves compressed data from a legacy source system into HDFS for
faster (and larger-volume, distributed) processing of the data.
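
To illustrate the single-pass idea, here is a minimal sketch using only JDK
classes. It is not the proposed processor's implementation: the class name
RecompressSketch and the method gzipToDeflate are hypothetical, the 8 KB buffer
size is an arbitrary assumption, and zlib/Deflate stands in for snappy-hadoop
because the latter is not available in the JDK. The point is only that the
decompressed bytes pass through an in-memory buffer rather than being written
out as an intermediate FlowFile.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class RecompressSketch {

    // Hypothetical helper: reads gzip data from `in`, writes zlib/Deflate data
    // to `out`, and returns the number of decompressed bytes that were streamed.
    static long gzipToDeflate(InputStream in, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192]; // assumed buffer size, chosen arbitrarily
        try (InputStream decoded = new GZIPInputStream(in);
             OutputStream encoded = new DeflaterOutputStream(out)) {
            int n;
            // Decompressed bytes only ever live in `buffer`; nothing is staged on disk.
            while ((n = decoded.read(buffer)) != -1) {
                encoded.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "legacy data, legacy data, legacy data"
                .getBytes(StandardCharsets.UTF_8);

        // Build a gzip-compressed input in memory.
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream g = new GZIPOutputStream(gz)) {
            g.write(original);
        }

        // Convert gzip to Deflate in one pass.
        ByteArrayOutputStream converted = new ByteArrayOutputStream();
        long n = gzipToDeflate(new ByteArrayInputStream(gz.toByteArray()), converted);
        System.out.println("decompressed bytes streamed: " + n);
    }
}
```

In a real processor the input and output streams would presumably come from
the FlowFile content rather than byte arrays, and a "no decompression" or "no
compression" option would amount to skipping the corresponding stream wrapper.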