[ https://issues.apache.org/jira/browse/NIFI-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Pierce updated NIFI-6964:
------------------------------
    Description:
The CompressContent processor does not use the Compression Level property of the processor except for when using the GZIP compression format. On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself.

As a side note, the xz compression format supports, amazingly enough, 10 levels of compression from 0 to 9 – the same as GZIP. The only difference that I can tell is level 0 of xz is not the lack of compression, but the lightest compression possible (i.e. still some compression) – whereas GZIP compression level 0 means just wrap the content in a container without compressing it.

I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis.

The attached patch will enhance the CompressContent.java source code enabling the compression level property to be used in both the GZIP and the XZ-LZMA2 formats.

Please consider adding this patch to the baseline for this processor. I've tested it and the results are fantastic because I can crank down the compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally seeing a 66% improvement in elapsed time to process highly compressed content using XZ format with compression level of 0 versus the hard-coded level 6 of the baseline code.

  was:
The CompressContent processor does not use the Compression Level property of the processor except for when using the GZIP compression format.
On the contrary, the xz-lzma2 compression format defaults to using XZ compression level 6 for that specific format (I read the CompressContent.java source code to verify this) – disregarding whatever compression level you set on the processor itself.

I have a use case where I must use the xz-lzma2 format (don't ask why) and I have to send (using the XZ format) already highly-compressed content that is +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of already highly compressed content to further compress into the XZ format on a daily basis.

The attached patch will enhance the CompressContent.java source code enabling the compression level property to be used in both the GZIP and the XZ-LZMA2 formats.

Please consider adding this patch to the baseline for this processor. I've tested it and the results are fantastic because I can crank down the compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally seeing a 66% improvement in elapsed time to process highly compressed content using XZ format with compression level of 0 versus the hard-coded level 6 of the baseline code.

    Labels: compression xz-lzma2  (was: xz-lzma2)

> Use compression level for xz-lzma2 format of the CompressContent processor
> --------------------------------------------------------------------------
>
>                 Key: NIFI-6964
>                 URL: https://issues.apache.org/jira/browse/NIFI-6964
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.10.0
>            Reporter: John Pierce
>            Priority: Minor
>              Labels: compression, xz-lzma2
>             Fix For: 1.11.0
>
>   Original Estimate: 4h
>          Time Spent: 10m
>  Remaining Estimate: 3h 50m
>
> The CompressContent processor does not use the Compression Level property of
> the processor except for when using the GZIP compression format.
> On the contrary, the xz-lzma2 compression format defaults to using XZ
> compression level 6 for that specific format (I read the CompressContent.java
> source code to verify this) – disregarding whatever compression level you set
> on the processor itself.
> As a side note, the xz compression format supports, amazingly enough, 10
> levels of compression from 0 to 9 – the same as GZIP. The only difference
> that I can tell is level 0 of xz is not the lack of compression, but the
> lightest compression possible (i.e. still some compression) – whereas GZIP
> compression level 0 means just wrap the content in a container without
> compressing it.
> I have a use case where I must use the xz-lzma2 format (don't ask why) and I
> have to send (using the XZ format) already highly-compressed content that is
> +*NOT*+ XZ format to begin with. I have in excess of 500 GB of this sort of
> already highly compressed content to further compress into the XZ format on a
> daily basis.
> The attached patch will enhance the CompressContent.java source code enabling
> the compression level property to be used in both the GZIP and the XZ-LZMA2
> formats.
> Please consider adding this patch to the baseline for this processor. I've
> tested it and the results are fantastic because I can crank down the
> compression level to 0 for XZ-LZMA2 now and use a lot less CPU. I'm generally
> seeing a 66% improvement in elapsed time to process highly compressed content
> using XZ format with compression level of 0 versus the hard-coded level 6 of
> the baseline code.
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
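
The side note in the description (xz level 0 still compresses, while GZIP level 0 only wraps the content) can be checked with a small sketch. This is not the NiFi patch and not CompressContent.java; it assumes Python's standard-library gzip and lzma modules behave like the GZIP and XZ-LZMA2 formats discussed here, and it uses random bytes as a stand-in for already highly compressed content. The helper name compressed_sizes is made up for illustration.

```python
import gzip
import lzma
import os


def compressed_sizes(data: bytes) -> dict:
    """Return the output size in bytes for gzip level 0 and xz presets 0 and 6."""
    return {
        "gzip-0": len(gzip.compress(data, compresslevel=0)),
        "xz-0": len(lzma.compress(data, format=lzma.FORMAT_XZ, preset=0)),
        "xz-6": len(lzma.compress(data, format=lzma.FORMAT_XZ, preset=6)),
    }


# Highly redundant input: gzip level 0 only stores it inside the container,
# while xz preset 0 still applies its lightest compression.
text = b"NiFi flow file content " * 4096
text_sizes = compressed_sizes(text)

# Random bytes stand in for already-compressed content (an assumption made
# for illustration); neither xz preset can shrink them.
random_like = os.urandom(1 << 20)
random_sizes = compressed_sizes(random_like)

print("redundant input:", len(text), text_sizes)
print("random input:   ", len(random_like), random_sizes)
```

On incompressible input both xz presets end up at roughly the input size, so a higher level would only cost CPU time, which is consistent with the motivation given for exposing the level on the xz-lzma2 format. The sketch does not measure elapsed time, so it says nothing about the 66% figure reported above.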