Hi,

With Sling content distribution (using FileVault), we observe significantly lower throughput for content packages containing binaries. The main bottleneck seems to be the compression algorithm applied to every element contained in the content package.
I think we could improve the throughput significantly, simply by not re-compressing binaries that are already compressed. To figure out which binaries are already compressed, we could match the content type stored alongside the binary against a configurable list of content types (a rough sketch is appended below the links).

I have done some micro tests with this idea (patch in [0]) and the results look promising:

- Exporting a single 250 MB JPEG is 80% faster (22.4 sec -> 4.3 sec) for a 3% bigger content package (233.2 MB -> 240.4 MB).
- Exporting the AEM OOTB /content/dam is 50% faster (11.9 sec -> 5.9 sec) for a 5% bigger content package (92.8 MB -> 97.4 MB).
- Import for the same cases is 66% and 32% faster, respectively.

This could be done by default, with a configurable list of content types that skip compression. Alternatively, it could be done at the project level, by extending FileVault with the following:

1. For each package, allow defining the default compression level (best compression, best speed [1]).
2. Expose an API that allows plugging in custom logic to decide how to compress a given artefact (see the second sketch below the links).

In any case, the changes would be backward compatible: content packages created with the new code would be installable on instances running the old code, and vice versa.

wdyt?

Regards,
Timothee

[0] https://github.com/tmaret/jackrabbit-filevault/tree/performance-avoid-compressing-already-compressed-binaries-based-on-content-type-detection
[1] https://docs.oracle.com/javase/7/docs/api/java/util/zip/Deflater.html#BEST_SPEED
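
To make the content-type idea concrete, here is a minimal sketch (not the code from the patch in [0]; the class name and the list of types are made up for illustration) of how the content type stored with a binary could be mapped to a deflate level:

    import java.util.Arrays;
    import java.util.List;
    import java.util.zip.Deflater;
    import java.util.zip.ZipOutputStream;

    public class CompressionHint {

        // Configurable list of content types that are typically already compressed.
        private static final List<String> PRE_COMPRESSED_TYPES = Arrays.asList(
                "image/jpeg", "image/png", "image/gif",
                "audio/mpeg", "video/mp4",
                "application/zip", "application/java-archive");

        /**
         * BEST_SPEED [1] for binaries that are already compressed (re-deflating
         * them costs CPU for almost no size gain), the default level otherwise.
         */
        public static int levelFor(String contentType) {
            if (contentType != null && PRE_COMPRESSED_TYPES.contains(contentType)) {
                return Deflater.BEST_SPEED;
            }
            return Deflater.DEFAULT_COMPRESSION;
        }

        /** Example: apply the level before writing the next zip entry. */
        public static void configure(ZipOutputStream out, String contentType) {
            out.setLevel(levelFor(contentType));
        }
    }

The exact set of types, and whether such entries get BEST_SPEED or no compression at all, would of course be configurable.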
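
And here is a rough sketch of what the extension point from point 2 could look like (again purely illustrative naming, not an existing FileVault API):

    import java.util.zip.Deflater;

    /**
     * Hypothetical extension point: projects plug in their own logic to
     * decide how a given artefact is compressed when a package is built.
     */
    public interface ArtifactCompressionStrategy {

        /**
         * @param path        repository path of the artefact
         * @param contentType content type stored with the binary, or null
         * @param size        artefact size in bytes
         * @return the java.util.zip.Deflater level to use for this artefact
         */
        int compressionLevel(String path, String contentType, long size);

        /** Default behaviour: compress everything at the default level. */
        ArtifactCompressionStrategy DEFAULT =
                (path, contentType, size) -> Deflater.DEFAULT_COMPRESSION;
    }

The package-level default from point 1 would then just be one particular strategy.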