On Sun, Mar 31, 2019 at 17:36:19 +0200, Pierre-Yves David wrote:
...
> compression: introduce a `storage.revlog.zlib.level` configuration
>
> This option control the zlib compression level used when compression revlog
> chunk.
>
> This is also a good excuse to pave the way for a similar configuration option
> for the zstd compression engine. Having a dedicated option for each
> compression algorithm is useful because they don't support the same range of
> values.
>
> Using a higher zlib compression impact CPU consumption at compression time,
> but does not directly affected decompression time. However dealing with small
> compressed chunk can directly help decompression and indirectly help other
> revlog logic.
>
> I ran some basic test on repositories using different level. I am user the

s/user/using/ ?

...
> I also made some basic timing measurement. The "read" timing are gathered
> using simple run of `hg perfrevlogrevisions`, the "write" measurement using
> `hg perfrevlogwrite` (restricted to the last 5000 revisions for netbeans and
> mozilla central). The timing are gathered on a generic machine, (not one of
> our performance locked machine), so small variation might not be meaningful.

You did more than one measurement, so measurement -> measurements, and
timing -> timings? Alternatively, keep the singular but then make the verbs
match: are -> is.

Sorry to nit-pick, but since this text will end up in the commit messages... :)

> However large trend remains relevant.
>
> Keep in mind that these number are not pure compression/decompression time.

s/number/numbers/

> They also involve the full revlog logic. In particular the difference in chunk
> size has an impact on the delta chain structure, affecting performance when
> writing or reading them.
>
> On read/write performance, the compression level has a bigger impact.
> Counter-intuitively, higher compression level raise better "write" performance

s/raise better/increase/ ?

This actually confuses me a bit. Based on the table below, it looks like a
higher compression level has a non-linear effect on read/write performance.
Maybe I'm not understanding what you meant by 'raise "better"'.

While I expect to see a "hump" in *write* performance (because high zlib
compression levels are such CPU hogs), I didn't expect to see one for *read*
performance. I suppose the read hump could be explained by the shape of the
DAG, as you point out.

> for the large repositories in our tested setting. Maybe because the last 5000
> delta chain end up having a very different shape in this specific spot? Or
> maybe because of a more general trend of better delta chains thanks to the
> smaller chunk and snapshot.
>
> This series does not intend to change the default compression level.
> However, these result call for a deeper analysis of this performance
> difference in the future.
>
> Full data
> =========
>
> repo      level  .hg/store size  00manifest.d        read       write
> ---------------------------------------------------------------------
> mercurial     1      49,402,813     5,963,475    0.170159   53.250304
> mercurial     6      47,197,397     5,875,730    0.182820   56.264320
> mercurial     9      47,121,596     5,849,781    0.189219   56.293612
>
> pypy          1     370,830,572    28,462,425    2.679217  460.721984
> pypy          6     340,112,317    27,648,747    2.768691  467.537158
> pypy          9     338,360,736    27,639,003    2.763495  476.589918
>
> netbeans      1   1,281,847,810   165,495,457  122.477027  520.560316
> netbeans      6   1,205,284,353   159,161,207  139.876147  715.930400
> netbeans      9   1,197,135,671   155,034,586  141.620281  678.297064
>
> mozilla       1   2,775,497,186   298,527,987  147.867662  751.263721
> mozilla       6   2,596,856,420   286,597,671  170.572118  987.056093
> mozilla       9   2,587,542,494   287,018,264  163.622338  739.803002

...
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1881,6 +1881,11 @@ category impact performance and reposito
>      This option is enabled by default. When disabled, it also disables the
>      related ``storage.revlog.reuse-external-delta-parent`` option.
>
> +``revlog.zlib.level``
> +    Zlib compression level used when storing data into the repository. Accepted
> +    Value range from 1 (lowest compression) to 9 (highest compression). Zlib
> +    default value is 6.

I know this is very unlikely to change, but does it make sense to say what an
external library's defaults are?

Thanks for doing this! :)

Jeff.

--
Reality is merely an illusion, albeit a very persistent one.
  - Albert Einstein
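
For readers who want to get a feel for the 1 to 9 level range discussed in the help text above, here is a minimal sketch using Python's standard `zlib` module. It only compares raw compressed sizes on a synthetic buffer. The `sample` data and the `compressed_sizes` helper are hypothetical stand-ins, not Mercurial code, and the sketch says nothing about revlog delta chains or the read/write timings in the table.

```python
import zlib

# Hypothetical stand-in for revlog chunk data; real chunks are deltas or
# full texts and will compress differently.
sample = b"".join(
    b"line %d: some repetitive revision text\n" % i for i in range(2000)
)


def compressed_sizes(data, levels=(1, 6, 9)):
    """Return {level: compressed size in bytes} for each zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


sizes = compressed_sizes(sample)
for level, size in sorted(sizes.items()):
    print("level %d: %d bytes" % (level, size))

# Whatever the level, decompression must give back the original bytes.
assert all(
    zlib.decompress(zlib.compress(sample, level)) == sample for level in sizes
)
```

The size differences between levels depend heavily on the input, so numbers from a buffer like this should not be read as a prediction for any real repository.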