On Sun, Mar 31, 2019 at 17:36:19 +0200, Pierre-Yves David wrote:
...
> compression: introduce a `storage.revlog.zlib.level` configuration
> 
> This option control the zlib compression level used when compression revlog
> chunk.
> 
> This is also a good excuse to pave the way for a similar configuration option
> for the zstd compression engine. Having a dedicated option for each
> compression algorithm is useful because they don't support the same range of
> values.
> 
> Using a higher zlib compression impact CPU consumption at compression time,
> but does not directly affected decompression time. However dealing with small
> compressed chunk can directly help decompression and indirectly help other
> revlog logic.
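
For anyone who wants to look at the zlib side of this trade-off in isolation,
here is a minimal sketch using only Python's standard `zlib` module.  It is
not revlog code and says nothing about delta chains; the `measure` helper and
the sample payload are made up for illustration.

```python
import time
import zlib


def measure(data, levels=(1, 6, 9)):
    """Return {level: (compressed_size, compress_secs, decompress_secs)}."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        mid = time.perf_counter()
        restored = zlib.decompress(compressed)
        end = time.perf_counter()
        # Decompression must give back the original bytes at every level.
        assert restored == data
        results[level] = (len(compressed), mid - start, end - mid)
    return results


# Hypothetical stand-in payload; real revlog chunks will behave differently.
sample = b"".join(
    b"line %d: some mildly repetitive revision text\n" % (i % 97)
    for i in range(20000)
)

for level, (size, ctime, dtime) in measure(sample).items():
    print("level %d: %d bytes, compress %.4fs, decompress %.4fs"
          % (level, size, ctime, dtime))
```

Timings from a toy loop like this are noisy, so it is only good for spotting
large trends, not for choosing a default.
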
> 
> I ran some basic test on repositories using different level. I am user the

s/user/using/ ?

...
> I also made some basic timing measurement. The "read" timing are gathered
> using simple run of `hg perfrevlogrevisions`, the "write" measurement using `hg
> perfrevlogwrite` (restricted to the last 5000 revisions for netbeans and
> mozilla central). The timing are gathered on a generic machine, (not one of
> our performance locked machine), so small variation might not be meaningful.

You did more than one measurement, so measurement -> measurements, and
timing -> timings?  Alternatively, keep the singular but then make the verbs
match: are -> is.

Sorry to nit-pick, but since this text will end up in the commit messages...
:)

> However large trend remains relevant.
> 
> Keep in mind that these number are not pure compression/decompression time.

s/number/numbers/

> They also involve the full revlog logic. In particular the difference in chunk
> size has an impact on the delta chain structure, affecting performance when
> writing or reading them.
> 
> On read/write performance, the compression level has a bigger impact.
> Counter-intuitively, higher compression level raise better "write" performance

s/raise better/increase/ ?

This actually confuses me a bit.  Based on the table below, it looks like a
higher compression level has a non-linear effect on read/write performance.
Maybe I'm not understanding what you meant by 'raise "better"'.

While I expect to see a "hump" in *write* performance (because high zlib
compression levels are such cpu hogs), I didn't expect to see one for *read*
performance.  I suppose the read hump could be explained by the shape of the
DAG, as you point out.
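
To make the "hump" concrete, here is a small sketch that takes the netbeans
and mozilla timings from the table below and checks whether level 6 is slower
than both level 1 and level 9.  The numbers are copied from the table; the
helper names (`relative_to_level1`, `has_hump`) are just mine.

```python
# Timings copied from the "Full data" table, keyed by zlib level.
timings = {
    "netbeans": {
        "read": {1: 122.477027, 6: 139.876147, 9: 141.620281},
        "write": {1: 520.560316, 6: 715.930400, 9: 678.297064},
    },
    "mozilla": {
        "read": {1: 147.867662, 6: 170.572118, 9: 163.622338},
        "write": {1: 751.263721, 6: 987.056093, 9: 739.803002},
    },
}


def relative_to_level1(series):
    """Percentage change of each level's timing relative to level 1."""
    base = series[1]
    return {level: (value / base - 1.0) * 100.0
            for level, value in series.items()}


def has_hump(series):
    """True when level 6 is slower than both level 1 and level 9."""
    return series[6] > series[1] and series[6] > series[9]


for repo, kinds in timings.items():
    for kind, series in kinds.items():
        changes = relative_to_level1(series)
        print("%s %s: level 6 %+.1f%%, level 9 %+.1f%%, hump=%s"
              % (repo, kind, changes[6], changes[9], has_hump(series)))
```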

> for the large repositories in our tested setting. Maybe because the last 5000
> delta chain end up having a very different shape in this specific spot? Or
> maybe because of a more general trend of better delta chains thanks to the
> smaller chunk and snapshot.
> 
> This series does not intend to change the default compression level. However,
> these result call for a deeper analysis of this performance difference in the
> future.
> 
> Full data
> =========
> 
> repo   level  .hg/store size  00manifest.d read       write
> ----------------------------------------------------------------
> mercurial  1      49,402,813     5,963,475   0.170159  53.250304
> mercurial  6      47,197,397     5,875,730   0.182820  56.264320
> mercurial  9      47,121,596     5,849,781   0.189219  56.293612
> 
> pypy       1     370,830,572    28,462,425   2.679217 460.721984
> pypy       6     340,112,317    27,648,747   2.768691 467.537158
> pypy       9     338,360,736    27,639,003   2.763495 476.589918
> 
> netbeans   1   1,281,847,810   165,495,457 122.477027 520.560316
> netbeans   6   1,205,284,353   159,161,207 139.876147 715.930400
> netbeans   9   1,197,135,671   155,034,586 141.620281 678.297064
> 
> mozilla    1   2,775,497,186   298,527,987 147.867662 751.263721
> mozilla    6   2,596,856,420   286,597,671 170.572118 987.056093
> mozilla    9   2,587,542,494   287,018,264 163.622338 739.803002
...
> diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
> --- a/mercurial/help/config.txt
> +++ b/mercurial/help/config.txt
> @@ -1881,6 +1881,11 @@ category impact performance and reposito
>      This option is enabled by default. When disabled, it also disables the
>      related ``storage.revlog.reuse-external-delta-parent`` option.
>  
> +``revlog.zlib.level``
> +    Zlib compression level used when storing data into the repository. Accepted
> +    Value range from 1 (lowest compression) to 9 (highest compression). Zlib
> +    default value is 6.

I know this is very unlikely to change, but does it make sense to say what
an external library's defaults are?


Thanks for doing this! :)

Jeff.

-- 
Reality is merely an illusion, albeit a very persistent one.
                - Albert Einstein