On (04/26/17 09:52), js1...@gmail.com wrote:
[..]
> +struct zram_hash {
> +     spinlock_t lock;
> +     struct rb_root rb_root;
>  };

just a note.

we can easily have N CPUs spinning on ->lock for the __zram_dedup_get()
lookup, which can involve a potentially slow zcomp_decompress() [zlib, for
example, with 64k pages] plus memcmp(). the larger PAGE_SHIFT is, the more
serialized IOs become. in theory, at least.

CPU0                        CPU1                ...     CPUN

__zram_bvec_write()         __zram_bvec_write()         __zram_bvec_write()
 zram_dedup_find()           zram_dedup_find()           zram_dedup_find()
  spin_lock(&hash->lock);
                              spin_lock(&hash->lock);
                                                          spin_lock(&hash->lock);
   __zram_dedup_get()
    zcomp_decompress()
     ...


so maybe there is a way to use a read-write lock instead of a spinlock for
the hash and reduce write/read IO serialization.

        -ss
