The short answer is that you aren't supposed to store large things in
xattrs at all.  If you feel it's a "vulnerability", then we could add
a config option to reject xattrs over a particular size.
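
For a rough sense of scale, here is a back-of-the-envelope estimate using
the numbers from the report quoted below (3000 log entries, 64 KB value).
It assumes every PG log entry carries one full copy of the attribute and
ignores per-entry overhead, so treat it as a sketch and not as how the OSD
actually accounts for memory.  The function name is made up for
illustration.

```python
def pg_log_xattr_bytes(log_entries, attr_bytes):
    """Rough estimate: each PG log entry holds one full copy of the xattr value."""
    return log_entries * attr_bytes


MIB = 2 ** 20

# Numbers taken from the report: 3000 log entries, 64 KB attribute value.
estimate = pg_log_xattr_bytes(3000, 64 * 1024)
print("%.1f MiB" % (estimate / float(MIB)))  # prints "187.5 MiB"
```

That is the same order of magnitude as the resident-memory growth in the
sample output, which is why a size cap looks like the simplest fix.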

> Hi Cephers,
> While implementing compression support for EC pools I ran into an issue that
> can be summarized as follows.
> Imagine a client that continuously extends a specific object xattr by
> rewriting the complete attribute with a new data portion appended.
> As a result, one can observe steadily increasing memory usage for the
> ceph-osd processes. This happens only for objects in EC pools.
> I briefly investigated the root cause, and it looks like it's due to growth
> in PG log memory consumption. The PG log entry count is pretty stable, but
> each entry consumes more and more memory over time since it contains the
> full attribute value.
> As far as I understand, replicated pools do not log the setattr operation
> (they actually mark it as unrollbackable), which is why the issue isn't
> observed there.
> With 3000 log entries and, e.g., a 64 KB attribute value, the memory
> consumption is clearly visible.
> So the questions are:
> * Are there any ideas on how to resolve this issue? The obvious solution is
> to refactor attribute extension to use multiple keys...  Anything else?
> * Does it make sense to resolve it at all?  IMO it's a sort of
> vulnerability for a Ceph process to behave this way...
> Please find below a Python script to reproduce the issue. It should be
> started from the folder where ceph.conf is located:
> python <poolname>
> ######################################
> import rados, sys
> from time import sleep
> import psutil
> def print_process_mem_usage(pid):
>   process = psutil.Process(pid)
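>   # memory_info() replaces get_memory_info(), which was renamed in psutil 2.0
>   mem = process.memory_info()
>   rss_mb = mem.rss / (2 ** 20)  # resident set size, in MB
>   vms_mb = mem.vms / (2 ** 20)  # virtual memory size, in MB
>   print "pid %d: Virt: %i MB, Res: %i MB" % (pid, vms_mb, rss_mb)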
> def print_processes_mem_usage():
>   for proc in psutil.process_iter():
>     try:
>       if 'ceph-osd' in
>         print_process_mem_usage(
>     except psutil.NoSuchProcess:
>       pass
> cluster = rados.Rados(conffile='./ceph.conf')
> cluster.connect()
> ioctx = cluster.open_ioctx(sys.argv[1])
> try:
>     ioctx.remove_object("pyobject")
> except rados.ObjectNotFound:  # nothing to clean up on the first run
>     pass
> s=""
> for i in range(25000):
>     s=''.zfill( i*15)
>     ioctx.set_xattr( 'pyobject', 'somekey', s)
>     if (i % 500)==0:
>         print '%d-th step, attr len = %d' % (i, len(s))
>         print_processes_mem_usage()
> ioctx.close()
> #########################
> Sample output is as below:
> 0-th step, attr len = 0
> pid 23723: Virt: 700 MB, Res: 30 MB
> pid 23922: Virt: 701 MB, Res: 32 MB
> pid 24142: Virt: 700 MB, Res: 32 MB
> ...
> 4000-th step, attr len = 60000
> pid 23723: Virt: 896 MB, Res: 207 MB
> pid 23922: Virt: 900 MB, Res: 212 MB
> pid 24142: Virt: 897 MB, Res: 210 MB
> ...
> 6000-th step, attr len = 90000
> pid 23723: Virt: 1025 MB, Res: 331 MB
> pid 23922: Virt: 1032 MB, Res: 338 MB
> pid 24142: Virt: 1025 MB, Res: 333 MB
> ...
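
The multiple-keys idea mentioned above could look roughly like the sketch
below: each appended portion goes into its own xattr key, so a logged
setattr would only carry the new piece and not the whole accumulated value.
The key naming scheme and both helper functions are assumptions made for
illustration, and FakeIoctx is an in-memory stand-in so the example runs
without a cluster.  With a real rados.Ioctx the same set_xattr/get_xattrs
calls would be used.

```python
class FakeIoctx(object):
    """In-memory stand-in for rados.Ioctx so the sketch runs without a cluster."""

    def __init__(self):
        self.xattrs = {}

    def set_xattr(self, oid, key, value):
        self.xattrs.setdefault(oid, {})[key] = value
        return True

    def get_xattrs(self, oid):
        # Yields (name, value) pairs, like the iterator rados returns.
        return iter(self.xattrs.get(oid, {}).items())


def append_xattr_chunk(ioctx, oid, prefix, index, chunk):
    """Store only the new portion under its own key (hypothetical naming scheme)."""
    key = "%s.%08d" % (prefix, index)
    ioctx.set_xattr(oid, key, chunk)
    return key


def read_xattr_chunks(ioctx, oid, prefix):
    """Reassemble the logical value by joining the chunks in key order."""
    pairs = [(k, v) for k, v in ioctx.get_xattrs(oid)
             if k.startswith(prefix + ".")]
    return b"".join(v for _, v in sorted(pairs))


ioctx = FakeIoctx()
for i, piece in enumerate([b"aaa", b"bb", b"c"]):
    append_xattr_chunk(ioctx, "pyobject", "somekey", i, piece)
value = read_xattr_chunks(ioctx, "pyobject", "somekey")
```

The zero-padded index keeps the keys in append order when sorted.  Whether
many small xattrs are acceptable is a separate question; this only bounds
the size of each individual write.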
