After more iterations (6 in my environment) the RSS usage stabilized at 759128 KB.

This is a really simplified test. In our real environment we run about 3000 containers, with lots of other operations such as route setup, load balancers, and all the ovn-sb operations. The memory consumption can quickly go up to 6 GB (nb and sb together) and lead to a system OOM. Is that a reasonable resource consumption in your experience? I don't remember the exact numbers for standalone db resource consumption, but in the same environment it didn't lead to an OOM.

Han Zhou <[email protected]> wrote on Mon, Dec 16, 2019 at 1:05 PM:

> Thanks for the details. I tried the same command with a for loop.
>
> After the first 4 iterations, the RSS of the first NB server increased to
> 572888 (KB). After that, it stayed the same in the next 3 iterations. So it
> seems to just build memory buffers up and then stay at that level without
> increasing further, which doesn't look like a memory leak. Could you try
> more iterations and see if it still continuously increases?
>
> Thanks,
> Han
>
> On Sun, Dec 15, 2019 at 7:54 PM 刘梦馨 <[email protected]> wrote:
> >
> > Hi, Han
> >
> > In my test scenario, I use ovn-ctl to start a one-node ovn with cluster
> > mode db and no chassis bound to the ovn-sb, to check just the memory
> > usage of ovn-nb. Then I use a script to add a logical switch, add 1000
> > ports, set dynamic addresses, and then delete the logical switch:
> >
> > #!/bin/bash
> > ovn-nbctl ls-add ls1
> > for i in {1..1000}; do
> >     ovn-nbctl lsp-add ls1 ls1-vm$i
> >     ovn-nbctl lsp-set-addresses ls1-vm$i dynamic
> > done
> > ovn-nbctl ls-del ls1
> >
> > I run this script repeatedly and watch the memory change.
> >
> > After 5 runs (5000 lsp adds and deletes), the RSS of nb increased to
> > 667M. The nb file grew to 119M and was not automatically compacted.
> > After a manual compaction the db file size went back to 11K, but the
> > memory usage didn't change.
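A minimal sketch of how the NB server's RSS and db file size can be watched between runs of the test script, and how a manual compaction can be triggered. The pidfile, db path, and control socket below are assumptions based on a default ovn-ctl layout for ovs 2.12; adjust them for your installation.

```shell
#!/bin/bash
# Sketch: watch the NB ovsdb-server's memory and db file size between
# test runs. Paths are assumptions from a default ovn-ctl install.

rss_kb() {                          # RSS of a pid, in KB, via ps
    ps -o rss= -p "$1" | tr -d ' '
}

NB_PIDFILE=/var/run/openvswitch/ovnnb_db.pid
NB_DBFILE=/etc/openvswitch/ovnnb_db.db
NB_CTL=/var/run/openvswitch/ovnnb_db.ctl

if [ -f "$NB_PIDFILE" ]; then
    pid=$(cat "$NB_PIDFILE")
    echo "NB RSS: $(rss_kb "$pid") KB"
    echo "NB db:  $(du -h "$NB_DBFILE" | cut -f1)"
    # Ask the server to compact its database on disk.
    ovs-appctl -t "$NB_CTL" ovsdb-server/compact OVN_Northbound
fi
```

As the thread notes, `ovsdb-server/compact` shrinks the on-disk file, but the process RSS may stay flat afterwards, since freed heap is not necessarily returned to the OS.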
> >
> > Han Zhou <[email protected]> wrote on Sat, Dec 14, 2019 at 3:40 AM:
> >>
> >> On Wed, Dec 11, 2019 at 12:51 AM 刘梦馨 <[email protected]> wrote:
> >> >
> >> > We are using ovs/ovn 2.12.0 to implement our container network.
> >> > After switching from standalone ovndb to cluster mode ovndb, we
> >> > noticed that the memory consumption of both ovnnb and ovnsb keeps
> >> > increasing after each operation and never decreases.
> >> >
> >> > We did some profiling with valgrind. The leak check reported a
> >> > 16-byte leak in fork_and_wait_for_startup, which obviously is not
> >> > the main reason. Later we used massif to profile the memory
> >> > consumption, and we put the result in the attachment.
> >> >
> >> > Most of the memory comes from two parts: ovsthread_wrapper
> >> > (ovs-thread.c:378), which allocates a subprogram_name, and
> >> > jsonrpc_send (jsonrpc.c:253), as shown below (I skipped the
> >> > duplicated jsonrpc stacks).
> >> >
> >> > However, both parts have a related free operation nearby, so I
> >> > don't know how to explore this memory issue further. I'm also not
> >> > aware of the differences here between cluster mode and standalone
> >> > mode.
> >> >
> >> > Can anyone give some advice and hints? Thanks!
> >> >
> >> > 100.00% (357,920,768B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
> >> > ->78.52% (281,038,848B) 0x66FDD49: mmap (in /usr/lib64/libc-2.17.so)
> >> > | ->37.50% (134,217,728B) 0x66841EF: new_heap (in /usr/lib64/libc-2.17.so)
> >> > | | ->37.50% (134,217,728B) 0x6684C22: arena_get2.isra.3 (in /usr/lib64/libc-2.17.so)
> >> > | |   ->37.50% (134,217,728B) 0x668AACC: malloc (in /usr/lib64/libc-2.17.so)
> >> > | |     ->37.50% (134,217,728B) 0x4FDC613: xmalloc (util.c:138)
> >> > | |       ->37.50% (134,217,728B) 0x4FDC78E: xvasprintf (util.c:202)
> >> > | |         ->37.50% (134,217,728B) 0x4FDC877: xasprintf (util.c:343)
> >> > | |           ->37.50% (134,217,728B) 0x4FA548D: ovsthread_wrapper (ovs-thread.c:378)
> >> > | |             ->37.50% (134,217,728B) 0x5BE5E63: start_thread (in /usr/lib64/libpthread-2.17.so)
> >> > | |               ->37.50% (134,217,728B) 0x670388B: clone (in /usr/lib64/libc-2.17.so)
> >> > | |
> >> > | ->36.33% (130,023,424B) 0x6686DF3: sysmalloc (in /usr/lib64/libc-2.17.so)
> >> > | | ->36.33% (130,023,424B) 0x6687CA8: _int_malloc (in /usr/lib64/libc-2.17.so)
> >> > | |   ->28.42% (101,711,872B) 0x66890C0: _int_realloc (in /usr/lib64/libc-2.17.so)
> >> > | |   | ->28.42% (101,711,872B) 0x668B160: realloc (in /usr/lib64/libc-2.17.so)
> >> > | |   |   ->28.42% (101,711,872B) 0x4FDC9A3: xrealloc (util.c:149)
> >> > | |   |     ->28.42% (101,711,872B) 0x4F1DEB2: ds_reserve (dynamic-string.c:63)
> >> > | |   |       ->28.42% (101,711,872B) 0x4F1DED3: ds_put_uninit (dynamic-string.c:73)
> >> > | |   |         ->28.42% (101,711,872B) 0x4F1DF0B: ds_put_char__ (dynamic-string.c:82)
> >> > | |   |           ->26.37% (94,371,840B) 0x4F2B09F: json_serialize_string (dynamic-string.h:93)
> >> > | |   |           | ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | | ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |   ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |     ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |       ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |         ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |           ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |             ->12.01% (42,991,616B) 0x4F2B3EA: json_serialize (json.c:1651)
> >> > | |   |           | |               ->12.01% (42,991,616B) 0x4F2B540: json_serialize (json.c:1626)
> >> > | |   |           | |                 ->12.01% (42,991,616B) 0x4F2D82A: json_to_ds (json.c:1525)
> >> > | |   |           | |                   ->12.01% (42,991,616B) 0x4F2EA49: jsonrpc_send (jsonrpc.c:253)
> >> > | |   |           | |                     ->12.01% (42,991,616B) 0x4C3A68A: ovsdb_jsonrpc_server_run (jsonrpc-server.c:1104)
> >> > | |   |           | |                       ->12.01% (42,991,616B) 0x10DCC1: main (ovsdb-server.c:209)
> >> >
> >> > _______________________________________________
> >> > discuss mailing list
> >> > [email protected]
> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >>
> >> Thanks for reporting the issue. Could you describe your test scenario
> >> (the operations), the scale, the db file size, and the memory (RSS)
> >> data of the NB/SB?
> >> Clustered mode maintains some extra data, such as RAFT logs, compared
> >> to standalone mode, but it should not increase forever, because RAFT
> >> logs get compacted periodically.
> >>
> >> Thanks,
> >> Han
> >
> > --
> > 刘梦馨
> > Blog: http://oilbeater.com
> > Weibo: @oilbeater

--
刘梦馨
Blog: http://oilbeater.com
Weibo: @oilbeater <http://weibo.com/oilbeater>
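The allocation tree quoted in the thread is in the format printed by valgrind's massif tool (rendered with ms_print) with page-level tracking enabled. A hedged sketch of how such a profile might be collected; the ovsdb-server arguments and paths here are placeholders, not the reporter's actual command line.

```shell
#!/bin/bash
# Sketch: collect a massif profile of ovsdb-server. --pages-as-heap=yes
# also attributes mmap/brk pages, which matches the "(page allocation
# syscalls)" header in the tree quoted above. Paths/args are placeholders.

massif_out() {                      # valgrind's default output file name
    echo "massif.out.$1"
}

if command -v valgrind >/dev/null 2>&1 && command -v ovsdb-server >/dev/null 2>&1; then
    valgrind --tool=massif --pages-as-heap=yes \
        ovsdb-server --remote=punix:/tmp/nb.sock /tmp/ovnnb_db.db &
    pid=$!
    # ... drive the ovn-nbctl workload here, then stop the server ...
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    ms_print "$(massif_out "$pid")"  # render the allocation tree
fi
```

massif writes its snapshot file when the profiled process exits, so the server has to be shut down before `ms_print` can show anything.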
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
