My apologies, the patch has some issues. I need to dig further.

Yifeng

On Tue, Jul 24, 2018 at 1:40 PM, Yifeng Sun <pkusunyif...@gmail.com> wrote:

> Hi Yun and Girish,
>
> I submitted a patch, do you mind testing and reviewing it? Thanks.
>
> [PATCH] dynamic-string: Fix a bug that leads to assertion fail
>
> diff --git a/lib/dynamic-string.c b/lib/dynamic-string.c
> index 6f7b610a9908..4564e420544d 100644
> --- a/lib/dynamic-string.c
> +++ b/lib/dynamic-string.c
> @@ -158,7 +158,7 @@ ds_put_format_valist(struct ds *ds, const char *format, va_list args_)
>      if (needed < available) {
>          ds->length += needed;
>      } else {
> -        ds_reserve(ds, ds->length + needed);
> +        ds_reserve(ds, ds->allocated + needed);
>
>          va_copy(args, args_);
>          available = ds->allocated - ds->length + 1;
>
>
> Thanks,
> Yifeng Sun
>
> On Wed, Jul 18, 2018 at 10:48 AM, Girish Moodalbail <gmoodalb...@gmail.com> wrote:
>
>> Hello all,
>>
>> We are able to reproduce this issue on OVS 2.9.2 at will. The OVSDB NB
>> server or OVSDB SB server dumps core while it is trying to compact the
>> database.
>>
>> You can reproduce the issue by using:
>>
>> root@u1804-HVM-domU:/var/crash# ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/compact OVN_Southbound
>>
>> 2018-07-18T17:34:29Z|00001|unixctl|WARN|error communicating with unix:/var/run/openvswitch/ovnsb_db.ctl: End of file
>> ovs-appctl: /var/run/openvswitch/ovnsb_db.ctl: transaction error (End of file)
>> root@u1804-HVM-domU:/var/crash#
>> root@u1804-HVM-domU:/var/crash#
>> root@u1804-HVM-domU:/var/crash# ERROR: apport (pid 17393) Wed Jul 18 10:34:23 2018: called for pid 14683, signal 6, core limit 0, dump mode 1
>> ERROR: apport (pid 17393) Wed Jul 18 10:34:23 2018: executable: /usr/sbin/ovsdb-server (command line "ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/openvswitch/ovsdb-server-sb.log --remote=punix:/var/run/openvswitch/ovnsb_db.sock --pidfile=/var/run/openvswitch/ovnsb_db.pid --unixctl=ovnsb_db.ctl --detach --monitor --remote=db:OVN_Southbound,SB_Global,connections --private-key=db:OVN_Southbound,SSL,private_key --certificate=db:OVN_Southbound,SSL,certificate --ca-cert=db:OVN_Southbound,SSL,ca_cert --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers --remote=ptcp:6642:10.0.7.33 /etc/openvswitch/ovnsb_db.db")
>> ERROR: apport (pid 17393) Wed Jul 18 10:34:23 2018: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
>> ERROR: apport (pid 17393) Wed Jul 18 10:34:29 2018: wrote report /var/crash/_usr_sbin_ovsdb-server.0.crash
>>
>> Looking through the crash we see the following stack:
>>
>> (gdb) bt
>> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
>> #1  0x00007f7c9a43c801 in __GI_abort () at abort.c:79
>> #2  0x00007f7c9aaa633c in json_serialize (json=<optimized out>, s=<optimized out>) at lib/json.c:1554
>> #3  0x00007f7c9aaa63ab in json_serialize_object_member (i=<optimized out>, s=<optimized out>, node=<optimized out>, node=<optimized out>)
>>     at lib/json.c:1583
>> #4  0x00007f7c9aaa62f2 in json_serialize_object (s=0x7ffca2173ea0, object=0x5568dc5d5b10) at lib/json.c:1612
>> #5  json_serialize (json=<optimized out>, s=0x7ffca2173ea0) at lib/json.c:1533
>> #6  0x00007f7c9aaa863c in json_to_ds (json=json@entry=0x5568dc5d4a20, flags=flags@entry=0, ds=ds@entry=0x7ffca2173f30) at lib/json.c:1511
>> #7  0x00007f7c9ae6750f in ovsdb_log_compose_record (json=json@entry=0x5568dc5d4a20, magic=0x5568dc5d5a60 "CLUSTER",
>>     header=header@entry=0x7ffca2173f10, data=data@entry=0x7ffca2173f30) at ovsdb/log.c:570
>> #8  0x00007f7c9ae677ef in ovsdb_log_write (file=0x5568dc5d5a80, json=0x5568dc5d4a20) at ovsdb/log.c:618
>> #9  0x00007f7c9ae6796e in ovsdb_log_write_and_free (log=log@entry=0x5568dc5d5a80, json=0x5568dc5d4a20) at ovsdb/log.c:651
>> #10 0x00007f7c9ae6d684 in raft_write_snapshot (raft=raft@entry=0x5568dc1e3720, log=0x5568dc5d5a80, new_log_start=new_log_start@entry=539578,
>>     new_snapshot=new_snapshot@entry=0x7ffca21740e0) at ovsdb/raft.c:3588
>> #11 0x00007f7c9ae6dbf3 in raft_save_snapshot (raft=raft@entry=0x5568dc1e3720, new_start=new_start@entry=539578,
>>     new_snapshot=new_snapshot@entry=0x7ffca21740e0) at ovsdb/raft.c:3647
>> #12 0x00007f7c9ae757bd in raft_store_snapshot (raft=0x5568dc1e3720, new_snapshot_data=new_snapshot_data@entry=0x5568dc5d49a0)
>>     at ovsdb/raft.c:3849
>> #13 0x00007f7c9ae7c7ae in ovsdb_storage_store_snapshot__ (storage=0x5568dc6b2fb0, schema=0x5568dd66f5a0, data=0x5568dca67880)
>>     at ovsdb/storage.c:541
>> #14 0x00007f7c9ae7d1de in ovsdb_storage_store_snapshot (storage=0x5568dc6b2fb0, schema=schema@entry=0x5568dd66f5a0,
>>     data=data@entry=0x5568dca67880) at ovsdb/storage.c:568
>> #15 0x00007f7c9ae69cab in ovsdb_snapshot (db=0x5568dc6b3020) at ovsdb/ovsdb.c:519
>> #16 0x00005568daec1f82 in main_loop (is_backup=0x7ffca21742be, exiting=0x7ffca21742bf, run_process=0x0, remotes=0x7ffca2174310,
>>     unixctl=0x5568dc71ade0, all_dbs=0x7ffca2174350, jsonrpc=0x5568dc1e36a0, config=0x7ffca2174370) at ovsdb/ovsdb-server.c:239
>> #17 main (argc=<optimized out>, argv=<optimized out>) at ovsdb/ovsdb-server.c:457
>>
>> Walking through the JSON objects being serialized we see that
>> "prev_servers" is malformed.
>>
>> (gdb) print *((struct shash *)0x5568dc5d5b10)
>> $3 = {
>>   map = {
>>     buckets = 0x5568dc5d1d30,
>>     one = 0x0,
>>     mask = 7,
>>     n = 9
>>   }
>> }
>>
>> (gdb) x/6a 0x5568dc5d1d30
>> 0x5568dc5d1d30: 0x5568dc5d6000 0x0
>> 0x5568dc5d1d40: 0x0 0x5568dc5d5f30
>> 0x5568dc5d1d50: 0x5568dc5d5e30 0x5568dc5d5bc0
>>
>> Let us look at the next one
>>
>> (gdb) print *((struct shash_node *)0x5568dc5d5e30)
>> $7 = {
>>   node = {
>>     hash = 2043875868,
>>     next = 0x0
>>   },
>>   name = 0x5568dc5d5e10 "prev_servers",
>>   data = 0x5568dc688cd0
>> }
>>
>> (gdb) print *((struct json *)0x5568dc688cd0)
>> $10 = {
>>   type = 3697839232,
>>   count = 34,
>>   u = {
>>     object = 0x5568dc688cb0,
>>     array = {
>>       n = 93908862799024,
>>       n_allocated = 93908862798944,
>>       elems = 0x5568dc22f050
>>     },
>>     integer = 93908862799024,
>>     real = 4.6397142949016804e-310,
>>     string = 0x5568dc688cb0 "\a"
>>   }
>> }
>>
>> So, this is malformed. Somehow "prev_servers" is getting malformed.
>>
>> That information is coming in from 'struct raft`snap`servers'
>>
>> Has anyone seen this before?
>>
>>
>> On Fri, Jul 13, 2018 at 3:49 PM, Yun Zhou <y...@nvidia.com> wrote:
>>
>> > Hi,
>> >
>> > We are running into some issues while we are trying out the 3-node raft
>> > ovsdb cluster in our lab, and hopefully we can get some help from the
>> > community.
>> >
>> > We are using ovs 2.9.2.
>> > -------------------------
>> >
>> > We found that on one of the 3 nodes, the SB ovsdb-server was not started,
>> > and was not able to be restarted because its database was already corrupted:
>> >
>> > "ovsdb-server: syntax "{"encaps":["uuid","7f0f7605-c1d1-43fb-826a-1718ea70e088"],"hostname":"nd-sdn-dgx-010"}": syntax
>> > error: hostname is not a UUID"
>> >
>> > Judging from the ovsdb-server-sb log file history, the SB ovsdb-server
>> > dumped core several days ago:
>> >
>> > "2018-07-08T06:58:15.267Z|00002|daemon_unix(monitor)|ERR|1
>> > crashes: pid 937 died, killed (Aborted), core dumped, restarting"
>> >
>> > Unfortunately, a core dump was not generated.
>> >
>> > FWIW, we saw core dumps for the NB ovsdb on all 3 cluster nodes; here is
>> > one of the stacks:
>> >
>> > (gdb) bt
>> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
>> > #1  0x00007fc48f8c2801 in __GI_abort () at abort.c:79
>> > #2  0x00007fc48ff2c33c in ?? () from /usr/lib/x86_64-linux-gnu/libopenvswitch-2.9.so.0
>> > #3  0x00007fc48ff2c2f2 in ?? () from /usr/lib/x86_64-linux-gnu/libopenvswitch-2.9.so.0
>> > #4  0x00007fc48ff2e63c in json_to_ds ()
>> >     from /usr/lib/x86_64-linux-gnu/libopenvswitch-2.9.so.0
>> > #5  0x00007fc4902ed50f in ovsdb_log_compose_record ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #6  0x00007fc4902ed7ef in ovsdb_log_write ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #7  0x00007fc4902ed96e in ovsdb_log_write_and_free ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #8  0x00007fc4902f3684 in ?? () from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #9  0x00007fc4902f3bf3 in ?? () from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #10 0x00007fc4902fb7bd in raft_store_snapshot ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #11 0x00007fc4903027ae in ?? () from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #12 0x00007fc4903031de in ovsdb_storage_store_snapshot ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #13 0x00007fc4902efcab in ovsdb_snapshot ()
>> >     from /usr/lib/x86_64-linux-gnu/libovsdb-2.9.so.0
>> > #14 0x0000561e47a8cf82 in ?? ()
>> > #15 0x00007fc48f8a3b97 in __libc_start_main (main=0x561e47a8bef0, argc=17,
>> >     argv=0x7ffe000ce2c8, init=<optimized out>, fini=<optimized out>,
>> >     rtld_fini=<optimized out>, stack_end=0x7ffe000ce2b8) at ../csu/libc-start.c:310
>> > #16 0x0000561e47a8db9a in ?? ()
>> >
>> > Please let us know if any more information is needed. Thanks very much!
>> >
>> > - Yun
>> >
>> >
>> > -----------------------------------------------------------------------------------
>> > This email message is for the sole use of the intended recipient(s) and may contain
>> > confidential information. Any unauthorized review, use, disclosure or distribution
>> > is prohibited. If you are not the intended recipient, please contact the sender by
>> > reply email and destroy all copies of the original message.
>> > -----------------------------------------------------------------------------------
>> > _______________________________________________
>> > discuss mailing list
>> > disc...@openvswitch.org
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>> >
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
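
For anyone reviewing the quoted diff, here is a minimal, standalone sketch of the two-pass vsnprintf() pattern that the patched hunk appears to follow. It is not the OVS code: struct buf, buf_reserve(), buf_put_format_valist() and buf_put_format() are hypothetical names made up for illustration, and the growth policy is an assumption. In this sketch, reserving length + needed bytes before the second pass is enough for the output to fit. Whether the same holds in lib/dynamic-string.c is not established here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string, loosely modeled on the field names seen in
 * the quoted diff (length, allocated).  Not the OVS 'struct ds' API. */
struct buf {
    char *string;      /* NUL-terminated contents, or NULL before first use. */
    size_t length;     /* Bytes in use, excluding the terminator. */
    size_t allocated;  /* Usable bytes, excluding the terminator. */
};

/* Ensures that at least 'min_length' usable bytes, plus one byte for the
 * terminator, are allocated.  The 1.5x growth factor is an assumption. */
static void
buf_reserve(struct buf *b, size_t min_length)
{
    if (min_length > b->allocated || !b->string) {
        size_t grown = b->allocated + b->allocated / 2;

        b->allocated = grown > min_length ? grown : min_length;
        b->string = realloc(b->string, b->allocated + 1);
        if (!b->string) {
            abort();
        }
        b->string[b->length] = '\0';
    }
}

/* Appends formatted output to 'b'.  The first vsnprintf() call only learns
 * how many bytes are needed when the output does not fit; the second call
 * runs after growing the buffer. */
static void
buf_put_format_valist(struct buf *b, const char *format, va_list args_)
{
    va_list args;
    size_t available;
    int needed;

    buf_reserve(b, b->length);          /* Guarantees a non-NULL string. */

    va_copy(args, args_);
    available = b->allocated - b->length + 1;
    needed = vsnprintf(&b->string[b->length], available, format, args);
    va_end(args);
    if (needed < 0) {
        abort();
    }

    if ((size_t) needed < available) {
        b->length += needed;
    } else {
        /* After this call, allocated >= length + needed, so the space
         * available below is at least needed + 1. */
        buf_reserve(b, b->length + needed);

        va_copy(args, args_);
        available = b->allocated - b->length + 1;
        needed = vsnprintf(&b->string[b->length], available, format, args);
        va_end(args);

        assert(needed >= 0 && (size_t) needed < available);
        b->length += needed;
    }
}

/* Convenience wrapper taking a variable argument list. */
static void
buf_put_format(struct buf *b, const char *format, ...)
{
    va_list args;

    va_start(args, format);
    buf_put_format_valist(b, format, args);
    va_end(args);
}
```

To try it, start from a zeroed struct buf, call buf_put_format() a few times (including once with output much longer than the current allocation), and free() the string when done.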