Re: Suspected mbuf leak with Nginx + sendfile + TLS in 12.2-STABLE
On Fri, Feb 05, 2021 at 11:54:07AM +0100, GomoR wrote: > On 2021-02-05 09:11, GomoR wrote: > >> The first step I would do if possible would be to bisect between the > >> last > >> known working version and the version that is known to be broken to > >> determine which commit introduced the problem. One thing that could > >> help > >> here is to see if you can reproduce the problem using a 12.2 kernel on > >> a > >> 12.1 world + ports. If you can, then you can limit your bisecting to > >> just > >> building new kernels which will make that process quicker. > > We have reinstalled from scratch our system with FreeBSD 12.1-RELEASE. > We then > have installed just enough of our software stack to reproduce the issue. > > No problem with a stock 12.1-RELEASE kernel, but problem arise after > installkernel > with the latest 12.2-STABLE. We then turned off all our customizations, > including > some specific sysctl.conf values. The bug didn't triggered. > > After dissecting our sysctl values, the faulty one has been identified: > > kern.ipc.maxsockbuf=157286400 > > This value is 75 times the default value (2097152). Restoring the > default value > fixes the issue. After some tests, the bug is triggered starting > somewhere to > 64 times the default value (134217728). > > There was no issue with this setting in 12.1-RELEASE, but there is in > 12.2-RELEASE. > > Do you have some insights onto why it causes that mbuf problems? In the > meantime, > we have our solution, but we are willing to help identify if that's a > kernel bug > or just a real bad idea to set maxsockbuf to such a high value. === > Each time a user downloads a file, mbuf & mbuf_clusters are raising to > reach the maximum limit in a matter of seconds. 
> Those values are asserted by 'netstat -m' as follows:
>
> Normal situation:
>
> mbuf: 256, 26031105, 16767,5974,428087938, 0, 0
> mbuf_cluster: 2048, 8135232, 18408,2704,101644203, 0, 0
>
> Warning situtation:
>
> mbuf: 256, 26031105, 2981516, 151205,1109483561, 0, 0
> mbuf_cluster: 2048, 8135232, 2983155,4201,319714617, 0, 0

===

Can you clarify what the problem is? A system under load uses more resources, and that is not a bug. Do you see more resource usage than the load accounts for? Or are the resources not freed after the load drops?

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
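The two snapshots quoted above can be compared mechanically. The following is only a sketch: the column names are an assumption (the rows look like `vmstat -z` output: size, limit, used, free, requests, fail, sleep), and the helper names `parse_zone_rows` and `compare` are hypothetical. It also checks the maxsockbuf arithmetic stated in the message (75 times the default, trigger point near 64 times).

```python
"""Compare the two zone snapshots quoted in the message above."""

# Column names are an assumption: the rows look like `vmstat -z` output
# (size, limit, used, free, requests, fail, sleep).
FIELDS = ("size", "limit", "used", "free", "requests", "fail", "sleep")

NORMAL = """\
mbuf: 256, 26031105, 16767,5974,428087938, 0, 0
mbuf_cluster: 2048, 8135232, 18408,2704,101644203, 0, 0
"""

WARNING = """\
mbuf: 256, 26031105, 2981516, 151205,1109483561, 0, 0
mbuf_cluster: 2048, 8135232, 2983155,4201,319714617, 0, 0
"""

DEFAULT_MAXSOCKBUF = 2097152    # default value quoted in the message
TUNED_MAXSOCKBUF = 157286400    # value from the reporter's sysctl.conf


def parse_zone_rows(text):
    """Return {zone_name: {field: int}} for rows shaped like 'name: n, n, ...'."""
    zones = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, rest = line.partition(":")
        values = [int(v) for v in rest.split(",")]
        zones[name.strip()] = dict(zip(FIELDS, values))
    return zones


def compare(before, after):
    """Summarize how the 'used' count of each zone changed between snapshots."""
    report = {}
    for name, old in before.items():
        new = after[name]
        report[name] = {
            "used_before": old["used"],
            "used_after": new["used"],
            "growth_factor": new["used"] / old["used"],
            "bytes_in_use_after": new["used"] * new["size"],
            "fraction_of_limit_after": new["used"] / new["limit"],
        }
    return report


if __name__ == "__main__":
    summary = compare(parse_zone_rows(NORMAL), parse_zone_rows(WARNING))
    for zone, row in summary.items():
        print(zone, row)
    print("maxsockbuf multiplier:", TUNED_MAXSOCKBUF / DEFAULT_MAXSOCKBUF)
```

Taking a third snapshot after the download load has stopped and running it through the same comparison would answer the question asked in the reply: whether the "used" counts fall back or stay high.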
Re: ses device over T-SGPIO
On Fri, Jan 29, 2021 at 10:12:03AM -0700, Alan Somers wrote:

> What does "camcontrol devlist" show?

Only 2 disks: a USB flash drive and da1 (connected via isci).
(I have currently just booted from the 12.2 installer.)

> On Fri, Jan 29, 2021 at 10:06 AM Slawa Olhovchenkov wrote:
>
> > On Fri, Jan 29, 2021 at 09:51:33AM -0700, Alan Somers wrote:
> >
> > > I've never used any tool with SGPIO. The hardware simply isn't powerful
> > > enough to be useful. sesutil works, in theory, to control the LEDs. But
> > > it's of limited usefulness since there's no way to tell which drives are
> > > installed in which slots.
> >
> > For me sesutil failed w/ "No SES device found"
> >
> > > On Fri, Jan 29, 2021 at 9:44 AM Slawa Olhovchenkov wrote:
> > >
> > > > On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:
> > > >
> > > > > The short story is: SGPIO sucks. It doesn't detect drive presence, much
> > > > > less provide physical path information. The only thing you can do with it
> > > > > is control the fault LEDs. But doing that usefully requires you to have
> > > > > some extra source of information about what drives are installed in what
> > > > > slots. Basically, you need to track that kind of information offline.
> > > > > sesutil ought to be able to control the LEDs, at least, but I've never
> > > > > personally used it with SGPIO.
> > > >
> > > > What tool you used with SGPIO?
> > > > What additional drivers need?
> > > >
> > > > > On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov wrote:
> > > > >
> > > > > > I am have Supermicro MB X9DBU-iF connected to bcakplane BPN-SAS-825TQ
> > > > > > by T-SGPIO cables. sesutil don't found any SES device.
> > > > > >
> > > > > > Is this posible to have control to this backplane?
Re: ses device over T-SGPIO
On Fri, Jan 29, 2021 at 09:51:33AM -0700, Alan Somers wrote:

> I've never used any tool with SGPIO. The hardware simply isn't powerful
> enough to be useful. sesutil works, in theory, to control the LEDs. But
> it's of limited usefulness since there's no way to tell which drives are
> installed in which slots.

For me, sesutil failed with "No SES device found".

> On Fri, Jan 29, 2021 at 9:44 AM Slawa Olhovchenkov wrote:
>
> > On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:
> >
> > > The short story is: SGPIO sucks. It doesn't detect drive presence, much
> > > less provide physical path information. The only thing you can do with it
> > > is control the fault LEDs. But doing that usefully requires you to have
> > > some extra source of information about what drives are installed in what
> > > slots. Basically, you need to track that kind of information offline.
> > > sesutil ought to be able to control the LEDs, at least, but I've never
> > > personally used it with SGPIO.
> >
> > What tool you used with SGPIO?
> > What additional drivers need?
> >
> > > On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov wrote:
> > >
> > > > I am have Supermicro MB X9DBU-iF connected to bcakplane BPN-SAS-825TQ
> > > > by T-SGPIO cables. sesutil don't found any SES device.
> > > >
> > > > Is this posible to have control to this backplane?
Re: ses device over T-SGPIO
On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:

> The short story is: SGPIO sucks. It doesn't detect drive presence, much
> less provide physical path information. The only thing you can do with it
> is control the fault LEDs. But doing that usefully requires you to have
> some extra source of information about what drives are installed in what
> slots. Basically, you need to track that kind of information offline.
> sesutil ought to be able to control the LEDs, at least, but I've never
> personally used it with SGPIO.

What tool did you use with SGPIO?
What additional drivers are needed?

> On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov wrote:
>
> > I am have Supermicro MB X9DBU-iF connected to bcakplane BPN-SAS-825TQ
> > by T-SGPIO cables. sesutil don't found any SES device.
> >
> > Is this posible to have control to this backplane?
ses device over T-SGPIO
I have a Supermicro X9DBU-iF motherboard connected to a BPN-SAS-825TQ backplane by T-SGPIO cables. sesutil doesn't find any SES device.

Is it possible to control this backplane?
ncurses in 12-stable break emacs tramp mode
Before the ncurses update, Emacs TRAMP mode received the following echo string:

_echo^H ^H^H ^H^H ^H^H ^H^H ^Hstty

After the ncurses update, the echo string is different:

_echo^M#$ _ech ^H^M#$ _ec ^H^M#$ _e ^H^M#$ _ ^H^M#$ ^Hstty icanon erase ^H cols 32767_echo

I.e., even on a `dumb` terminal, ncurses still redraws the whole line from the beginning of the string, including the prompt. This completely breaks Emacs TRAMP mode when connecting to a FreeBSD host.

Is it possible to fix this?
Re: make kernel ignore broken SATA disk
On Sun, Apr 12, 2020 at 07:08:06PM +0200, Stefan Bethke wrote: > Am 12.04.2020 um 19:03 schrieb Slawa Olhovchenkov : > > > > On Sun, Apr 12, 2020 at 06:38:10PM +0200, Stefan Bethke wrote: > > > >> > >> > >>> Am 12.04.2020 um 18:31 schrieb Slawa Olhovchenkov : > >>> > >>> On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote: > >>> > >>>> Am 12.04.2020 um 17:43 schrieb Slawa Olhovchenkov : > >>>>> > >>>>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote: > >>>>> > >>>>>> I have a server I don't have physical access to right now, which has a > >>>>>> broken SATA disk that produces mostly errors (but not entirely). > >>>>>> > >>>>>> The disk has two partitions that are part of a zpool each. I can't > >>>>>> bring the system up with this disk being online, because ZFS is trying > >>>>>> its darndest to use it. > >>>>>> > >>>>>> I already renamed the GPT partitions in the hope that ZFS would not > >>>>>> find them anymore, but it does. > >>>>>> > >>>>>> I can't gpart destroy -f ada1 because "device busy". > >>>>>> > >>>>>> Is there a way, ideally in the loader, to tell the kernel to ignore > >>>>>> ada1 and/or ahcich5? Or can I force ZFS some other way to ignore the > >>>>>> disk? I do have a spare disk I can use to replace the failed one, but > >>>>>> I can't get the machine into a state where I could even issue the > >>>>>> zpool replace command. > >>>>> > >>>>> `zpool offline pool device` if you have enoght redundancy? > >>>> > >>>> I do, but the command doesn't return. Instead, I'm getting loads of sata > >>>> error message. > >>> > >>> What you zpool configuration? > >> > >> This is from the working system. The identifiers are slightly different, > >> but the structure is identical. > > > > what about `zpool detach ` ? > > Now I can't boot into single user mode anymore, ZFS just waits forever, and > the kernel is printing an endless chain of SATA error messages. 
>
> I really need a way to remove the broken disk before ZFS tries to access it,
> or a way to stop ZFS from try to access the disk.

Is this disk only part of a mirror? Is the ZIL OK?
Re: make kernel ignore broken SATA disk
On Sun, Apr 12, 2020 at 06:38:10PM +0200, Stefan Bethke wrote: > > > > Am 12.04.2020 um 18:31 schrieb Slawa Olhovchenkov : > > > > On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote: > > > >> Am 12.04.2020 um 17:43 schrieb Slawa Olhovchenkov : > >>> > >>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote: > >>> > >>>> I have a server I don't have physical access to right now, which has a > >>>> broken SATA disk that produces mostly errors (but not entirely). > >>>> > >>>> The disk has two partitions that are part of a zpool each. I can't bring > >>>> the system up with this disk being online, because ZFS is trying its > >>>> darndest to use it. > >>>> > >>>> I already renamed the GPT partitions in the hope that ZFS would not find > >>>> them anymore, but it does. > >>>> > >>>> I can't gpart destroy -f ada1 because "device busy". > >>>> > >>>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 > >>>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I > >>>> do have a spare disk I can use to replace the failed one, but I can't > >>>> get the machine into a state where I could even issue the zpool replace > >>>> command. > >>> > >>> `zpool offline pool device` if you have enoght redundancy? > >> > >> I do, but the command doesn't return. Instead, I'm getting loads of sata > >> error message. > > > > What you zpool configuration? > > This is from the working system. The identifiers are slightly different, but > the structure is identical. what about `zpool detach ` ? > # zpool status > pool: data > state: ONLINE > status: Some supported features are not enabled on the pool. The pool can > still be used, but some features are unavailable. > action: Enable all features using 'zpool upgrade'. Once this is done, > the pool may no longer be accessible by software that does not support > the features. See zpool-features(7) for details. 
> scan: resilvered 176K in 0 days 00:01:28 with 0 errors on Sun May 26 > 21:24:54 2019 > config: > > NAME STATE READ WRITE CKSUM > data ONLINE 0 0 0 > mirror-0ONLINE 0 0 0 > gpt/ls0data ONLINE 0 0 0 > gpt/ls1data ONLINE 0 0 0 > logs > gpt/data0logONLINE 0 0 0 > cache > gpt/data0cache ONLINE 0 0 0 > > errors: No known data errors > > pool: ls-host > state: ONLINE > status: Some supported features are not enabled on the pool. The pool can > still be used, but some features are unavailable. > action: Enable all features using 'zpool upgrade'. Once this is done, > the pool may no longer be accessible by software that does not support > the features. See zpool-features(7) for details. > scan: scrub repaired 0 in 0 days 00:06:33 with 0 errors on Sun Apr 12 > 11:46:25 2020 > config: > > NAME STATE READ WRITE CKSUM > ls-host ONLINE 0 0 0 > mirror-0ONLINE 0 0 0 > gpt/ls0host ONLINE 0 0 0 > gpt/ls1host ONLINE 0 0 0 > logs > gpt/host0logONLINE 0 0 0 > cache > gpt/host0cache ONLINE 0 0 0 > > errors: No known data errors > > > -- > Stefan BethkeFon +49 151 14070811 > ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: make kernel ignore broken SATA disk
On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:

> On 12.04.2020 at 17:43, Slawa Olhovchenkov wrote:
> >
> > On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
> >
> >> I have a server I don't have physical access to right now, which has a
> >> broken SATA disk that produces mostly errors (but not entirely).
> >>
> >> The disk has two partitions that are part of a zpool each. I can't bring
> >> the system up with this disk being online, because ZFS is trying its
> >> darndest to use it.
> >>
> >> I already renamed the GPT partitions in the hope that ZFS would not find
> >> them anymore, but it does.
> >>
> >> I can't gpart destroy -f ada1 because "device busy".
> >>
> >> Is there a way, ideally in the loader, to tell the kernel to ignore ada1
> >> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do
> >> have a spare disk I can use to replace the failed one, but I can't get the
> >> machine into a state where I could even issue the zpool replace command.
> >
> > `zpool offline pool device` if you have enoght redundancy?
>
> I do, but the command doesn't return. Instead, I'm getting loads of sata
> error message.

What is your zpool configuration?
Re: make kernel ignore broken SATA disk
On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:

> I have a server I don't have physical access to right now, which has a broken
> SATA disk that produces mostly errors (but not entirely).
>
> The disk has two partitions that are part of a zpool each. I can't bring the
> system up with this disk being online, because ZFS is trying its darndest to
> use it.
>
> I already renamed the GPT partitions in the hope that ZFS would not find them
> anymore, but it does.
>
> I can't gpart destroy -f ada1 because "device busy".
>
> Is there a way, ideally in the loader, to tell the kernel to ignore ada1
> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do
> have a spare disk I can use to replace the failed one, but I can't get the
> machine into a state where I could even issue the zpool replace command.

How about `zpool offline pool device`, if you have enough redundancy?
Re: Access to NETMAP from c++ program
On Mon, Nov 25, 2019 at 03:36:21PM -0500, Ryan Stone wrote: > Remove "using namespace std;" from your program. I am don't have "using namespace std;". Example: === #include #include #include #include #include #include #include #include #include #include #include === Yes, only includes. c++ -c tatomic.cc In file included from tatomic.cc:11: In file included from /usr/include/net/netmap_user.h:104: In file included from /usr/include/net/netmap.h:816: /usr/include/stdatomic.h:186:17: error: unknown type name '_Bool' typedef _Atomic(_Bool) atomic_bool; ^ /usr/include/stdatomic.h:186:26: error: C++ requires a type specifier for all declarations typedef _Atomic(_Bool) atomic_bool; ~~ ^ /usr/include/stdatomic.h:379:17: error: unknown type name '_Bool' static __inline _Bool ^ /usr/include/stdatomic.h:383:10: error: address argument to atomic operation must be a pointer to _Atomic type ('volatile atomic_bool *' (aka 'volatile int *') invalid) return (atomic_exchange_explicit(&__object->__flag, 1, __order)); ^~ /usr/include/stdatomic.h:242:2: note: expanded from macro 'atomic_exchange_explicit' __c11_atomic_exchange(object, desired, order) ^ ~~ /usr/include/stdatomic.h:390:2: error: address argument to atomic operation must be a pointer to _Atomic type ('volatile atomic_bool *' (aka 'volatile int *') invalid) atomic_store_explicit(&__object->__flag, 0, __order); ^ ~ /usr/include/stdatomic.h:256:2: note: expanded from macro 'atomic_store_explicit' __c11_atomic_store(object, desired, order) ^ ~~ /usr/include/stdatomic.h:394:17: error: unknown type name '_Bool' static __inline _Bool ^ 6 errors generated. Ok, try ugly hack for _Bool: === #include #include #include #include #include #include #include #include #include #include typedefint _Bool; #include === No, errors. 
Now try === #include #include #include #include #include #include #include #include #include #include typedefint _Bool; #include #include #include === Many errors: In file included from tatomic.cc:13: In file included from /usr/include/c++/v1/memory:668: /usr/include/c++/v1/atomic:1166:49: error: expected ')' atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/c++/v1/atomic:1166:1: note: to match this '(' atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/stdatomic.h:176:32: note: expanded from macro 'atomic_is_lock_free' __atomic_is_lock_free(sizeof(*(obj)), obj) ^ In file included from tatomic.cc:13: In file included from /usr/include/c++/v1/memory:668: /usr/include/c++/v1/atomic:1166:1: error: expected expression atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/stdatomic.h:176:37: note: expanded from macro 'atomic_is_lock_free' __atomic_is_lock_free(sizeof(*(obj)), obj) ^ In file included from tatomic.cc:13: In file included from /usr/include/c++/v1/memory:668: /usr/include/c++/v1/atomic:1166:21: error: expected expression atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/c++/v1/atomic:1166:53: error: expected ';' at end of declaration atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/c++/v1/atomic:1166:54: error: expected unqualified-id atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/c++/v1/__config:839:21: note: expanded from macro '_NOEXCEPT' # define _NOEXCEPT noexcept ^ In file included from tatomic.cc:13: In file included from /usr/include/c++/v1/memory:668: /usr/include/c++/v1/atomic:1174:1: error: redefinition of '__atomic_is_lock_free' atomic_is_lock_free(const atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/stdatomic.h:176:2: note: expanded from macro 'atomic_is_lock_free' __atomic_is_lock_free(sizeof(*(obj)), obj) ^ /usr/include/c++/v1/atomic:1166:1: note: previous definition is here 
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/stdatomic.h:176:2: note: expanded from macro 'atomic_is_lock_free' __atomic_is_lock_free(sizeof(*(obj)), obj) ^ In file included from tatomic.cc:13: In file included from /usr/include/c++/v1/memory:668: /usr/include/c++/v1/atomic:1174:40: error: expected ')' atomic_is_lock_free(const atomic<_Tp>* __o) _NOEXCEPT ^ /usr/include/c++/v1/atomic:1174:1: note: to match this '('
Access to NETMAP from c++ program
Is it possible (now) to access NETMAP from C++? I see a header conflict:

In file included from /usr/include/net/netmap_user.h:104:
In file included from /usr/include/net/netmap.h:812:
/usr/include/stdatomic.h:141:21: error: reference to 'memory_order' is ambiguous
atomic_thread_fence(memory_order __order __unused)
                    ^
/usr/include/stdatomic.h:134:3: note: candidate found by name lookup is 'memory_order'
} memory_order;
  ^
/usr/include/c++/v1/atomic:585:3: note: candidate found by name lookup is 'std::__1::memory_order'
} memory_order;
  ^

Yes, I need this in a C++ program. Reordering the includes also doesn't work; it fails with a different error.
Re: haproxy syslog comptible
On Mon, Jun 24, 2019 at 05:42:39PM +0300, Slawa Olhovchenkov wrote:

> On Mon, Jun 24, 2019 at 10:35:03AM -0400, Paul Mather wrote:
>
> > On Jun 24, 2019, at 10:17 AM, Slawa Olhovchenkov wrote:
> >
> > > I am use haproxy logged to syslog and have log lines like this:
> > >
> > > Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625
> > > [24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504
> > > 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"
> > >
> > > Is this posible to learn syslogd to use mileseconds timestamps?
> >
> > Run syslogd with "-O syslog" to get timestamps logged with microsecond
> > precision (as well as time zones). You can add that to your
> > "syslogd_flags" setting in /etc/rc.conf. (See man syslogd for details.)
> >
> > Note that the format of syslog entries changes with "-O syslog". You get
> > logs like this:
> >
> > <38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - - Received signal 15; terminating.
> > <38>1 2019-04-12T10:48:05.058693-04:00 x.x.net sshd 1238 - - Server listening on :: port 22.
> >
> > (Note that the precision also depends upon the client application logging
> > to syslog.)
>
> I mean you talk about different syslogd, not from FreeBSD:
>
> syslogd: illegal option -- O
> usage: syslogd [-468ACcdFknosTuv] [-a allowed_peer]
>                [-b bind_address] [-f config_file]
>                [-l [mode:]path] [-m mark_interval]
>                [-P pid_file] [-p log_socket]

Ah, I see: I need the syslogd from FreeBSD 12, thanks.
Re: haproxy syslog comptible
On Mon, Jun 24, 2019 at 10:35:03AM -0400, Paul Mather wrote:

> On Jun 24, 2019, at 10:17 AM, Slawa Olhovchenkov wrote:
>
> > I am use haproxy logged to syslog and have log lines like this:
> >
> > Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625
> > [24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504
> > 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"
> >
> > Is this posible to learn syslogd to use mileseconds timestamps?
>
> Run syslogd with "-O syslog" to get timestamps logged with microsecond
> precision (as well as time zones). You can add that to your
> "syslogd_flags" setting in /etc/rc.conf. (See man syslogd for details.)
>
> Note that the format of syslog entries changes with "-O syslog". You get
> logs like this:
>
> <38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - - Received signal 15; terminating.
> <38>1 2019-04-12T10:48:05.058693-04:00 x.x.net sshd 1238 - - Server listening on :: port 22.
>
> (Note that the precision also depends upon the client application logging
> to syslog.)

I think you are talking about a different syslogd, not the one from FreeBSD:

syslogd: illegal option -- O
usage: syslogd [-468ACcdFknosTuv] [-a allowed_peer]
               [-b bind_address] [-f config_file]
               [-l [mode:]path] [-m mark_interval]
               [-P pid_file] [-p log_socket]
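The sample lines quoted in this message have the RFC 5424 layout (priority, version, timestamp, host, application, process ID, message ID, structured data, message). A minimal sketch of reading one such line follows; the function name `parse_rfc5424` is hypothetical, and the sketch assumes the structured-data field is the single token "-" as in the quoted sample, so it is not a general parser.

```python
from datetime import datetime

# One of the sample lines from the quoted reply, with the wrapped text rejoined.
SAMPLE = ("<38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - - "
          "Received signal 15; terminating.")


def parse_rfc5424(line):
    """Split one RFC 5424-style line into its header fields and message.

    Assumes the structured-data field contains no spaces (here it is "-").
    """
    header, timestamp, host, app, procid, msgid, sdata, msg = line.split(" ", 7)
    pri_text, _, version = header[1:].partition(">")
    pri = int(pri_text)
    return {
        "facility": pri // 8,   # PRI = facility * 8 + severity
        "severity": pri % 8,
        "version": int(version),
        "timestamp": datetime.fromisoformat(timestamp),
        "host": host,
        "app": app,
        "procid": procid,
        "msgid": msgid,
        "structured_data": sdata,
        "message": msg,
    }


if __name__ == "__main__":
    record = parse_rfc5424(SAMPLE)
    print(record["timestamp"].isoformat(), record["app"], record["message"])
```

The timestamp keeps its microsecond field and UTC offset after parsing, which is the extra precision the quoted reply describes.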
haproxy syslog comptible
I use haproxy logging to syslog and get log lines like this:

Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625 [24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"

Is it possible to teach syslogd to use millisecond timestamps?
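In the log line above, the syslog prefix carries only whole seconds, while the bracketed timestamp written by haproxy itself already has milliseconds. A small sketch of extracting it follows. The function names are hypothetical, and reading the fifth slash-separated timer (2012) as the total time in milliseconds is an assumption about the haproxy HTTP log format, not something stated in the thread.

```python
import re
from datetime import datetime, timedelta

# The log line from the message above.
LINE = ('Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625 '
        '[24/Jun/2019:17:04:23.277] balancer~ default-pool/main '
        '0/0/0/-1/2012 504 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"')

# Explicit month table so parsing does not depend on the process locale.
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

STAMP_RE = re.compile(
    r"\[(\d{2})/([A-Z][a-z]{2})/(\d{4}):(\d{2}):(\d{2}):(\d{2})\.(\d{3})\]")
TIMERS_RE = re.compile(r"(-?\d+)/(-?\d+)/(-?\d+)/(-?\d+)/(\+?\d+)")


def accept_time(line):
    """Return the bracketed haproxy timestamp as a datetime with milliseconds."""
    match = STAMP_RE.search(line)
    if match is None:
        raise ValueError("no bracketed timestamp found")
    day, month, year, hour, minute, second, millis = match.groups()
    return datetime(int(year), MONTHS[month], int(day),
                    int(hour), int(minute), int(second), int(millis) * 1000)


def total_time_ms(line):
    """Return the fifth timer after the timestamp (assumed: total time in ms)."""
    stamp = STAMP_RE.search(line)
    if stamp is None:
        raise ValueError("no bracketed timestamp found")
    timers = TIMERS_RE.search(line, stamp.end())
    if timers is None:
        raise ValueError("no timer block found")
    return int(timers.group(5))


if __name__ == "__main__":
    start = accept_time(LINE)
    end = start + timedelta(milliseconds=total_time_ms(LINE))
    print("accepted:", start.isoformat(timespec="milliseconds"))
    print("finished:", end.isoformat(timespec="milliseconds"))
```

Under that assumption, the computed finish time falls in the same second as the syslog prefix (17:04:25), which suggests the prefix is stamped when the line is logged rather than when the request arrived.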
Re: FreeBSD 11.3-BETA3 Now Available
On Mon, Jun 10, 2019 at 03:13:31PM +, Glen Barber wrote:

> On Sat, Jun 08, 2019 at 01:39:49PM +0300, Slawa Olhovchenkov wrote:
> > On Fri, Jun 07, 2019 at 10:26:34PM +, Glen Barber wrote:
> >
> > > The third BETA build of the 11.3-RELEASE release cycle is now available.
> >
> > Can some one from re@ do MFC r348772 to 11.3-RELEASE before release?
> > This is important fix.
>
> The MFC timer for the change in question is 2 weeks, presumably to allow
> time to detect any issues in 13-CURRENT before the merge is done to
> stable/12 and stable/11. The change in question was committed three
> days ago.
>
> I have CC'd the original committer, nonetheless.

I am asking about including the MFCed commit in the RELEASE image, not in the stable branch.
Re: FreeBSD 11.3-BETA3 Now Available
On Fri, Jun 07, 2019 at 10:26:34PM +, Glen Barber wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> The third BETA build of the 11.3-RELEASE release cycle is now available.

Can someone from re@ MFC r348772 to 11.3-RELEASE before the release?
This is an important fix.

Thanks
Re: FreeBSD-11: Fatal trap 9: general protection fault while in kernel mode (in key_addref())
On Wed, Feb 27, 2019 at 11:54:20PM +0300, Slawa Olhovchenkov wrote: > Is this known issuse? > > Fatal trap 9: general protection fault while in kernel mode > cpuid = 13; apic id = 2a > instruction pointer = 0x20:0x806b6a94 > stack pointer = 0x28:0xfe2026e274f0 > frame pointer = 0x28:0xfe2026e274f0 > code segment= base 0x0, limit 0xf, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags= interrupt enabled, resume, IOPL = 0 > current process = 12 (irq295: t5nex0:0a5) > trap number = 9 > panic: general protection fault > cpuid = 13 > KDB: stack backtrace: > db_trace_self_wrapper() at 0x8032667b = > db_trace_self_wrapper+0x2b/frame 0xfe2026e27130 > vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0 > panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210 > trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260 > trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420 > calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420 > --- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = > 0xfe2026e274f0 --- > key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0 > ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame > 0xfe2026e27530 > ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame > 0xfe2026e275d0 > ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame > 0xfe2026e27600 > tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740 > ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0 > netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame > 0xfe2026e277f0 > ether_demux() at 0x805b43ff = ether_demux+0x13f/frame > 0xfe2026e27820 > ether_nh_input() at 0x805b506b = ether_nh_input+0x31b/frame > 0xfe2026e27880 > netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame > 0xfe2026e278d0 > ether_input() at 0x805b4676 = ether_input+0x26/frame > 0xfe2026e278f0 > t4_eth_rx() at 0x816403b3 = t4_eth_rx+0x103/frame 0xfe2026e27910 > service_iq() at 
0x81644886 = service_iq+0x4a6/frame 0xfe2026e279c0 > t4_intr() at 0x81644b3e = t4_intr+0x2e/frame 0xfe2026e279e0 > intr_event_execute_handlers() at 0x804871ac = > intr_event_execute_handlers+0xec/frame 0xfe2026e27a20 > ithread_loop() at 0x80487846 = ithread_loop+0xd6/frame > 0xfe2026e27a70 > fork_exit() at 0x80484805 = fork_exit+0x85/frame 0xfe2026e27ab0 > fork_trampoline() at 0x80735cae = fork_trampoline+0xe/frame > 0xfe2026e27ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > Uptime: 657d14h33m52s kgdb decode: Unread portion of the kernel message buffer: Fatal trap 9: general protection fault while in kernel mode cpuid = 13; apic id = 2a instruction pointer = 0x20:0x806b6a94 stack pointer = 0x28:0xfe2026e274f0 frame pointer = 0x28:0xfe2026e274f0 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 12 (irq295: t5nex0:0a5) trap number = 9 panic: general protection fault cpuid = 13 KDB: stack backtrace: db_trace_self_wrapper() at 0x8032667b = db_trace_self_wrapper+0x2b/frame 0xfe2026e27130 vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0 panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210 trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260 trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420 calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420 --- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = 0xfe2026e274f0 --- key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0 ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame 0xfe2026e27530 ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame 0xfe2026e275d0 ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame 0xfe2026e27600 tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740 ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0 netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 0xfe2026e277f0 
ether_demux() at 0x805b43ff = ether_demux+0x13f/frame 0xfe2026e27820 ether_nh_input() at 0x8
FreeBSD-11: Fatal trap 9: general protection fault while in kernel mode (in key_addref())
Is this a known issue?

Fatal trap 9: general protection fault while in kernel mode
cpuid = 13; apic id = 2a
instruction pointer = 0x20:0x806b6a94
stack pointer = 0x28:0xfe2026e274f0
frame pointer = 0x28:0xfe2026e274f0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq295: t5nex0:0a5)
trap number = 9
panic: general protection fault
cpuid = 13
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032667b = db_trace_self_wrapper+0x2b/frame 0xfe2026e27130
vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0
panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210
trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260
trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420
calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420
--- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = 0xfe2026e274f0 ---
key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0
ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame 0xfe2026e27530
ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame 0xfe2026e275d0
ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame 0xfe2026e27600
tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740
ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0
netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 0xfe2026e277f0
ether_demux() at 0x805b43ff = ether_demux+0x13f/frame 0xfe2026e27820
ether_nh_input() at 0x805b506b = ether_nh_input+0x31b/frame 0xfe2026e27880
netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 0xfe2026e278d0
ether_input() at 0x805b4676 = ether_input+0x26/frame 0xfe2026e278f0
t4_eth_rx() at 0x816403b3 = t4_eth_rx+0x103/frame 0xfe2026e27910
service_iq() at 0x81644886 = service_iq+0x4a6/frame 0xfe2026e279c0
t4_intr() at 0x81644b3e = t4_intr+0x2e/frame 0xfe2026e279e0
intr_event_execute_handlers() at 0x804871ac = intr_event_execute_handlers+0xec/frame 0xfe2026e27a20
ithread_loop() at 0x80487846 = ithread_loop+0xd6/frame 0xfe2026e27a70
fork_exit() at 0x80484805 = fork_exit+0x85/frame 0xfe2026e27ab0
fork_trampoline() at 0x80735cae = fork_trampoline+0xe/frame 0xfe2026e27ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 657d14h33m52s
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
ZFS boot code regression
1. gptzfsboot from 12 is incompatible with the loader from 11 ("kernel not found").
2. The loader from 12 is incompatible with the kernel from 11 (ZFS file system unknown).
FreeBSD-12 build question
1. How can I build FreeBSD-12 release media on a FreeBSD-11 system? Currently the process fails with 'Abort trap' and leaves these core files:

585191 121 -rw---1 root wheel 8962048 Dec 5 18:58 ./ldconfig.core
585199 121 -rw---1 root wheel 8953856 Dec 5 18:58 ./usr/obj/usr/src/amd64.amd64/mktemp.core
585200 249 -rw---1 root wheel 9641984 Dec 5 18:58 ./usr/obj/usr/src/amd64.amd64/make.core

2. How can I update FreeBSD-11 (ZFS on root) to FreeBSD-12 from source? With the new kernel, ZFS is not mounted and /bin/sh crashes.
Re: iostat busy value calculation
On Wed, Jun 20, 2018 at 07:37:20PM +0200, Miroslav Lachman wrote:

> > %busy comes from the devstat layer. It's defined as the percent of the
> > time over the polling interval in which at least one transaction was
> > awaiting completion by the lower layers. It's an imperfect measure of
> > how busy the drives are (in ye-olden days, before tagged queuing and
> > NCQ, it was OK because you had THE transaction pending and it was a good
> > measure of how utilized things were. Now with concurrent I/O in flash
> > devices, it's only an imperfect approximation).
>
> Yes, I am aware of this issue. This percentage is just "is it slightly
> loaded or heavily loaded" indicator.

To judge "heavily loaded", use the average transaction time and the average queue length.
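To illustrate the difference between %busy and the two suggested metrics, here is a minimal sketch. It is not how devstat is implemented; the function name, the (start, end) representation of transactions and the millisecond units are all assumptions made for the example. It computes %busy as the share of the interval with at least one transaction outstanding, and compares it with the average queue length and the average transaction time.

```python
def busy_and_queue(transactions, interval):
    """Hypothetical helper. transactions: list of (start, end) times in ms,
    all inside [0, interval]. Returns (%busy, avg queue length, avg time)."""
    # %busy: length of the union of all transaction intervals
    busy = 0
    cur_start = cur_end = None
    for start, end in sorted(transactions):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start

    total_time = sum(end - start for start, end in transactions)
    avg_queue = total_time / interval            # time-averaged outstanding count
    avg_time = total_time / len(transactions) if transactions else 0.0
    return 100.0 * busy / interval, avg_queue, avg_time


# Ten 100 ms transactions back to back: one outstanding at any moment.
serial = [(i * 100, (i + 1) * 100) for i in range(10)]
# Eight transactions outstanding for the whole second.
parallel = [(0, 1000)] * 8

print(busy_and_queue(serial, 1000))
print(busy_and_queue(parallel, 1000))
```

Both workloads come out at 100% busy, but the queue length (1 versus 8) and the transaction time (100 ms versus 1000 ms) tell them apart, which is the point made in the reply.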
Re: ZFS+find(1) wiring all RAM
On Thu, Jun 07, 2018 at 07:04:29PM +0930, Shane Ambler wrote:

> On 07/06/2018 16:09, Peter Jeremy wrote:
> > I've noticed that 11-stable/amd64 has been wiring seemingly excessive
> > amounts of RAM for some time (the problem goes back at least 6 months).
> > This extends to getting ENOMEM errors from g_io_deliver() and out-of-swap
> > errors killing processes on a low-memory system. I'm not sure when it
> > started but it seems to have gotten worse between r331535 and r334494.
>
> Don't know if this will help you at all --
>
> I have seen excess wired for a few years, since 10.1, I now run
> 11-stable, my experience has seen the severity varying over time.
>
> I first reported this 28/10/2014 related to heavy disk use on a zpool.
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194654
>
> The inability to get solid repeatable steps to reproduce have prevented
> me from chasing this more.

Can you try https://reviews.freebsd.org/D7538 ?
In this patch I am trying to resolve the problem of memory that is wired by the ARC but left unused.
vmstat -m stranges
# vmstat -m|grep temp
 Type InUse               MemUse HighUse Requests  Size(s)
 temp    60  18014398509481829K       - 32350974  16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536

Is this normal?

SVN rev: r328463
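A quick arithmetic check on that MemUse figure, as a sketch only: it assumes the counter behind the column is an unsigned 64-bit byte count printed in KiB, which the post itself does not state. Under that assumption the value sits just below the point where such a counter wraps, which is what a small negative total would look like.

```python
reported_kib = 18014398509481829      # MemUse from the vmstat -m line, in KiB
wrap_kib = 2**64 // 1024              # 2**64 bytes expressed in KiB (= 2**54)

# Distance below the wrap point: how far "below zero" the counter would be
# if it is really an underflowed unsigned 64-bit value (assumption).
deficit_kib = wrap_kib - reported_kib
print(deficit_kib)                    # 155
```

So the line is consistent with the temp type having accounted roughly 155 KiB more frees than allocations. Whether that is an accounting bug or something else cannot be told from this output alone.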
Re: KBI unexpexted change in stable/11 ?
On Wed, Mar 28, 2018 at 11:25:08PM -0700, Kevin Oberman wrote:

> > > > r325665 is previos point and is good.
> > > > r331615 crashed.
> > > > Can I use some script for bisect?
> > >
> > > I'm not aware of a script for this. The only tool I've used is "git
> > > bisect", which is very handy if you're already familiar with git.
> >
> > You may want to try devel/p5-App-SVN-Bisect. Never used it, so
> > no idea if it's functional or helpful, just found it doing a quick
> > search
>
> It would be nice if this could be fixed, but it is the case.

r328475 bad (tzdata)
r328469 in progress (kib, sys/vm)
r328463 in progress (don't touch kernel)
r328462 good

I suspect r328469 breaks the KBI.
Re: KBI unexpexted change in stable/11 ?
On Wed, Mar 28, 2018 at 10:29:10AM -0500, Eric van Gyzen wrote:

> On 03/28/2018 08:09, Slawa Olhovchenkov wrote:
> > I am upgrade system to latest -STABLE and now see kernel crash:
> >
> > - loading virtualbox modules build on 11.1-RELEASE-p6
> > - loading nvidia module build on 11.1-RELEASE-p6 and start xdm
> >
> > Is this expected? I am mean about loading modules builded on
> > 11.1-RELEASE on any 11.1-STABLE.
>
> This is not expected. Can you bisect to find the stable/11 commit that
> broke this?
>
> If you can roll back to 11.1-RELEASE, you could probably just
> buildkernel and installkernel from various points along stable/11. That
> would save a lot of time by avoiding buildworld.

r325665 is the previous point and is good.
r331615 crashes.
Is there a script I can use for bisecting?
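For reference, the bisection itself is a plain binary search over the revision range. The sketch below is not an existing tool: `first_bad` and the `is_bad` callback are hypothetical names, and in real use `is_bad` would have to check out the revision, build and install the kernel, and report whether the modules crash. Here it is replaced by a simulated predicate so the example runs on its own.

```python
def first_bad(revisions, is_bad):
    """Binary search for the first bad revision.

    revisions: ascending list; revisions[0] is known good and
    revisions[-1] is known bad. is_bad(rev) -> bool tests one revision.
    """
    lo, hi = 0, len(revisions) - 1      # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Simulation only: pretend the breakage entered at some revision and count
# how many kernels would have to be built to find it.
SIMULATED_CULPRIT = 328469              # stand-in value for the demo
tested = []

def simulated_is_bad(rev):
    tested.append(rev)
    return rev >= SIMULATED_CULPRIT

culprit = first_bad(list(range(325665, 331616)), simulated_is_bad)
print(culprit, len(tested))
```

Even treating every number between r325665 and r331615 as a candidate, at most 13 kernels need to be built. In practice the list would be restricted to revisions that actually touched stable/11, which shortens it further.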
Re: KBI unexpexted change in stable/11 ?
On Wed, Mar 28, 2018 at 05:13:48PM +0200, Gregory Byshenk wrote:

> On Wed, Mar 28, 2018 at 05:35:51PM +0300, Slawa Olhovchenkov wrote:
> > On Wed, Mar 28, 2018 at 03:39:46PM +0200, Gregory Byshenk wrote:
> > >
> > > Did you rebuild your virtualbox and nvidia modules for your new
> > > kernel? If you build a new kernel, then you need to rebuild any
> > > modules that were installed from ports for the new version.
> >
> > Only for -CURRENT, not for -STABLE:
> >
> > https://lists.freebsd.org/pipermail/svn-src-all/2018-March/159649.html
> >
> > John Baldwin: "In theory we try to not break existing kernel
> > modules on a stable branch. That is, one should be able to kldload an
> > if_iwn.ko built on 11.0 on a 11-stable kernel."
>
> I understand that this is true for standard kernel modules, but
> I have frequently run into problems loading _ports_ modules on
> a new kernel - including in STABLE.

This is a general _kernel_ rule. It applies to loading all modules, and it matters mostly for third-party modules.
Re: KBI unexpexted change in stable/11 ?
On Wed, Mar 28, 2018 at 03:39:46PM +0200, Gregory Byshenk wrote:

> On Wed, Mar 28, 2018 at 04:09:04PM +0300, Slawa Olhovchenkov wrote:
> > I am upgrade system to latest -STABLE and now see kernel crash:
> >
> > - loading virtualbox modules build on 11.1-RELEASE-p6
> > - loading nvidia module build on 11.1-RELEASE-p6 and start xdm
> >
> > Is this expected? I am mean about loading modules builded on
> > 11.1-RELEASE on any 11.1-STABLE.
>
> Did you rebuild your virtualbox and nvidia modules for your new
> kernel? If you build a new kernel, then you need to rebuild any
> modules that were installed from ports for the new version.

Only for -CURRENT, not for -STABLE:

https://lists.freebsd.org/pipermail/svn-src-all/2018-March/159649.html

John Baldwin: "In theory we try to not break existing kernel
modules on a stable branch. That is, one should be able to kldload an
if_iwn.ko built on 11.0 on a 11-stable kernel."
KBI unexpexted change in stable/11 ?
I upgraded my system to the latest -STABLE and now see kernel crashes when:

- loading virtualbox modules built on 11.1-RELEASE-p6
- loading the nvidia module built on 11.1-RELEASE-p6 and starting xdm

Is this expected? I mean loading modules built on 11.1-RELEASE on any 11.1-STABLE kernel.
Re: FreeBSD 11.1 ixl(4) interface does not negotiate at 100 Mbit/s
On Mon, Mar 19, 2018 at 03:53:03PM +0100, Patrick M. Hausen wrote:

> Hi all,
>
> any ideas why a current RELENG_11_1 system with ixl(4)
> onboard interfaces might not negotiate with a switch that
> has only fast ethernet?
>
> status: no carrier                          on the host
> line protocol is down (notconnect)          on the switch
>
> dmesg:
> https://imgur.com/9ri9is8

The "Intel® Ethernet Controller X710/XXV710/XL710 Feature Support Matrix" doesn't show 100Mb link support in any software release for X710/XL710.

The "Intel® Ethernet Connection X722 Feature Support Matrix" doesn't contain the string "100" at all.
Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)
On Tue, Aug 08, 2017 at 01:49:08PM +0200, Hans Petter Selasky wrote:

> On 08/08/17 13:33, Slawa Olhovchenkov wrote:
> > TW_RUNLOCK(V_tw_lock);
> > and
> > if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {
> >
> > `inp` can be invalidated, freed and this pointer may be invalid?
>
> If you look one line up there is a pcbref ??

Yes.
Can a different thread take this inp and free it?
Maybe the timer thread?
Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)
On Tue, Aug 08, 2017 at 10:31:33AM +0200, Hans Petter Selasky wrote:

> Here is the conclusion:
>
> The following code is going in an infinite loop:
>
>
> >         for (;;) {
> >                 TW_RLOCK(V_tw_lock);
> >                 tw = TAILQ_FIRST(&V_twq_2msl);
> >                 if (tw == NULL || (!reuse && (tw->tw_time - ticks) > 0)) {
> >                         TW_RUNLOCK(V_tw_lock);
> >                         break;
> >                 }
> >                 KASSERT(tw->tw_inpcb != NULL, ("%s: tw->tw_inpcb == NULL",
> >                     __func__));
> >
> >                 inp = tw->tw_inpcb;
> >                 in_pcbref(inp);
> >                 TW_RUNLOCK(V_tw_lock);
> >
> >                 if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
> >
> >                         INP_WLOCK(inp);
> >                         tw = intotw(inp);
> >                         if (in_pcbrele_wlocked(inp)) {
>
> in_pcbrele_wlocked() returns (1) because INP_FREED (16) is set in
> inp->inp_flags2. I guess you have invariants disabled, because the
> KASSERT() below should have caused a panic.
>
> >                                 KASSERT(tw == NULL, ("%s: held last inp "
> >                                     "reference but tw not NULL", __func__));
> >                                 INP_INFO_RUNLOCK(&V_tcbinfo);
> >                                 continue;
> >                         }
>
> This is a regression issue after:
>
> > commit 5630210a7f1dbbd903b77b2aef939cd47c63da58
> > Author: jch
> > Date:   Thu Oct 30 08:53:56 2014 +
> >
> >     Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and
> >     tcp_tw_2msl_scan(). This race condition drives unplanned timewait
> >     timeout cancellation. Also simplify implementation by holding inpcb
> >     reference and removing tcptw reference counting.
>
> Suggested fix attached.

Hmm, I am not sure. IMHO, between

    TW_RUNLOCK(V_tw_lock);

and

    if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {

`inp` can be invalidated and freed, so this pointer may be invalid?

> Index: sys/netinet/tcp_timewait.c
> ===
> --- sys/netinet/tcp_timewait.c  (revision 321981)
> +++ sys/netinet/tcp_timewait.c  (working copy)
> @@ -709,10 +709,11 @@
>                         INP_WLOCK(inp);
>                         tw = intotw(inp);
>                         if (in_pcbrele_wlocked(inp)) {
> -                               KASSERT(tw == NULL, ("%s: held last inp "
> -                                   "reference but tw not NULL", __func__));
>                                 INP_INFO_RUNLOCK(&V_tcbinfo);
> -                               continue;
> +                               if (tw == NULL)
> +                                       continue;
> +                               else
> +                                       break;  /* INP_FREED flag is set */
>                         }
>
>                         if (tw == NULL) {
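The control flow described in the quoted analysis can be modelled outside the kernel. The following is only a toy model, not the kernel code: `scan_2msl`, the dict-based queue and the `max_passes` cap are inventions for the example, and locking and reference counting are not modelled at all. It shows why `continue` never makes progress when the head entry keeps reporting "last reference released" without leaving the queue, and how the proposed `break` bounds the loop.

```python
def scan_2msl(queue, patched, max_passes=1000):
    """Toy model of the timewait scan loop; returns the number of passes.

    queue: list of dicts, head first. An entry with "inp_freed": True stands
    for a timewait entry whose inpcb already has INP_FREED set, so releasing
    the reference reports 1 while the entry stays at the head of the queue.
    max_passes stands in for "forever".
    """
    passes = 0
    while passes < max_passes:
        passes += 1
        if not queue:                 # tw == NULL: nothing left to expire
            break
        tw = queue[0]                 # models TAILQ_FIRST()
        if tw["inp_freed"]:           # models in_pcbrele_wlocked() returning 1
            if patched:
                break                 # patched code: tw != NULL, stop scanning
            continue                  # original code: retry with the same head
        queue.pop(0)                  # normal path: entry is closed and removed
    return passes


def stuck_queue():
    return [{"inp_freed": False}, {"inp_freed": True}, {"inp_freed": False}]

print(scan_2msl(stuck_queue(), patched=False))   # hits the max_passes cap
print(scan_2msl(stuck_queue(), patched=True))    # stops at the stuck entry
```

The model says nothing about the question raised in the reply, namely whether `inp` can still be dereferenced safely after the timewait lock is dropped; that depends on the reference taken by in_pcbref().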
Re: Mega ZFS MFCs
On Thu, Jul 27, 2017 at 04:29:52PM +0300, Alexander Motin wrote:

> Hi Mike,
>
> On 27.07.2017 16:21, Mike Tancsa wrote:
> > I noticed quite a few MFCs to RELENG_11 around zfs yesterday and today.
> > First off, thank you for all these fixes/enhancements! Of the some 60
> > MFCs, are there any particular ones to be more aware of when updating
> > servers ?
>
> The most complicated and invasive to me looks r321610 "8021 ARC buf data
> scatter-ization". It took 5 fix commits to make it behave in head, but
> Andriy told me it should be good now, and I run it on my systems too.
>
> > Are there any more to come, or is now a good time to test things out ?
>
> I've merged all we had in head (except couple gptzfsboot commits
> significantly increasing its size, that could break POLA). Next round
> will any way go to head first, so stable/11 should probably be idle for
> a month at least and should be good for testing now.

Any chance for PR218043 and D7538?
Re: Lock contention in AIO
On Wed, Apr 12, 2017 at 05:21:02PM -0700, Adrian Chadd wrote:

> It's the same pages, right?

Perhaps.

> Is it just the refcounting locking that's
> causing it?

Don't know.

> I think the biggest thing here is to figure out how to have pages have
> a lifecycle where the refcount can be inc/dec (obviously >1, ie not in
> a state where you can dec to 0) via atomics, without grabbing a lock.
> That'll make this particular use case much faster.
>
> (dfbsd does this.)

I can try your patch.

> -a
>
>
> On 21 March 2017 at 09:42, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> > I am see lock contention caused by aio read (same file segment from
> > multiple process simultaneous):
> >
> > 07.74% [26756] lock_delay @ /boot/kernel/kernel
> >  92.21% [24671] __mtx_lock_sleep
> >   52.14% [12864] vm_page_enqueue
> >    100.0% [12864] vm_fault_hold
> >     87.71% [11283] vm_fault_quick_hold_pages
> >      100.0% [11283] vn_io_fault1
> >       100.0% [11283] vn_io_fault
> >        99.88% [11270] aio_process_rw
> >         100.0% [11270] aio_daemon
> >          100.0% [11270] fork_exit
> >        00.12% [13] dofileread
> >         100.0% [13] kern_readv
> >
> > Is this know problem?
/dev/dri registration
I have a strange issue on stable/10:

# devinfo -v
nexus0
  apic0
  ram0
  acpi0
    [...]
    pcib0 pnpinfo _HID=PNP0A08 _UID=0 at handle=\_SB_.PCI0
      pci0
        hostb0 pnpinfo vendor=0x8086 device=0xd130 subvendor=0x1014 subdevice=0x03ce class=0x06 at slot=0 function=0 dbsf=pci0:0:0:0
        pcib1 pnpinfo vendor=0x8086 device=0xd138 subvendor=0x1014 subdevice=0x03ce class=0x060400 at slot=3 function=0 dbsf=pci0:0:3:0 handle=\_SB_.PCI0.P0P2
          pci1
            vgapci0 pnpinfo vendor=0x10de device=0x0a20 subvendor=0x1458 subdevice=0x34d6 class=0x03 at slot=0 function=0 dbsf=pci0:1:0:0
              drm0
              drmn0
              nvidia0

But /dev/dri doesn't exist!

# kldstat
Id Refs Address    Size    Name
 1   80 0x8020     17e87f8 kernel
 2    1 0x819e9000 309780  zfs.ko
 3    2 0x81cf3000 6040    opensolaris.ko
 4    1 0x81cfa000 7aa58   if_em.ko
 5    1 0x81d75000 29bd0   drm.ko
 6    1 0x81d9f000 82898   drm2.ko
 7    2 0x81e22000 6298    iicbus.ko
 8    1 0x81e29000 1c650   uart.ko
 9    1 0x82011000 56f3    fdescfs.ko
10    1 0x82017000 a681    linprocfs.ko
11    3 0x82022000 7522    linux_common.ko
12    1 0x8202a000 5673    linsysfs.ko
13    1 0x8203     364c    ums.ko
14    1 0x82034000 10226   snd_uaudio.ko
15    1 0x82045000 2ba8    uhid.ko
16    3 0x82048000 4e626   vboxdrv.ko
17    2 0x82097000 2b82    vboxnetflt.ko
18    2 0x8209a000 ba2f    netgraph.ko
19    1 0x820a6000 414f    ng_ether.ko
20    1 0x820ab000 3fd4    vboxnetadp.ko
21    2 0x820af000 3d5da   linux.ko
22    1 0x820ed000 964496  nvidia.ko

What is wrong in my setup?
Lock contention in AIO
I am seeing lock contention caused by aio reads (the same file segment read from multiple processes simultaneously):

07.74% [26756] lock_delay @ /boot/kernel/kernel
 92.21% [24671] __mtx_lock_sleep
  52.14% [12864] vm_page_enqueue
   100.0% [12864] vm_fault_hold
    87.71% [11283] vm_fault_quick_hold_pages
     100.0% [11283] vn_io_fault1
      100.0% [11283] vn_io_fault
       99.88% [11270] aio_process_rw
        100.0% [11270] aio_daemon
         100.0% [11270] fork_exit
       00.12% [13] dofileread
        100.0% [13] kern_readv

Is this a known problem?
Re: about that DFBSD performance test
On Wed, Mar 08, 2017 at 09:00:34AM +0500, Eugene M. Zheganin wrote:

> Hi.
>
> Some have probably seen this already -
> http://lists.dragonflybsd.org/pipermail/users/2017-March/313254.html
>
> So, could anyone explain why FreeBSD was owned that much. Test is split
> into two parts, one is nginx part, and the other is the IPv4 forwarding

There is a third: the UFS part. Multiple simultaneous accesses to the same file/block can cause page lock congestion.

> part. I understand that nginx ownage was due to SO_REUSEPORT feature,
> which we do formally have, but in DFBSD and Linux it does provide a
> kernel socket multiplexor, which eliminates locking, and ours does not.
> I have only found traces of discussion that DFBSD implementation is too
> hackish. Well, hackish or not, but it's 4 times faster, as it turns out.
> The IPv4 forwarding loss is pure defeat though.
>
> Please note that although they use HEAD in these tests, they also mention
> that this is the GENERIC-NODEBUG kernel which means this isn't related
> to the WITNESS stuff.
>
> Please also don't consider this trolling, I'm a big FreeBSD fan through
> the years, so I'm asking because I'm kind of concerned.
Re: slow machine, swap in use, but more than 5GB of RAM inactive
On Tue, Mar 07, 2017 at 10:19:35AM +0800, Erich Dollansky wrote:

> Hi,
>
> I wonder about the slow speed of my machine while top shows ample
> inactive memory:
>
> last pid: 85287; load averages: 2.56, 2.44, 1.68
> up 6+10:24:45 10:13:36 191 processes: 5 running, 186 sleeping
> CPU 0: 47.1% user, 0.0% nice, 51.4% system, 0.0% interrupt, 1.6% idle
> CPU 1: 38.4% user, 0.0% nice, 60.4% system, 0.0% interrupt, 1.2% idle
> CPU 2: 38.8% user, 0.0% nice, 59.2% system, 0.0% interrupt, 2.0% idle
> CPU 3: 45.5% user, 0.0% nice, 51.0% system, 0.4% interrupt, 3.1% idle
> Mem: 677M Active, 5600M Inact, 1083M Wired, 178M Cache, 816M Buf, 301M
> Free
> Swap: 16G Total, 1352M Used, 15G Free, 8% Inuse
>
> The swap space in use can be explained by large compilations done
> recently. Why is the inactive memory not put to use.
>
> I do not want to restart the machine. So, if I could help find the
> source of the problem, I would do.

Inactive memory is not 'unused' memory. These are just pages that have not been touched in the last 10(?) seconds, but all of them are still allocated (via malloc, mmap, sendfile and so on) to applications (userland programs).
Re: Is it known problem, that zfs.ko could not be built with system compiler (clang 3.9.1) without optimization?
On Wed, Feb 22, 2017 at 11:47:42PM +0300, Lev Serebryakov wrote:

> Hello Freebsd-stable,
>
>    Now if you build zfs.ko with -O0 it panics on boot.
>
>    If you use default optimization level, a lot of fbt DTrace probes are
> missing.

Is this related to http://llvm.org/bugs/show_bug.cgi?id=18420 ?
Re: FreeBSD 11.0-STABLE #0 r310265 amd64 seems to be cpi-ing garbage to mounted FAT32 fs after 10-20 GB.
On Wed, Feb 01, 2017 at 07:25:18AM -0700, Jakub Lach wrote:

> I would think so, if only I would not clone the disk/system via the same USB
> port mere weeks ago.
> Moreover, sysutils/f3 fully writes and validates (checksums) 30G+ memory
> cards via the same port without problems.

In my case the controller does not misbehave all the time, only some of the time.
Data corruption over my USB depends on the data access pattern.
Re: FreeBSD 11.0-STABLE #0 r310265 amd64 seems to be cpi-ing garbage to mounted FAT32 fs after 10-20 GB.
On Wed, Feb 01, 2017 at 06:52:01AM -0700, Jakub Lach wrote:

> Yes, HDD and card reader was USB mounted.
>
> This time, I've copied about 12G from 38G from internal SSD (UFS2) to
> HDD via USB (FAT32), then system panicked with CAM errors.

I have a similar issue on a laptop with a broken USB controller.
LACP: Fatal trap 18: integer divide fault while in kernel mode
I got a panic on recent stable:

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 3; apic id = 06
instruction pointer = 0x20:0x81453230
stack pointer = 0x28:0xfe3e56f46480
frame pointer = 0x28:0xfe3e56f464a0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (3))
trap number = 18
panic: integer divide fault
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032b3eb = db_trace_self_wrapper+0x2b/frame 0xfe3e56f460c0
vpanic() at 0x804e33a6 = vpanic+0x186/frame 0xfe3e56f46140
panic() at 0x804e3213 = panic+0x43/frame 0xfe3e56f461a0
trap_fatal() at 0x807b07c2 = trap_fatal+0x322/frame 0xfe3e56f461f0
trap() at 0x807b0475 = trap+0x6b5/frame 0xfe3e56f463b0
calltrap() at 0x807946b1 = calltrap+0x8/frame 0xfe3e56f463b0
--- trap 0x12, rip = 0x81453230, rsp = 0xfe3e56f46480, rbp = 0xfe3e56f464a0 ---
lacp_select_tx_port() at 0x81453230 = lacp_select_tx_port+0x70/frame 0xfe3e56f464a0
lagg_lacp_start() at 0x814504ae = lagg_lacp_start+0xe/frame 0xfe3e56f464c0
lagg_transmit() at 0x8144e73f = lagg_transmit+0xbf/frame 0xfe3e56f46530
ether_output() at 0x805f30bc = ether_output+0x71c/frame 0xfe3e56f465d0
ip_output() at 0x80629935 = ip_output+0x1585/frame 0xfe3e56f46720
tcp_output() at 0x806b9e16 = tcp_output+0x1876/frame 0xfe3e56f468c0
tcp_timer_rexmt() at 0x806c572f = tcp_timer_rexmt+0x4df/frame 0xfe3e56f46900
softclock_call_cc() at 0x804fd1b6 = softclock_call_cc+0x156/frame 0xfe3e56f469b0
softclock() at 0x804fd754 = softclock+0x94/frame 0xfe3e56f469e0
intr_event_execute_handlers() at 0x8049d15f = intr_event_execute_handlers+0x20f/frame 0xfe3e56f46a20
ithread_loop() at 0x8049d766 = ithread_loop+0xc6/frame 0xfe3e56f46a70
fork_exit() at 0x80499e25 = fork_exit+0x85/frame 0xfe3e56f46ab0
fork_trampoline() at 0x80794bee = fork_trampoline+0xe/frame 0xfe3e56f46ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

(kgdb) info line *0x81453230
Line 848 of "/usr/src/sys/modules/if_lagg/../../net/ieee8023ad_lacp.c" starts at address 0x8145322e and ends at 0x81453233.

Do I need to create a PR?
Re: decent 40G network adapters
On Wed, Jan 18, 2017 at 02:48:19PM +0500, Eugene M. Zheganin wrote:

> Hi.
>
> Could someone recommend a decent 40Gbit adapter that are proven to be
> working under FreeBSD ? The intended purpose - iSCSI traffic, not much
> pps, but rates definitely above 10G. I've tried Supermicro-manufactured
> Intel XL710 ones (two boards, different servers - same sad story:
> packets loss, server unresponsive, spikes), seems like they have a
> problem in a driver (or firmware), and though Intel support states this
> is because the Supermicro tampered with the adapter, I'm still
> suspicious about ixl(4). I've also seen in the ML a guy reported the
> exact same problem with ixl(4) as I have found.
>
> So, what would you say ? Chelsio ?

I use Chelsio and Solarflare.
I am not sure about your workload: I have 40K+ TCP connections, so your workload will need different tuning.
Do you plan to utilise both ports? In that case you need a PCIe x16 card, which means the Chelsio T6 or the Solarflare 9200.
Re: buildworld build times 10-stable vs. 11-stable
On Sun, Jan 15, 2017 at 10:40:42AM -0600, Dan Mack wrote:

> I have a system which builds world, kernel, install, boot, installworld,
> reboot several times per week. I just noticed that my build times
> increased from about (just cherry picking a couple build logs):
>
>    Starting build of FreeBSD SVN [309852] 10.3-STABLE
>    Kernel will be GENERIC
>      building world ... 90:35 0
>
>
>    Starting build of FreeBSD SVN [312099] 11.0-STABLE
>    Kernel will be GENERIC
>      building world ... 146:23 0
>
> before I start bisecting the log files, is there something obvious
> introduced in 11 that I missed that would explain the roughly 50 minute
> difference in my build times? clang? additional subsystems?

lldb, clang and related components.
dev.cpu.0.freq/dev.cpu.0.freq_levels support on E5v4
I have stable/11 and an E5v4. I don't see cpufreq support via sysctl:

# sysctl dev.cpu.0
dev.cpu.0.cx_method: C1/hlt
dev.cpu.0.cx_usage_counters: 61755
dev.cpu.0.cx_usage: 100.00% last 1us
dev.cpu.0.cx_lowest: C2
dev.cpu.0.cx_supported: C1/1/1
dev.cpu.0.%parent: acpi0
dev.cpu.0.%pnpinfo: _HID=ACPI0007 _UID=0
dev.cpu.0.%location: handle=\_SB_.SCK0.CP00 _PXM=0
dev.cpu.0.%driver: cpu
dev.cpu.0.%desc: ACPI CPU

# grep -i freq /var/run/dmesg.boot
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
Event timer "HPET4" frequency 14318180 Hz quality 340
Event timer "HPET5" frequency 14318180 Hz quality 340
Event timer "HPET6" frequency 14318180 Hz quality 340
Event timer "HPET7" frequency 14318180 Hz quality 340
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
est0: on cpu0
est1: on cpu1
est2: on cpu2
est3: on cpu3
est4: on cpu4
est5: on cpu5
est6: on cpu6
est7: on cpu7
est8: on cpu8
est9: on cpu9
est10: on cpu10
est11: on cpu11
est12: on cpu12
est13: on cpu13
est14: on cpu14
est15: on cpu15
est16: on cpu16
est17: on cpu17
est18: on cpu18
est19: on cpu19
est20: on cpu20
est21: on cpu21
est22: on cpu22
est23: on cpu23
Timecounter "TSC-low" frequency 1100023294 Hz quality 1000

How do I enable cpufreq support for powerd?
Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)
On Sat, Dec 17, 2016 at 05:12:13PM +1100, Ian Smith wrote:

> On Fri, 16 Dec 2016 18:08:34 +0100, Fernando Herrero Carrón wrote:
> > Hi everyone,
>
> Hi,
>
> you've had plenty of helpful responses, but nobody has commented on:
>
> > My only reason for wanting to boot with UEFI is faster boot,
> > everything is working fine otherwise.
>
> I'm skeptical that UEFI boot would be any or noticeably faster than via
> BIOS, but am interested in hearing of any experiences regarding that.

Some BIOSes start with a very long UEFI boot attempt and try legacy boot only after all of that fails.
I.e. this does not speed up the FreeBSD boot itself; it speeds up the _start_ of the FreeBSD boot on some BIOSes.
Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)
On Fri, Dec 16, 2016 at 11:43:18AM -0600, Eric van Gyzen wrote:

> On 12/16/2016 11:39, Slawa Olhovchenkov wrote:
> > On Fri, Dec 16, 2016 at 06:08:34PM +0100, Fernando Herrero Carrón wrote:
> >
> >> Hi everyone,
> >>
> >> A few months ago I got myself a new box and I have been happily running
> >> FreeBSD on it ever since. I noticed that the boot was not as fast as I had
> >> expected and I've realized that, while my disk is GPT partitioned, the boot
> >> process is still BIOS based:
> >>
> >> % gpart show
> >> =>       34  976773101  ada0  GPT  (466G)
> >>          34          6        - free -  (3.0K)
> >>          40       1024     1  freebsd-boot  (512K)
> >>        1064        984        - free -  (492K)
> >>        2048   67108864     2  freebsd-swap  (32G)
> >>    67110912  909662208     3  freebsd-zfs  (434G)
> >>   976773120         15        - free -  (7.5K)
> >>
> >> I am reading uefi(8) and it looks like FreeBSD 11 should be able to boot
> >> using UEFI straight into ZFS, so I am thinking of converting that
> >> freebsd-boot partition to an EFI partition, creating a FAT filesystem and
> >> copying /boot/boot.efi there.
> >>
> >> How good of an idea is that? Would it really be that simple or am I missing
> >> something? My only reason for wanting to boot with UEFI is faster boot,
> >> everything is working fine otherwise.
> >>
> >> Thanks in advance for your help.
> >
> > I am also interesting by this case.
> > I think expand freebsd-boot to about 1M (size of /boot/boot1.efifat),
> > dding /boot/boot1.efifat and set to type to 'efi' may be enough. I am
> > never tried this.
>
> I expect that would work. It's slightly risky, though, since it doesn't let
> you
> fall back to BIOS boot if EFI doesn't work.

A live CD/USB can be the fallback in this case.
Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)
On Fri, Dec 16, 2016 at 06:08:34PM +0100, Fernando Herrero Carrón wrote:

> Hi everyone,
>
> A few months ago I got myself a new box and I have been happily running
> FreeBSD on it ever since. I noticed that the boot was not as fast as I had
> expected and I've realized that, while my disk is GPT partitioned, the boot
> process is still BIOS based:
>
> % gpart show
> =>        34  976773101  ada0  GPT  (466G)
>           34          6        - free -  (3.0K)
>           40       1024     1  freebsd-boot  (512K)
>         1064        984        - free -  (492K)
>         2048   67108864     2  freebsd-swap  (32G)
>     67110912  909662208     3  freebsd-zfs  (434G)
>    976773120         15        - free -  (7.5K)
>
> I am reading uefi(8) and it looks like FreeBSD 11 should be able to boot
> using UEFI straight into ZFS, so I am thinking of converting that
> freebsd-boot partition to an EFI partition, creating a FAT filesystem and
> copying /boot/boot.efi there.
>
> How good of an idea is that? Would it really be that simple or am I missing
> something? My only reason for wanting to boot with UEFI is faster boot,
> everything is working fine otherwise.
>
> Thanks in advance for your help.

I am also interested in this case.
I think expanding freebsd-boot to about 1M (the size of /boot/boot1.efifat),
dd'ing /boot/boot1.efifat onto it and setting its type to 'efi' may be
enough. I have never tried this.
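The sizes in the gpart listing can be sanity-checked with a short sketch. It assumes 512-byte sectors (consistent with 1024 sectors being reported as 512K) and uses only the start/size values quoted above. Whether the resulting size is really enough for /boot/boot1.efifat remains the untested part of the suggestion.

```python
# Sanity check of the partition sizes quoted from `gpart show`.
# Assumption: 512-byte sectors (1024 sectors are reported as 512K).
SECTOR_BYTES = 512

# (start, size) in sectors, copied from the listing
boot_start, boot_size = 40, 1024      # freebsd-boot
free_start, free_size = 1064, 984     # free gap right after freebsd-boot
swap_start = 2048                     # freebsd-swap begins here


def kib(sectors: int) -> float:
    """Convert a sector count to KiB."""
    return sectors * SECTOR_BYTES / 1024


# The free gap is contiguous with freebsd-boot and ends where swap starts.
assert boot_start + boot_size == free_start
assert free_start + free_size == swap_start

grown = boot_size + free_size
print(f"freebsd-boot now:   {kib(boot_size):.0f}K")
print(f"free gap after it:  {kib(free_size):.0f}K")
print(f"maximum after grow: {kib(grown):.0f}K ({grown} sectors)")
```

So the partition can grow into the adjacent free space without touching swap, ending up slightly under 1M.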
Re: gdb broken on stable/11 and current?
On Thu, Dec 08, 2016 at 07:56:03PM +0200, Andriy Gapon wrote:

> On 08/12/2016 18:57, Slawa Olhovchenkov wrote:
> > kgdb7111 doesn't find .debug under /usr/lib/debug/
> > gdb found it.
>
> $ gdb7111 bhyve /var/coredumps/bhyve.0.0.core
> GNU gdb (GDB) 7.11.1 [GDB v7.11.1 for FreeBSD]
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd11.0".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from bhyve...Reading symbols from
> /usr/lib/debug//usr/sbin/bhyve.debug...done.
>
> Something is wrong in your environment.

My information may be outdated: my last unsuccessful attempt was around
August. It works OK now. Thanks for pointing that out.
Re: gdb broken on stable/11 and current?
On Thu, Dec 08, 2016 at 04:52:35PM +, K. Macy wrote: > kgdb7111 is what you use for kernel. It works fine for me. kgdb7111 don't find .debug under /usr/lib/debug/ gdb found it. > On Thu, Dec 8, 2016 at 08:29 Slawa Olhovchenkov <s...@zxy.spb.ru> wrote: > > > On Thu, Dec 08, 2016 at 04:01:04PM +, K. Macy wrote: > > > > > > > > > In tree gdb doesn't work for much of anything these days. It can't even > > > > > consistently give a complete kernel backtrace. Jhb is graciously > > > > > maintaining gdb in ports. It will be installed as the awkwardly named > > > > > gdb7111 IIRC. > > > > > > > > 1. gdb7111 badly integrated w/ 11 and up (don't see kernel debug > > > > symbols) > > > > 2. all included in base systems can't be core dumped. > > > > > > > > > On Thu, Dec 8, 2016 at 06:53 Slawa Olhovchenkov <s...@zxy.spb.ru> wrote: > > > > > > > > > > > % gdb ./edge_stat > > > > > > > > > > > > GNU gdb 6.1.1 [FreeBSD] > > > > > > > > > > > > Copyright 2004 Free Software Foundation, Inc. > > > > > > > > > > > > GDB is free software, covered by the GNU General Public License, and > > you > > > > > > are > > > > > > > > > > > > welcome to change it and/or distribute copies of it under certain > > > > > > conditions. > > > > > > > > > > > > Type "show copying" to see the conditions. > > > > > > > > > > > > There is absolutely no warranty for GDB. Type "show warranty" for > > details. > > > > > > > > > > > > This GDB was configured as "amd64-marcel-freebsd"... > > > > > > > > > > > > (gdb) break main > > > > > > > > > > > > Segmentation fault (core dumped) > > > > > > > > > > > > > > > > > > > > > > > > % gdb /usr/bin/gdb /tmp/gdb.13573.core > > > > > > > > > > > > GNU gdb 6.1.1 [FreeBSD] > > > > > > > > > > > > Copyright 2004 Free Software Foundation, Inc. 
> > > > > > > > > > > > GDB is free software, covered by the GNU General Public License, and > > you > > > > > > are > > > > > > > > > > > > welcome to change it and/or distribute copies of it under certain > > > > > > conditions. > > > > > > > > > > > > Type "show copying" to see the conditions. > > > > > > > > > > > > There is absolutely no warranty for GDB. Type "show warranty" for > > details. > > > > > > > > > > > > This GDB was configured as "amd64-marcel-freebsd"...(no debugging > > symbols > > > > > > found)... > > > > > > > > > > > > Core was generated by `gdb'. > > > > > > > > > > > > Program terminated with signal 11, Segmentation fault. > > > > > > > > > > > > Reading symbols from /lib/libm.so.5...(no debugging symbols > > found)...done. > > > > > > > > > > > > Loaded symbols for /lib/libm.so.5 > > > > > > > > > > > > Reading symbols from /lib/libncursesw.so.8...(no debugging symbols > > > > > > found)...done. > > > > > > > > > > > > Loaded symbols for /lib/libncursesw.so.8 > > > > > > > > > > > > Reading symbols from /usr/lib/libgnuregex.so.5...(no debugging symbols > > > > > > found)...done. > > > > > > > > > > > > Loaded symbols for /usr/lib/libgnuregex.so.5 > > > > > > > > > > > > Reading symbols from /lib/libc.so.7...(no debugging symbols > > found)...done. > > > > > > > > > > > > Loaded symbols for /lib/libc.so.7 > > > > > > > > > > > > Reading symbols from /usr/lib/libthread_db.so...(no debugging symbols > > > > > > found)...done. > > > > > > > > > > > > Loaded symbols for /usr/lib/libthread_db.so > > > > > > > > > > > > Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols
Re: gdb broken on stable/11 and current?
On Thu, Dec 08, 2016 at 04:01:04PM +, K. Macy wrote: > In tree gdb doesn't work for much of anything these days. It can't even > consistently give a complete kernel backtrace. Jhb is graciously > maintaining gdb in ports. It will be installed as the awkwardly named > gdb7111 IIRC. 1. gdb7111 badly integrated w/ 11 and up (don't see kernel debug symbols) 2. all included in base systems can't be core dumped. > On Thu, Dec 8, 2016 at 06:53 Slawa Olhovchenkov <s...@zxy.spb.ru> wrote: > > > % gdb ./edge_stat > > > > GNU gdb 6.1.1 [FreeBSD] > > > > Copyright 2004 Free Software Foundation, Inc. > > > > GDB is free software, covered by the GNU General Public License, and you > > are > > > > welcome to change it and/or distribute copies of it under certain > > conditions. > > > > Type "show copying" to see the conditions. > > > > There is absolutely no warranty for GDB. Type "show warranty" for details. > > > > This GDB was configured as "amd64-marcel-freebsd"... > > > > (gdb) break main > > > > Segmentation fault (core dumped) > > > > > > > > % gdb /usr/bin/gdb /tmp/gdb.13573.core > > > > GNU gdb 6.1.1 [FreeBSD] > > > > Copyright 2004 Free Software Foundation, Inc. > > > > GDB is free software, covered by the GNU General Public License, and you > > are > > > > welcome to change it and/or distribute copies of it under certain > > conditions. > > > > Type "show copying" to see the conditions. > > > > There is absolutely no warranty for GDB. Type "show warranty" for details. > > > > This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols > > found)... > > > > Core was generated by `gdb'. > > > > Program terminated with signal 11, Segmentation fault. > > > > Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done. > > > > Loaded symbols for /lib/libm.so.5 > > > > Reading symbols from /lib/libncursesw.so.8...(no debugging symbols > > found)...done. 
> > > > Loaded symbols for /lib/libncursesw.so.8 > > > > Reading symbols from /usr/lib/libgnuregex.so.5...(no debugging symbols > > found)...done. > > > > Loaded symbols for /usr/lib/libgnuregex.so.5 > > > > Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done. > > > > Loaded symbols for /lib/libc.so.7 > > > > Reading symbols from /usr/lib/libthread_db.so...(no debugging symbols > > found)...done. > > > > Loaded symbols for /usr/lib/libthread_db.so > > > > Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols > > found)...done. > > > > Loaded symbols for /libexec/ld-elf.so.1 > > > > #0 0x005da00b in cplus_demangle_v3_callback () > > > > (gdb) bt > > > > #0 0x005da00b in cplus_demangle_v3_callback () > > > > #1 0x005d9f9c in cplus_demangle_v3 () > > > > #2 0x005ca13c in cplus_demangle () > > > > #3 0x00487454 in class_name_from_physname () > > > > #4 0x0053a4f3 in dwarf2_read_section () > > > > #5 0x0053a0cc in dwarf2_read_section () > > > > #6 0x00539bd9 in dwarf2_read_section () > > > > #7 0x00537395 in dwarf2_read_section () > > > > #8 0x00539c21 in dwarf2_read_section () > > > > #9 0x00539643 in dwarf2_read_section () > > > > #10 0x00538a6c in dwarf2_read_section () > > > > #11 0x005352fb in dwarf2_read_section () > > > > #12 0x00533bfd in dwarf2_read_section () > > > > #13 0x004cfc46 in psymtab_to_symtab () > > > > #14 0x004c9cfb in lookup_symbol_global () > > > > #15 0x00482273 in cp_lookup_symbol_namespace () > > > > #16 0x00482059 in cp_lookup_symbol_nonlocal () > > > > #17 0x00481f63 in cp_lookup_symbol_nonlocal () > > > > #18 0x004c9780 in lookup_symbol () > > > > #19 0x005035cf in find_imps () > > > > #20 0x00514df9 in decode_line_1 () > > > > #21 0x00514589 in decode_line_1 () > > > > #22 0x0046eff9 in _initialize_breakpoint () > > > > #23 0x0046f48b in _initialize_breakpoint () > > > > #24 0x004ab289 in catch_exceptions () > > > > #25 0x004ab368 in catch_exceptions_with_msg () > > > > #26 0x000
gdb broken on stable/11 and current?
% gdb ./edge_stat
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) break main
Segmentation fault (core dumped)

% gdb /usr/bin/gdb /tmp/gdb.13573.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `gdb'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libncursesw.so.8...(no debugging symbols found)...done.
Loaded symbols for /lib/libncursesw.so.8
Reading symbols from /usr/lib/libgnuregex.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgnuregex.so.5
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/lib/libthread_db.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libthread_db.so
Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x005da00b in cplus_demangle_v3_callback ()
(gdb) bt
#0 0x005da00b in cplus_demangle_v3_callback ()
#1 0x005d9f9c in cplus_demangle_v3 ()
#2 0x005ca13c in cplus_demangle ()
#3 0x00487454 in class_name_from_physname ()
#4 0x0053a4f3 in dwarf2_read_section ()
#5 0x0053a0cc in dwarf2_read_section ()
#6 0x00539bd9 in dwarf2_read_section ()
#7 0x00537395 in dwarf2_read_section ()
#8 0x00539c21 in dwarf2_read_section ()
#9 0x00539643 in dwarf2_read_section ()
#10 0x00538a6c in dwarf2_read_section ()
#11 0x005352fb in dwarf2_read_section ()
#12 0x00533bfd in dwarf2_read_section ()
#13 0x004cfc46 in psymtab_to_symtab ()
#14 0x004c9cfb in lookup_symbol_global ()
#15 0x00482273 in cp_lookup_symbol_namespace ()
#16 0x00482059 in cp_lookup_symbol_nonlocal ()
#17 0x00481f63 in cp_lookup_symbol_nonlocal ()
#18 0x004c9780 in lookup_symbol ()
#19 0x005035cf in find_imps ()
#20 0x00514df9 in decode_line_1 ()
#21 0x00514589 in decode_line_1 ()
#22 0x0046eff9 in _initialize_breakpoint ()
#23 0x0046f48b in _initialize_breakpoint ()
#24 0x004ab289 in catch_exceptions ()
#25 0x004ab368 in catch_exceptions_with_msg ()
#26 0x0046b169 in break_command ()
#27 0x004ab996 in execute_command ()
#28 0x00465293 in gdb_disable_readline ()
#29 0x00465182 in gdb_setup_readline ()
#30 0x005e266f in rl_callback_read_char ()
#31 0x00464c79 in gdb_setup_readline ()
#32 0x00465a46 in gdb_do_one_event ()
#33 0x004ab289 in catch_exceptions ()
#34 0x004ab42a in catch_errors ()
#35 0x00551168 in _initialize_tui_interp ()
#36 0x00448689 in gdb_main ()
#37 0x004ab289 in catch_exceptions ()
#38 0x004ab42a in catch_errors ()
#39 0x00448526 in gdb_main ()
#40 0x004ab289 in catch_exceptions ()
#41 0x004ab42a in catch_errors ()
#42 0x00447977 in gdb_main ()
#43 0x00447931 in main ()
Re: zfs, a directory that used to hold lot of files and listing pause
On Fri, Oct 21, 2016 at 01:47:08PM +0100, Pete French wrote:

> > In the bad case the metadata of every file will be placed at a random
> > location on disk. ls needs access to the metadata of every file before
> > it starts printing the listing.
>
> Umm, are we not talking about an issue where the directory no longer contains
> any files. It used to have lots, now it has none.
>
> > I.e. in the bad case you will need tens of thousands of seeks on a disk
> > capable of only 72 seeks per second.
>
> Why does it need to seek all over the disc if there are no files (and hence
> no metadata surely) ?
>
> I am not bothered if a huge directory takes a while to list,
> that's something I am happy to deal with. What I don't like is
> when it is back down to zero that it still takes a long time
> to list. That doesn't make much sense.

OK, this case may be different. Maybe zdb can help:

ls -li /parent/dir
Take the inode number
zdb - zfs_set inode_number

Also run ls under ktrace and analyse the output of `kdump -E`.
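The first two steps (find the directory's inode number and see how long a listing takes) can also be scripted. This is only a minimal sketch: the helper name `inode_and_listing_time` is made up for illustration, and it relies on the assumption that the inode number reported by stat is the number zdb expects for that dataset.

```python
import os
import time


def inode_and_listing_time(path: str):
    """Return (inode number, entry count, seconds spent listing) for a directory."""
    # Same number that `ls -li` on the parent prints for this directory.
    inode = os.stat(path).st_ino
    start = time.perf_counter()
    entries = os.listdir(path)
    elapsed = time.perf_counter() - start
    return inode, len(entries), elapsed


if __name__ == "__main__":
    ino, count, secs = inode_and_listing_time(".")
    print(f"inode={ino} entries={count} listing took {secs:.3f}s")
```

A long listing time with an entry count of zero would match the symptom described in the thread.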
Re: zfs, a directory that used to hold lot of files and listing pause
On Fri, Oct 21, 2016 at 04:51:36PM +0500, Eugene M. Zheganin wrote:

> Hi.
>
> On 21.10.2016 15:20, Slawa Olhovchenkov wrote:
> >
> > The effect of ZFS prefetch on performance depends on the workload
> > (independent of RAM size): for some workloads it wins, for some it loses
> > (for my workload prefetch loses and is manually disabled with 128GB RAM).
> >
> > Anyway, this system has only 24MB in ARC with 2.3GB free; this may
> > be too low for this workload.
>
> You mean - "for getting a list of a directory with 20 subdirectories" ?
> Why then does only this directory have this issue with pause, not
> /usr/ports/..., which has more directories in it ?
>
> (and yes, /usr/ports/www isn't empty and holds 2410 entities)
>
> /usr/bin/time -h ls -1 /usr/ports/www
> [...]
> 0.14s real 0.00s user 0.00s sys

You wrote: "(tens of thousands) files".
In the bad case the metadata of every file will be placed at a random
location on disk. ls needs access to the metadata of every file before it
starts printing the listing.

I.e. in the bad case you will need tens of thousands of seeks on a disk
capable of only 72 seeks per second.

Perhaps /usr/ports/www was created all at once and the metadata of all its
entries is placed close together, so it needs fewer seeks.

(Assuming the zfs properties primarycache/secondarycache are not off.)
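To put a number on the worst case, here is a back-of-the-envelope sketch. It assumes one seek per file's metadata and takes the 72 seeks per second figure from the message; the file counts are illustrative, since the thread only says "tens of thousands".

```python
# Worst case: every metadata access needs its own disk seek.
SEEKS_PER_SECOND = 72  # figure taken from the message


def worst_case_listing_seconds(n_files: int, seeks_per_file: int = 1) -> float:
    """Time to fetch metadata if every access needs its own seek."""
    return n_files * seeks_per_file / SEEKS_PER_SECOND


# Illustrative file counts, not measurements from the affected system.
for n in (10_000, 20_000, 50_000):
    secs = worst_case_listing_seconds(n)
    print(f"{n:>6} files: {secs:6.0f} s (~{secs / 60:.1f} min)")
```

With 20,000 files this comes to roughly 278 seconds, which is the order of magnitude of a multi-minute pause before ls prints anything.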
Re: zfs, a directory that used to hold lot of files and listing pause
On Fri, Oct 21, 2016 at 11:02:57AM +0100, Steven Hartland wrote:

> > Mem: 21M Active, 646M Inact, 931M Wired, 2311M Free
> > ARC: 73M Total, 3396K MFU, 21M MRU, 545K Anon, 1292K Header, 47M Other
> > Swap: 4096M Total, 4096M Free
> >
> >   PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> >   600 root        39    0 27564K  5072K nanslp  1 295.0H  24.56% monit
> >     0 root       -17    0     0K  2608K -       1  75:24   0.00% kernel{zio_write_issue}
> >   767 freeswitch  20    0   139M 31668K uwait   0  48:29   0.00% freeswitch{freeswitch}
> >   683 asterisk    20    0   806M   483M uwait   0  41:09   0.00% asterisk{asterisk}
> >     0 root        -8    0     0K  2608K -       0  37:43   0.00% kernel{metaslab_group_t}
> > [... others lines are just 0% ...]
>
> This looks like you only have ~4Gb ram which is pretty low for ZFS I
> suspect vfs.zfs.prefetch_disable will be 1, which will crash the
> performance.

The effect of ZFS prefetch on performance depends on the workload
(independent of RAM size): for some workloads it wins, for some it loses
(for my workload prefetch loses and is manually disabled with 128GB RAM).

Anyway, this system has only 24MB in ARC with 2.3GB free; this may
be too low for this workload.
Re: tcsh is not handled correctly UTF-8 in arguments
On Thu, Oct 20, 2016 at 08:54:05AM -0600, Alan Somers wrote:

> On Wed, Oct 19, 2016 at 11:10 AM, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> > tcsh called by sshd for invocation of scp: `tcsh -c scp -f Расписание.pdf`
> > At this time no any LC_* is set.
> > tcsh read .cshrc and set LC_CTYPE=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8.
> > After this invocation of scp will be incorrect:
> >
> > 7ab0 20 2d 66 20 c3 90 c2 a0 c3 90 c2 b0 c3 91 c2 81 | -f ............|
> > 7ac0 c3 90 c2 bf c3 90 c2 b8 c3 91 c2 81 c3 90 c2 b0 |................|
> > 7ad0 c3 90 c2 bd c3 90 c2 b8 c3 90 c2 b5 5f c3 90 c2 |............_...|
> > 7ae0 a2 c3 90 c2 97 c3 90 c2 98 2e 70 64 66 0a       |..........pdf.|
> >
> > Correct invocation must be:
> >
> >      20 2d 66 20                                     | -f |
> > 0010 d0 a0 d0 b0 d1 81 d0 bf d0 b8 d1 81 d0 b0 d0 bd |................|
> > 0020 d0 b8 d0 b5 5f d0 a2 d0 97 d0 98 2e 70 64 66 0a |...._.......pdf.|
> >
> > `d0` => `c3 90`
> > `a0` => `c2 a0`
> >
> > I.e. every byte re-encoded to utf-8: `d0` => `c3 90`
> >
> > As result imposible to access files w/ non-ascii names.
>
> This might be related to PR213013. Could you please try on head after
> r306782 ?

I think it is not related. PR213013 is about character classification; my
report is about unnecessary re-encoding of shell arguments.
tcsh is not handled correctly UTF-8 in arguments
sshd invokes tcsh to run scp: `tcsh -c scp -f Расписание.pdf`
At this point no LC_* variable is set.
tcsh reads .cshrc and sets LC_CTYPE=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8.
After this, the invocation of scp is incorrect:

7ab0 20 2d 66 20 c3 90 c2 a0 c3 90 c2 b0 c3 91 c2 81 | -f ............|
7ac0 c3 90 c2 bf c3 90 c2 b8 c3 91 c2 81 c3 90 c2 b0 |................|
7ad0 c3 90 c2 bd c3 90 c2 b8 c3 90 c2 b5 5f c3 90 c2 |............_...|
7ae0 a2 c3 90 c2 97 c3 90 c2 98 2e 70 64 66 0a       |..........pdf.|

The correct invocation must be:

     20 2d 66 20                                     | -f |
0010 d0 a0 d0 b0 d1 81 d0 bf d0 b8 d1 81 d0 b0 d0 bd |................|
0020 d0 b8 d0 b5 5f d0 a2 d0 97 d0 98 2e 70 64 66 0a |...._.......pdf.|

I.e. every byte of the name is re-encoded to UTF-8 once more:

`d0` => `c3 90`
`a0` => `c2 a0`

As a result, it is impossible to access files with non-ASCII names.
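The byte pattern in the two dumps is what one gets when each byte of an already UTF-8-encoded string is treated as a separate Latin-1 character and encoded to UTF-8 a second time. The sketch below only reproduces that transformation on the bytes from the dumps; it makes no claim about where in tcsh it happens.

```python
# File-name bytes of the correct invocation, copied from the second hex dump.
correct = bytes.fromhex(
    "d0 a0 d0 b0 d1 81 d0 bf d0 b8 d1 81 d0 b0 d0 bd"
    " d0 b8 d0 b5 5f d0 a2 d0 97 d0 98 2e 70 64 66"
)

# File-name bytes actually passed to scp, copied from the first hex dump.
observed = bytes.fromhex(
    "c3 90 c2 a0 c3 90 c2 b0 c3 91 c2 81 c3 90 c2 bf"
    " c3 90 c2 b8 c3 91 c2 81 c3 90 c2 b0 c3 90 c2 bd"
    " c3 90 c2 b8 c3 90 c2 b5 5f c3 90 c2 a2 c3 90 c2"
    " 97 c3 90 c2 98 2e 70 64 66"
)

# Treat every byte as one Latin-1 character, then encode to UTF-8 again.
mangled = correct.decode("latin-1").encode("utf-8")
print("double encoding reproduces the dump:", mangled == observed)

# Reversing the extra step recovers the original bytes.
recovered = observed.decode("utf-8").encode("latin-1")
print("reverse step recovers the name:", recovered == correct)
```

ASCII bytes such as `5f` and `.pdf` pass through unchanged, which is why only non-ASCII names are affected.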
stable/11: lock contention on zone_fetch_slab
@ CPU_CLK_UNHALTED_CORE [271718 samples] 22.48% [61081]lock_delay @ /boot/kernel/kernel 99.72% [60908] __mtx_lock_sleep 67.69% [41230] zone_fetch_slab 100.0% [41230] zone_import 100.0% [41230]zone_alloc_item 99.99% [41226] uma_zalloc_arg 74.55% [30732] m_getm2 100.0% [30732] m_uiotombuf 100.0% [30732]sosend_generic 99.93% [30711] soo_write 100.0% [30711] dofilewrite 100.0% [30711] kern_writev 99.37% [30518]sys_writev 100.0% [30518] amd64_syscall 00.63% [193] sys_write 100.0% [193] amd64_syscall 00.07% [21]kern_sendit 100.0% [21] sendit 100.0% [21] sys_sendto 100.0% [21] amd64_syscall 20.84% [8590] m_copym 100.0% [8590]tcp_output 96.18% [8262] tcp_usr_send 100.0% [8262] sosend_generic 99.67% [8235] soo_write 100.0% [8235]dofilewrite 100.0% [8235] kern_writev 99.38% [8184] sys_writev 100.0% [8184] amd64_syscall 00.62% [51]sys_write 100.0% [51] amd64_syscall 00.33% [27] kern_sendit 100.0% [27] sendit 100.0% [27] sys_sendto 100.0% [27]amd64_syscall 03.82% [328] tcp_timer_rexmt 100.0% [328] softclock_call_cc 100.0% [328]softclock 100.0% [328] intr_event_execute_handlers 100.0% [328] ithread_loop 100.0% [328] fork_exit 04.55% [1874] tcp_output 75.72% [1419]tcp_usr_send 100.0% [1419] sosend_generic 98.38% [1396] soo_write 100.0% [1396] dofilewrite 100.0% [1396]kern_writev 87.11% [1216] sys_writev 100.0% [1216] amd64_syscall 12.89% [180] sys_write 100.0% [180] amd64_syscall 01.62% [23]kern_sendit 100.0% [23] sendit 100.0% [23] sys_sendto 100.0% [23] amd64_syscall 13.07% [245] tcp_timer_rexmt 100.0% [245] softclock_call_cc 100.0% [245] softclock 100.0% [245]intr_event_execute_handlers 100.0% [245] ithread_loop 100.0% [245] fork_exit 06.46% [121] tcp_timer_delack 100.0% [121] softclock_call_cc 100.0% [121] softclock 100.0% [121]intr_event_execute_handlers 100.0% [121] ithread_loop 100.0% [121] fork_exit 02.99% [56] tcp_do_segment 100.0% [56] tcp_input 100.0% [56]ip_input 100.0% [56] swi_net 100.0% [56] intr_event_execute_handlers 100.0% [56] ithread_loop 100.0% [56]fork_exit 
01.55% [29] tcp_usr_disconnect 100.0% [29] soclose 100.0% [29]_fdrop 100.0% [29] closef 100.0% [29] closefp 100.0% [29] amd64_syscall 00.11% [2] tcp_timer_persist 100.0% [2]softclock_call_cc 100.0% [2] softclock 100.0% [2] intr_event_execute_handlers 100.0% [2] ithread_loop 100.0% [2]fork_exit 00.11% [2] tcp_drop 100.0% [2]tcp_timer_rexmt 100.0% [2] softclock_call_cc 100.0% [2] softclock 100.0% [2] intr_event_execute_handlers 100.0% [2]ithread_loop 100.0% [2] fork_exit 00.07% [28] syncache_respond 100.0% [28] syncache_timer 100.0% [28]
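For readers of the callgraph: each percentage is a node's sample count divided by its parent's sample count, so the share of all samples has to be computed separately. A small sketch using counts copied from the top of the profile (the constant names are only labels for this example):

```python
# Sample counts copied from the profile above.
TOTAL = 271718            # CPU_CLK_UNHALTED_CORE samples
LOCK_DELAY = 61081        # lock_delay
MTX_LOCK_SLEEP = 60908    # __mtx_lock_sleep under lock_delay
ZONE_FETCH_SLAB = 41230   # zone_fetch_slab under __mtx_lock_sleep


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded like the profile."""
    return round(100.0 * part / whole, 2)


print("lock_delay / total:                ", pct(LOCK_DELAY, TOTAL))
print("__mtx_lock_sleep / lock_delay:     ", pct(MTX_LOCK_SLEEP, LOCK_DELAY))
print("zone_fetch_slab / __mtx_lock_sleep:", pct(ZONE_FETCH_SLAB, MTX_LOCK_SLEEP))
# Share of ALL samples spent in lock_delay below zone_fetch_slab.
print("zone_fetch_slab / total:           ", pct(ZONE_FETCH_SLAB, TOTAL))
```

The last figure is the one that matters for the subject line: it is the fraction of all sampled cycles spent waiting on the lock taken in zone_fetch_slab.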
Re: 11.0 stuck on high network load
On Fri, Oct 14, 2016 at 11:48:38AM +0200, Julien Charbon wrote:

> >>> Also, using dtrace in production is too complex (it needs a complicated
> >>> startup under screen with output capture), also for many people.
> >>> kdb_backtrace() has much less administrative overhead.
> >>
> >> I still think it is overkill. The main goal of this change is to fix a
> >> quite tricky and old TCP stack locking issue. Let's try to do that
> >> first, it is complex enough by itself.
> >>
> >> Once the fix is validated and pushed, feel free to propose your own
> >> patch/review to add kdb_backtrace(), log(), etc.. to get other devs
> >> point of view.
> >>
> >> I don't remember who said: "Never ever optimize error cases"...
> >
> > This is not optimizing error cases; this is error recovery and
> > diagnosis of error cases in other subsystems.
>
> Sure, I guess this quote is more geared toward: "Always spend 50x more
> time on improving the main path than the error path".
>
> > Currently FreeBSD internals are too complex to either always trust the
> > correctness of other subsystems or panic on any inconsistency.
> >
> > INVARIANTS is too expensive now (20Gbit drops to 8Gbit).
>
> I do agree. I am not expert enough to see all the side effects of
> calling kdb_backtrace() from the TCP stack, might be way too slow,
> tricky in interruption context, etc. You can see that kdb_backtrace()

I thought about this. The example is taken from netgraph, which is a
similar case (interrupt context and so on). The occurrence is too rare
(once per day, maybe once per two hours) to cause any overhead.

OK, I see your point: your experience does not allow you to add this code
here, and it needs a separate review and commit. Right, np.

> is rarely called in the kernel source. That's why it is better if you
> propose a review on adding this line to get comments from other devs on
> just this question.
>
> > PS: I am applying the patch. Wait till Monday.
> >
> > Thanks very much for this hard work!
>
> No problem, thanks for your time. But it is not over yet: We have to
> wait for final test.

Currently the system does not use Chelsio TOE; after Monday I will update
the system with Chelsio TOE. With Chelsio I see this occurrence very
rarely, once in a few months.
Re: 11.0 stuck on high network load
On Thu, Oct 13, 2016 at 06:14:29PM +0200, Julien Charbon wrote:

> On 10/13/16 5:17 PM, Slawa Olhovchenkov wrote:
> > On Thu, Oct 13, 2016 at 05:06:00PM +0200, Julien Charbon wrote:
> >
> >>>> will give you that trace in the core, and without INVARIANT then it is
> >>>> better to use dtrace:
> >>>>
> >>>> $ cat tcp-twstart-dropped.d
> >>>> fbt::tcp_twstart:entry
> >>>> /args[0]->t_inpcb->inp_flags & 0x0400/
> >>>> {
> >>>>     stack();
> >>>>     printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> >>>> }
> >>>
> >>> The same code could be inserted there too, IMHO.
> >>
> >> Hmm, I don't think so:
> >>
> >> - If you have INVARIANT, the kernel will panic in tcp_twstart() or
> >> tcp_detach() and you will have everything you need to debug.
> >> - If you don't, dtrace is the right tool to use in all cases anyway.
> >
> > In my case dtrace does not run at all and fails with the diagnostic
> > "dtrace: processing aborted: Abort due to systemic unresponsiveness".
> > That was for tcp_close. Maybe tcp_twstart will be more successful,
> > maybe not.
>
> It does and will.
>
> > Also, using dtrace in production is too complex (it needs a complicated
> > startup under screen with output capture), also for many people.
> > kdb_backtrace() has much less administrative overhead.
>
> I still think it is overkill. The main goal of this change is to fix a
> quite tricky and old TCP stack locking issue. Let's try to do that
> first, it is complex enough by itself.
>
> Once the fix is validated and pushed, feel free to propose your own
> patch/review to add kdb_backtrace(), log(), etc.. to get other devs
> point of view.
>
> I don't remember who said: "Never ever optimize error cases"...

This is not optimizing error cases; this is error recovery and
diagnosis of error cases in other subsystems.

Currently FreeBSD internals are too complex to either always trust the
correctness of other subsystems or panic on any inconsistency.

INVARIANTS is too expensive now (20Gbit drops to 8Gbit).

PS: I am applying the patch. Wait till Monday.

Thanks very much for this hard work!
Re: 11.0 stuck on high network load
On Thu, Oct 13, 2016 at 05:06:00PM +0200, Julien Charbon wrote:

> >> will give you that trace in the core, and without INVARIANT then it is
> >> better to use dtrace:
> >>
> >> $ cat tcp-twstart-dropped.d
> >> fbt::tcp_twstart:entry
> >> /args[0]->t_inpcb->inp_flags & 0x0400/
> >> {
> >>     stack();
> >>     printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> >> }
> >
> > The same code could be inserted there too, IMHO.
>
> Hmm, I don't think so:
>
> - If you have INVARIANT, the kernel will panic in tcp_twstart() or
> tcp_detach() and you will have everything you need to debug.
> - If you don't, dtrace is the right tool to use in all cases anyway.

In my case dtrace does not run at all and fails with the diagnostic
"dtrace: processing aborted: Abort due to systemic unresponsiveness".
That was for tcp_close. Maybe tcp_twstart will be more successful,
maybe not.

Also, using dtrace in production is too complex (it needs a complicated
startup under screen with output capture), also for many people.
kdb_backtrace() has much less administrative overhead.
Re: 11.0 stuck on high network load
On Thu, Oct 13, 2016 at 01:56:21PM +0200, Julien Charbon wrote:

> >> Something like:
> >
> > Yes, thanks!
>
> Proposed changes added in the review:
>
> https://reviews.freebsd.org/D8211
>
> tell me when you have three days without issue with this change.
>
> >> tcp_detach() {
> >>
> >>     ...
> >>     if (inp->inp_flags & INP_TIMEWAIT) {
> >>
> >>         ...
> >>         if (inp->inp_flags & INP_DROPPED) {
> >>
> >>             in_pcbdetach(inp);
> >>             if (__predict_true(tp == NULL)) {
> >>                 in_pcbfree(inp);
> >>             } else {
> >> #ifdef INVARIANTS
> >>                 panic("tcp_detach: tp != NULL, That's not good because 'blah'\n");
> >> #else
> >>                 log(LOG_ERR, "tcp_detach: tp != NULL, That's not good because 'blah'\n");
> >
> > Maybe some more info in the log could help to detect the root cause of
> > the issue? I don't know what info; maybe flags or the number of references?
>
> For this kind of issue, the useful part is the stacktrace. INVARIANT

Like this?

#ifdef KDB
kdb_backtrace();
#endif

as found in sys/netgraph/ng_base.c

> will give you that trace in the core, and without INVARIANT then it is
> better to use dtrace:
>
> $ cat tcp-twstart-dropped.d
> fbt::tcp_twstart:entry
> /args[0]->t_inpcb->inp_flags & 0x0400/
> {
>     stack();
>     printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> }

The same code could be inserted there too, IMHO.

> --
> Julien
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 05:17:35PM +0200, Julien Charbon wrote: > I see, thus just for the context: The TCP stack in sys/dev/cxgb* > is a > TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a > separate/side TCP stack that is used only with TCP_OFFLOAD option. > > This TOE TCP stack actually has its own set of detach()/input() > functions and seems to check INP_DROPPED flag properly. I guess @np > check fixes in socket TCP stack and decides which one can also impact > the Chelsio TOE TCP stack. Some bugs are only in socket TCP stack, > some > are only in TOE TCP stack. > >>> > >>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE > >>> TCP stack and impact this to > >>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path. > >> > >> I see, I expect no problem on this side as tcp_timer_2msl() checks the > >> INP_TIMEWAIT flag and do not call tcp_close() if set. > > > > I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl() > > don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK() > > and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set > > INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected > > INP_TIMEWAIT in tcp_usr_detach(). > > Sure, basically the same bug that in classic TCP stack. If you think > it can happen, send an email describing that to np@ and he will check > and fix that. He is a TOE TCP stack expert and I am not. In all cases, > if this issue is possible in TOE TCP stack context, the patch will be > straightforward: If the INP_DROPPED flag is set do not call > tcp_twstart(). I am email to np@ > The current patch focuses only on the classic TCP stack. > >>> > >>> May be current workaround (with logging) in tcp_usr_detach() is good > >>> solutuion for preventing system lockout by similar bugs? > >> > >> Good question, the quick workaround in tcp_usr_detach() does not handle > >> all the cases. 
If it reduces the number of crashes you can still find > >> scenarios where it can have unexpected side effect. > > > > This is best then guaranted lockout. > > > >> Long term solution is to enforce: If the inp has the INP_DROPPED flag > >> just stop processing it and return. If you grep the INP_DROPPED flag in > >> kernel sources, you can see that this test is already done in almost all > >> tcp_*() processing functions but tcp_input(). > >> > >> I would say that even without this issue tcp_input() should check > >> INP_DROPPED flags after INP_WLOCK anyway. Same for the TOE TCP stack, > >> you are simply not supposed to process a inp with INP_DROPPED flag. > > > > Absolutly acceptant! > > May point is: more check and good handling of check result is best for > > stability. > > > > I.e. AND check INP_DROPPED in tcp_input AND workaroud INP_TIMEWAIT in > > tcp_usr_detach (with logging) and check of some posible cases in XXX TOE. > > > > Current TCP stack too complex and have many corner cases. This is need > > additional guards where posible (not caused kernel panic). > > I see your point: Even if this issue is caught by this assert: > > KASSERT(tp == NULL, ("tcp_detach: INP_TIMEWAIT && " > "INP_DROPPED && tp != NULL")); > https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_usrreq.c#L213 > > you might not have INVARIANT option, then you will get a lockout quite > difficult to debug. Thus what we can do is: > > - If INVARIANT is set: kernel panic to get all the details in the core. > - If INVARIANT is not set: Log this error with an explicit kernel > log(LOG_ERR) describing the issue, and then use the workaround to avoid > the double-free to let the system to good enough state. > > Something like: Yes, thanks! > tcp_detach() { > > ... > if (inp->inp_flags & INP_TIMEWAIT) { > > ... 
>       if (inp->inp_flags & INP_DROPPED) {
>
>          in_pcbdetach(inp);
>          if (__predict_true(tp == NULL)) {
>             in_pcbfree(inp);
>          } else {
> #ifdef INVARIANTS
>             panic("tcp_detach: tp != NULL, That's not good because 'blah'\n");
> #else
>             log(LOG_ERR, "tcp_detach: tp != NULL, That's not good because 'blah'\n");

Maybe some more info in the log could help to detect the root cause of the issue?
I don't know exactly what info; maybe the flags or the number of references?

> #endif
>             INP_WUNLOCK(inp);
>          }
>       }
>    }
>
>   ...
>
> }
>
> --
> Julien
>
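The panic-or-log policy discussed above can be sketched in userland. This is only an illustration under stated assumptions: `DEMO_INVARIANTS` stands in for the kernel's INVARIANTS option, and `struct demo_inp`, `demo_detach()` and the `DEMO_*` flag values are hypothetical names that do not come from the FreeBSD sources. The log line also carries the flags and a reference count, as suggested in the reply.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins: these are NOT the kernel's definitions. */
#define DEMO_TIMEWAIT 0x1u
#define DEMO_DROPPED  0x2u

struct demo_inp {
    unsigned flags;     /* DEMO_* bits */
    unsigned refcount;  /* illustrative reference count */
    void    *tp;        /* non-NULL means a tcpcb is still attached */
    int      freed;     /* set once the pcb has been released */
};

enum detach_result { DETACH_FREED, DETACH_SKIPPED, DETACH_NOT_APPLICABLE };

/*
 * Free the pcb only in the expected state (tp == NULL).  In the
 * unexpected state, abort when DEMO_INVARIANTS is defined; otherwise
 * log enough context to debug later and skip the free.
 */
static enum detach_result
demo_detach(struct demo_inp *inp)
{
    assert(inp != NULL);
    if ((inp->flags & DEMO_TIMEWAIT) == 0 || (inp->flags & DEMO_DROPPED) == 0)
        return DETACH_NOT_APPLICABLE;
    if (inp->tp == NULL) {
        inp->freed = 1;
        return DETACH_FREED;
    }
#ifdef DEMO_INVARIANTS
    fprintf(stderr, "demo_detach: tp != NULL\n");
    abort();
#else
    fprintf(stderr, "demo_detach: tp != NULL, flags=0x%x refcount=%u\n",
        inp->flags, inp->refcount);
    return DETACH_SKIPPED;
#endif
}
```

Built without `DEMO_INVARIANTS`, the unexpected state returns `DETACH_SKIPPED` and leaves `freed` at 0, which mirrors the idea of avoiding the double free while still leaving a trace in the log.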
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 02:35:11PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 10/12/16 2:13 PM, Slawa Olhovchenkov wrote: > > On Wed, Oct 12, 2016 at 02:06:59PM +0200, Julien Charbon wrote: > >>>>>>> sofree() call tcp_usr_detach() and in tcp_usr_detach() we have > >>>>>>> unexpected INP_TIMEWAIT. > >>>>>> > >>>>>> I see, thus just for the context: The TCP stack in sys/dev/cxgb* is a > >>>>>> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a > >>>>>> separate/side TCP stack that is used only with TCP_OFFLOAD option. > >>>>>> > >>>>>> This TOE TCP stack actually has its own set of detach()/input() > >>>>>> functions and seems to check INP_DROPPED flag properly. I guess @np > >>>>>> check fixes in socket TCP stack and decides which one can also impact > >>>>>> the Chelsio TOE TCP stack. Some bugs are only in socket TCP stack, > >>>>>> some > >>>>>> are only in TOE TCP stack. > >>>>> > >>>>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE > >>>>> TCP stack and impact this to > >>>>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path. > >>>> > >>>> I see, I expect no problem on this side as tcp_timer_2msl() checks the > >>>> INP_TIMEWAIT flag and do not call tcp_close() if set. > >>> > >>> I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl() > >>> don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK() > >>> and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set > >>> INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected > >>> INP_TIMEWAIT in tcp_usr_detach(). > >> > >> Sure, basically the same bug that in classic TCP stack. If you think > >> it can happen, send an email describing that to np@ and he will check > >> and fix that. He is a TOE TCP stack expert and I am not. In all cases, > >> if this issue is possible in TOE TCP stack context, the patch will be > >> straightforward: If the INP_DROPPED flag is set do not call tcp_twstart(). 
> >>
> >>  The current patch focuses only on the classic TCP stack.
> >
> > May be current workaround (with logging) in tcp_usr_detach() is good
> > solution for preventing system lockout by similar bugs?
>
>  Good question, the quick workaround in tcp_usr_detach() does not handle
> all the cases.  If it reduces the number of crashes you can still find
> scenarios where it can have unexpected side effect.

That is still better than a guaranteed lockout.

>  Long term solution is to enforce: If the inp has the INP_DROPPED flag
> just stop processing it and return.  If you grep the INP_DROPPED flag in
> kernel sources, you can see that this test is already done in almost all
> tcp_*() processing functions but tcp_input().
>
>  I would say that even without this issue tcp_input() should check
> INP_DROPPED flags after INP_WLOCK anyway.  Same for the TOE TCP stack,
> you are simply not supposed to process a inp with INP_DROPPED flag.

Absolutely agreed!
My point is: more checks, and good handling of the check results, are best for
stability.

I.e. check INP_DROPPED in tcp_input() AND work around INP_TIMEWAIT in
tcp_usr_detach() (with logging) AND check some possible cases in the TOE code.

The current TCP stack is too complex and has many corner cases. It needs
additional guards wherever possible (guards that do not cause a kernel panic).
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 02:06:59PM +0200, Julien Charbon wrote: > > sofree() call tcp_usr_detach() and in tcp_usr_detach() we have > > unexpected INP_TIMEWAIT. > > I see, thus just for the context: The TCP stack in sys/dev/cxgb* is a > TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a > separate/side TCP stack that is used only with TCP_OFFLOAD option. > > This TOE TCP stack actually has its own set of detach()/input() > functions and seems to check INP_DROPPED flag properly. I guess @np > check fixes in socket TCP stack and decides which one can also impact > the Chelsio TOE TCP stack. Some bugs are only in socket TCP stack, some > are only in TOE TCP stack. > >>> > >>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE > >>> TCP stack and impact this to > >>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path. > >> > >> I see, I expect no problem on this side as tcp_timer_2msl() checks the > >> INP_TIMEWAIT flag and do not call tcp_close() if set. > > > > I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl() > > don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK() > > and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set > > INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected > > INP_TIMEWAIT in tcp_usr_detach(). > > Sure, basically the same bug that in classic TCP stack. If you think > it can happen, send an email describing that to np@ and he will check > and fix that. He is a TOE TCP stack expert and I am not. In all cases, > if this issue is possible in TOE TCP stack context, the patch will be > straightforward: If the INP_DROPPED flag is set do not call tcp_twstart(). > > The current patch focuses only on the classic TCP stack. May be current workaround (with logging) in tcp_usr_detach() is good solutuion for preventing system lockout by similar bugs? 
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 11:42:38AM +0200, Julien Charbon wrote: > On 10/12/16 11:29 AM, Slawa Olhovchenkov wrote: > > On Wed, Oct 12, 2016 at 11:19:48AM +0200, Julien Charbon wrote: > > > >>> if INP_WLOCK is like spinlock -- this is dead lock. > >>> if INP_WLOCK is like mutex -- thread1 resheduled. > >> > >> Thanks, I understand you question now. No an interrupt cannot bypass a > >> lock: Here INP_WLOCK is like mutex -- thread1 resheduled. > > > > Thanks, nice. > > > >>>>> As I remeber race created by call tcp_twstart() at time of end > >>>>> tcp_close(), at path sofree()-tcp_usr_detach() and unexpected > >>>>> INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in > >>>>> tcp_twstart() > >>>> > >>>> Exactly, thus the current fix is: If you already have the INP_DROPPED > >>>> flag set you are not allowed to call tcp_twstart(), actually it is a > >>>> good candidate for a new INVARIANT. Let me add that. > >>>> > >>>>> After check source code I am found invocation of tcp_twstart() in > >>>>> sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c, > >>>>> sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c. > >>>>> > >>>>> Invocation from sys/netinet/tcp_stacks/fastpath.c and > >>>>> sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now > >>>>> will be OK. > >>>>> > >>>>> Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and > >>>>> sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed > >>>>> INP_WLOCK. Is this OK? > >>>>> > >>>>> Can be thread A wants do_peer_close() directed from chelsio IRQ > >>>>> handler, bypass tcp_input()? > >>>> > >>>> If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and > >>>> t4_cpl_io.c before calling tcp_twstart(). > >>> > >>> Yes, and you remeber: sys/netinet/tcp_subr.c > >>> > >>> 1535 struct tcpcb * > >>> 1536 tcp_close(struct tcpcb *tp) > >>> 1537 { > >>> ... 
> >>> 1569 INP_WUNLOCK(inp); > >>> 1570 ACCEPT_LOCK(); > >>> 1571 SOCK_LOCK(so); > >>> 1572 so->so_state &= ~SS_PROTOREF; > >>> 1573 sofree(so); > >>> 1574 return (NULL); > >>> > >>> sofree() call tcp_usr_detach() and in tcp_usr_detach() we have > >>> unexpected INP_TIMEWAIT. > >> > >> I see, thus just for the context: The TCP stack in sys/dev/cxgb* is a > >> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a > >> separate/side TCP stack that is used only with TCP_OFFLOAD option. > >> > >> This TOE TCP stack actually has its own set of detach()/input() > >> functions and seems to check INP_DROPPED flag properly. I guess @np > >> check fixes in socket TCP stack and decides which one can also impact > >> the Chelsio TOE TCP stack. Some bugs are only in socket TCP stack, some > >> are only in TOE TCP stack. > > > > I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE > > TCP stack and impact this to > > tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path. > > I see, I expect no problem on this side as tcp_timer_2msl() checks the > INP_TIMEWAIT flag and do not call tcp_close() if set. I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl() don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK() and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected INP_TIMEWAIT in tcp_usr_detach(). ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 11:19:48AM +0200, Julien Charbon wrote: > > if INP_WLOCK is like spinlock -- this is dead lock. > > if INP_WLOCK is like mutex -- thread1 resheduled. > > Thanks, I understand you question now. No an interrupt cannot bypass a > lock: Here INP_WLOCK is like mutex -- thread1 resheduled. Thanks, nice. > >>> As I remeber race created by call tcp_twstart() at time of end > >>> tcp_close(), at path sofree()-tcp_usr_detach() and unexpected > >>> INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in > >>> tcp_twstart() > >> > >> Exactly, thus the current fix is: If you already have the INP_DROPPED > >> flag set you are not allowed to call tcp_twstart(), actually it is a > >> good candidate for a new INVARIANT. Let me add that. > >> > >>> After check source code I am found invocation of tcp_twstart() in > >>> sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c, > >>> sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c. > >>> > >>> Invocation from sys/netinet/tcp_stacks/fastpath.c and > >>> sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now > >>> will be OK. > >>> > >>> Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and > >>> sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed > >>> INP_WLOCK. Is this OK? > >>> > >>> Can be thread A wants do_peer_close() directed from chelsio IRQ > >>> handler, bypass tcp_input()? > >> > >> If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and > >> t4_cpl_io.c before calling tcp_twstart(). > > > > Yes, and you remeber: sys/netinet/tcp_subr.c > > > > 1535 struct tcpcb * > > 1536 tcp_close(struct tcpcb *tp) > > 1537 { > > ... > > 1569 INP_WUNLOCK(inp); > > 1570 ACCEPT_LOCK(); > > 1571 SOCK_LOCK(so); > > 1572 so->so_state &= ~SS_PROTOREF; > > 1573 sofree(so); > > 1574 return (NULL); > > > > sofree() call tcp_usr_detach() and in tcp_usr_detach() we have > > unexpected INP_TIMEWAIT. 
>
>  I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
> separate/side TCP stack that is used only with TCP_OFFLOAD option.
>
>  This TOE TCP stack actually has its own set of detach()/input()
> functions and seems to check INP_DROPPED flag properly.  I guess @np
> check fixes in socket TCP stack and decides which one can also impact
> the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, some
> are only in TOE TCP stack.

I am worried about the other direction: the Chelsio TOE TCP stack setting
INP_TIMEWAIT and this affecting the
tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.
Re: 11.0 stuck on high network load
On Wed, Oct 12, 2016 at 10:18:18AM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 10/11/16 2:11 PM, Slawa Olhovchenkov wrote: > > On Tue, Oct 11, 2016 at 09:20:17AM +0200, Julien Charbon wrote: > >> Then threads are competing for the INP_WLOCK lock. For the example, > >> let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and > >> racing for this INP_WLOCK: > >> > >> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964 > >> > >> And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and > >> racing for this INP_WLOCK: > >> > >> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323 > >> > >> That leads to two cases: > >> > >> o Thread A wins the race: > >> > >> Thread A will continue tcp_input() as usal and INP_DROPPED flags is > >> not set and inp is still in TCP hash table. > >> Thread B is waiting on thread A to release INP_WLOCK after finishing > >> tcp_input() processing, and thread B will continue > >> tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing. > >> > >> o Thread B wins the race: > >> > >> Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp > >> INP_DROPPED is set and inp being removed from TCP hash table. > >> In parallel, thread A has found the inp in TCP hash before is was > >> removed, and waiting on the found inp INP_WLOCK lock. > >> Once thread B has released the INP_WLOCK lock, thread A gets this lock > >> and sees the INP_DROPPED flag and do "goto findpcb" but here because the > >> inp is not more in TCP hash table and it will not be find again by > >> in_pcblookup_mbuf(). > >> > >> Hopefully I am clear enough here. > > > > Thanks, very clear. > > Small qeustion: when both thread run on same CPU core, INP_WLOCK will > > be re-schedule? > > Hmm, a thread can re-scheduled but not a lock. Thus no sure I > understand your question here. 
:) I am don't know how work INP_WLOCK in this case (all on same cpu): thread1: INP_WLOCK -interrupt-- thread2: INP_WLOCK if INP_WLOCK is like spinlock -- this is dead lock. if INP_WLOCK is like mutex -- thread1 resheduled. > > As I remeber race created by call tcp_twstart() at time of end > > tcp_close(), at path sofree()-tcp_usr_detach() and unexpected > > INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in > > tcp_twstart() > > Exactly, thus the current fix is: If you already have the INP_DROPPED > flag set you are not allowed to call tcp_twstart(), actually it is a > good candidate for a new INVARIANT. Let me add that. > > > After check source code I am found invocation of tcp_twstart() in > > sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c, > > sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c. > > > > Invocation from sys/netinet/tcp_stacks/fastpath.c and > > sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now > > will be OK. > > > > Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and > > sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed > > INP_WLOCK. Is this OK? > > > > Can be thread A wants do_peer_close() directed from chelsio IRQ > > handler, bypass tcp_input()? > > If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and > t4_cpl_io.c before calling tcp_twstart(). Yes, and you remeber: sys/netinet/tcp_subr.c 1535 struct tcpcb * 1536 tcp_close(struct tcpcb *tp) 1537 { ... 1569 INP_WUNLOCK(inp); 1570 ACCEPT_LOCK(); 1571 SOCK_LOCK(so); 1572 so->so_state &= ~SS_PROTOREF; 1573 sofree(so); 1574 return (NULL); sofree() call tcp_usr_detach() and in tcp_usr_detach() we have unexpected INP_TIMEWAIT. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
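The window described in this message (the close path marks the pcb as dropped and releases the lock, then another path tries to enter time-wait) can be replayed deterministically in userland. A minimal sketch, assuming hypothetical names (`demo_close_first_half()`, `demo_twstart()`) and flag values that are not the kernel's; the lock is a plain boolean because the replay is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: these are NOT the kernel's definitions. */
#define DEMO_TIMEWAIT 0x1u
#define DEMO_DROPPED  0x2u

struct demo_inp {
    unsigned flags;   /* DEMO_* bits */
    bool     locked;  /* toy lock, good enough for a sequential replay */
};

static void demo_lock(struct demo_inp *inp)   { assert(!inp->locked); inp->locked = true; }
static void demo_unlock(struct demo_inp *inp) { assert(inp->locked);  inp->locked = false; }

/*
 * First half of the close path: mark the pcb dropped, then release
 * the lock before the socket teardown continues.
 */
static void
demo_close_first_half(struct demo_inp *inp)
{
    demo_lock(inp);
    inp->flags |= DEMO_DROPPED;
    demo_unlock(inp);           /* the window opens here */
}

/*
 * Guarded time-wait entry: refuse when the pcb is already dropped.
 * Returns true if time-wait was entered.
 */
static bool
demo_twstart(struct demo_inp *inp)
{
    bool entered = false;

    demo_lock(inp);
    if ((inp->flags & DEMO_DROPPED) == 0) {
        inp->flags |= DEMO_TIMEWAIT;
        entered = true;
    }
    demo_unlock(inp);
    return entered;
}
```

With the guard in place, a pcb that went through `demo_close_first_half()` never gains `DEMO_TIMEWAIT`, so the later detach step would not see the unexpected combination.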
Re: 11.0 stuck on high network load
On Tue, Oct 11, 2016 at 09:20:17AM +0200, Julien Charbon wrote: > Then threads are competing for the INP_WLOCK lock. For the example, > let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and > racing for this INP_WLOCK: > > https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964 > > And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and > racing for this INP_WLOCK: > > https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323 > > That leads to two cases: > > o Thread A wins the race: > > Thread A will continue tcp_input() as usal and INP_DROPPED flags is > not set and inp is still in TCP hash table. > Thread B is waiting on thread A to release INP_WLOCK after finishing > tcp_input() processing, and thread B will continue > tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing. > > o Thread B wins the race: > > Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp > INP_DROPPED is set and inp being removed from TCP hash table. > In parallel, thread A has found the inp in TCP hash before is was > removed, and waiting on the found inp INP_WLOCK lock. > Once thread B has released the INP_WLOCK lock, thread A gets this lock > and sees the INP_DROPPED flag and do "goto findpcb" but here because the > inp is not more in TCP hash table and it will not be find again by > in_pcblookup_mbuf(). > > Hopefully I am clear enough here. Thanks, very clear. Small qeustion: when both thread run on same CPU core, INP_WLOCK will be re-schedule? As I remeber race created by call tcp_twstart() at time of end tcp_close(), at path sofree()-tcp_usr_detach() and unexpected INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in tcp_twstart() After check source code I am found invocation of tcp_twstart() in sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c, sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c. 

The invocations from sys/netinet/tcp_stacks/fastpath.c and
sys/netinet/tcp_input.c are guarded by the INP_WLOCK taken in tcp_input(), so
they will be OK now.

The invocations from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and
sys/dev/cxgbe/tom/t4_cpl_io.c are not clear to me: I see an independent
INP_WLOCK there. Is this OK?

Could thread A call do_peer_close() directly from the Chelsio IRQ handler,
bypassing tcp_input()?
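The "thread B wins the race" case quoted above can be replayed without real threads. A minimal sketch, assuming a tiny array in place of the TCP hash table and hypothetical names throughout (`demo_lookup()`, `demo_drop()`, `demo_input()`): because the drop step both sets the flag and unhashes the pcb, the retried lookup cannot return the same pcb, so the loop ends.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DROPPED    0x1u   /* hypothetical value, not the kernel's */
#define DEMO_TABLE_SIZE 4

struct demo_inp {
    int      port;
    unsigned flags;
};

/* Hypothetical lookup table standing in for the TCP hash. */
static struct demo_inp *demo_table[DEMO_TABLE_SIZE];

static struct demo_inp *
demo_lookup(int port)
{
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++)
        if (demo_table[i] != NULL && demo_table[i]->port == port)
            return demo_table[i];
    return NULL;
}

/* Drop step: set the flag AND remove the pcb from the table. */
static void
demo_drop(struct demo_inp *inp)
{
    inp->flags |= DEMO_DROPPED;
    for (size_t i = 0; i < DEMO_TABLE_SIZE; i++)
        if (demo_table[i] == inp)
            demo_table[i] = NULL;
}

/*
 * Input path.  `racer`, if non-NULL, runs once between the lookup and
 * the flag check to replay "the other thread got the lock first".
 * Returns the pcb delivered to, or NULL; *lookups counts the lookups.
 * The loop relies on the flag only ever being set by demo_drop().
 */
static struct demo_inp *
demo_input(int port, void (*racer)(struct demo_inp *), int *lookups)
{
    struct demo_inp *inp;

    *lookups = 0;
    for (;;) {
        inp = demo_lookup(port);
        (*lookups)++;
        if (inp == NULL)
            return NULL;
        if (racer != NULL) {
            racer(inp);
            racer = NULL;
        }
        if ((inp->flags & DEMO_DROPPED) == 0)
            return inp;
        /* Dropped while we waited for the lock: retry the lookup. */
    }
}
```

Calling `demo_input(port, demo_drop, &n)` on a hashed pcb performs exactly two lookups in this sketch: the first finds the pcb, the second finds nothing.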
Re: 11.0 stuck on high network load
On Mon, Oct 10, 2016 at 05:44:21PM +0200, Julien Charbon wrote: > >> can check the current other usages of goto findpcb in tcp_input(). The > >> rational here being: > >> > >> - Behavior before the patch: If the inp we found was deleted then goto > >> findpcb. > >> - Behavior after the patch: If the inp we found was deleted or dropped > >> then goto findpcb. > >> > >> I just prefer having the same behavior applied everywhere: If > >> tcp_input() loses the inp lock race and the inp was deleted or dropped > >> then retry to find a new inpcb to deliver to. > >> > >> But you are right dropping the packet here will also fix the issue. > >> > >> Then the review process becomes quite helpful because people can argue: > >> Dropping here is better because "blah", or goto findpcb is better > >> because "bluh", etc. And at the review end you have a nice final patch. > >> > >> https://reviews.freebsd.org/D8211 > > > > I am not sure, I am see to > > > > sys/netinet/in_pcb.h:#defineINP_DROPPED 0x0400 /* > protocol drop flag */ > > > > and think this is a flag 'all packets must be droped' > > Hm, I believe this flag means "this inp has been dropped by the TCP > stack, so don't use it anymore". Actually this flag is better described > in the function that sets it: > > "(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid > future packet delivery or event notification when a socket remains open > but TCP has closed." > > https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320 > > /* > * in_pcbdrop() removes an inpcb from hashed lists, releasing its > address and > * port reservation, and preventing it from being returned by inpcb lookups. > * > * It is used by TCP to mark an inpcb as unused and avoid future packet > * delivery or event notification when a socket remains open but TCP has > * closed. 
This might occur as a result of a shutdown()-initiated TCP close > * or a RST on the wire, and allows the port binding to be reused while > still > * maintaining the invariant that so_pcb always points to a valid inpcb > until > * in_pcbdetach(). > * > */ > void > in_pcbdrop(struct inpcb *inp) > { > inp->inp_flags |= INP_DROPPED; > ... > > The classical example where "goto findpcb" is useful: You receive a > new connection request with a TCP SYN packet and this packet is unlucky > and reached a inp being dropped: > > - with "goto findpcb" approach, the next lookup will most likely find > the LISTEN inp and start the TCP hand-shake as usual > - with "drop the packet" approach, the TCP client will need to > re-transmit a TCP SYN packet > > It is not because a packet was unlucky once that it deserves to be > dropped. :) Thanks for explaining, very helpfull. In this situation (TCP SYN with same 4-tuple as existing socket) allocate new PCB is best. But for this we must destroy current PCB. I am think INP_WUNLOCK(inp) don't destroy it and in_pcblookup_mbuf find it again (I am think in_pcblookup_mbuf find this PCB on first turn). I am assume for classical example in_pcbrele_wlocked(inp) free and destroy current PCB for possibility in_pcblookup_mbuf allocate new one. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
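The SYN example quoted above (retry the lookup versus drop the packet) can be sketched as follows. This is an illustration only, assuming a hypothetical two-field match where `peer_port == 0` marks a listening pcb and `hashed == false` marks a pcb already removed by the drop step; none of these names exist in the FreeBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_pcb {
    int  local_port;
    int  peer_port;   /* 0 means wildcard, i.e. a listening pcb */
    bool hashed;      /* false once dropped from the lookup table */
};

enum demo_policy { DEMO_RETRY, DEMO_DROP_PACKET };

/* Prefer an exact match on both ports, fall back to a listener. */
static struct demo_pcb *
demo_lookup(struct demo_pcb *pcbs, size_t n, int local_port, int peer_port)
{
    struct demo_pcb *listener = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!pcbs[i].hashed || pcbs[i].local_port != local_port)
            continue;
        if (pcbs[i].peer_port == peer_port)
            return &pcbs[i];
        if (pcbs[i].peer_port == 0)
            listener = &pcbs[i];
    }
    return listener;
}

/*
 * Handle a SYN whose first lookup hit `stale`, a pcb that was dropped
 * in the meantime.  With DEMO_RETRY the second lookup can reach the
 * listener; with DEMO_DROP_PACKET the client has to retransmit.
 */
static struct demo_pcb *
demo_deliver_after_race(struct demo_pcb *pcbs, size_t n,
    struct demo_pcb *stale, int local_port, int peer_port,
    enum demo_policy policy)
{
    assert(!stale->hashed);
    if (policy == DEMO_DROP_PACKET)
        return NULL;
    return demo_lookup(pcbs, n, local_port, peer_port);
}
```

In this sketch the retry policy hands the SYN to the listening pcb on the same local port, while the drop policy returns NULL and leaves recovery to the client's retransmission timer.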
Re: 11.0 stuck on high network load
On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote: > > On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote: > >> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote: > >>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote: > >>> > >>>> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the > >>>> process continues and calls INP_WUNLOCK() here: > >>>> > >>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568 > >>> > >>> Look also to sys/netinet/tcp_timewait.c:488 > >>> > >>> And check other locks from r160549 > >> > >> You are right, and here the a fix proposal for this issue: > >> > >> Fix a double-free when an inp transitions to INP_TIMEWAIT state after > >> having been dropped > >> https://reviews.freebsd.org/D8211 > >> > >> It basically enforces in_pcbdrop() logic in tcp_input(): A INP_DROPPED > >> inpcb should never be proceed further. > >> > >> Slawa, as you are the only one to reproduce this issue currently, could > >> test this patch? (And remove the temporary patch I did provided to you > >> before). > >> > >> I will wait for your tests results before pushing further. > >> > >> Thanks! > >> > >> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c > >> index c72f01f..37f27e0 100644 > >> --- a/sys/netinet/tcp_input.c > >> +++ b/sys/netinet/tcp_input.c > >> @@ -921,6 +921,16 @@ findpcb: > >> goto dropwithreset; > >> } > >> INP_WLOCK_ASSERT(inp); > >> + /* > >> +* While waiting for inp lock during the lookup, another thread > >> +* can have droppedt the inpcb, in which case we need to loop back > >> +* and try to find a new inpcb to deliver to. > >> +*/ > >> + if (inp->inp_flags & INP_DROPPED) { > >> + INP_WUNLOCK(inp); > >> + inp = NULL; > >> + goto findpcb; > > > > Are you sure about this goto? > > Can this cause infinite loop by found same inpcb? > > May be drop packet is more correct? 
> > Good question: Infinite loop is not possible here, as the next TCP > hash lookup will return NULL or a fresh new and not dropped inp. You I am not expert in this api and don't see cause of this: I am assume hash lookup don't remove from hash returned args and I am don't see any removing of this inp. Why hash lookup don't return same inp? (assume this input patch interrupt callout code on the same CPU core). > can check the current other usages of goto findpcb in tcp_input(). The > rational here being: > > - Behavior before the patch: If the inp we found was deleted then goto > findpcb. > - Behavior after the patch: If the inp we found was deleted or dropped > then goto findpcb. > > I just prefer having the same behavior applied everywhere: If > tcp_input() loses the inp lock race and the inp was deleted or dropped > then retry to find a new inpcb to deliver to. > > But you are right dropping the packet here will also fix the issue. > > Then the review process becomes quite helpful because people can argue: > Dropping here is better because "blah", or goto findpcb is better > because "bluh", etc. And at the review end you have a nice final patch. > > https://reviews.freebsd.org/D8211 I am not sure, I am see to sys/netinet/in_pcb.h:#defineINP_DROPPED 0x0400 /* protocol drop flag */ and think this is a flag 'all packets must be droped' ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0 stuck on high network load
On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote: > > Hi, > > On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote: > > On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote: > > > >> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the > >> process continues and calls INP_WUNLOCK() here: > >> > >> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568 > > > > Look also to sys/netinet/tcp_timewait.c:488 > > > > And check other locks from r160549 > > You are right, and here the a fix proposal for this issue: > > Fix a double-free when an inp transitions to INP_TIMEWAIT state after > having been dropped > https://reviews.freebsd.org/D8211 > > It basically enforces in_pcbdrop() logic in tcp_input(): A INP_DROPPED > inpcb should never be proceed further. > > Slawa, as you are the only one to reproduce this issue currently, could > test this patch? (And remove the temporary patch I did provided to you > before). > > I will wait for your tests results before pushing further. > > Thanks! > > diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c > index c72f01f..37f27e0 100644 > --- a/sys/netinet/tcp_input.c > +++ b/sys/netinet/tcp_input.c > @@ -921,6 +921,16 @@ findpcb: > goto dropwithreset; > } > INP_WLOCK_ASSERT(inp); > + /* > +* While waiting for inp lock during the lookup, another thread > +* can have droppedt the inpcb, in which case we need to loop back > +* and try to find a new inpcb to deliver to. > +*/ > + if (inp->inp_flags & INP_DROPPED) { > + INP_WUNLOCK(inp); > + inp = NULL; > + goto findpcb; Are you sure about this goto? Can this cause infinite loop by found same inpcb? May be drop packet is more correct? ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0 stuck on high network load
On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote: > > Hi, > > On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote: > > On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote: > > > >> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the > >> process continues and calls INP_WUNLOCK() here: > >> > >> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568 > > > > Look also to sys/netinet/tcp_timewait.c:488 > > > > And check other locks from r160549 > > You are right, and here the a fix proposal for this issue: > > Fix a double-free when an inp transitions to INP_TIMEWAIT state after > having been dropped > https://reviews.freebsd.org/D8211 > > It basically enforces in_pcbdrop() logic in tcp_input(): A INP_DROPPED > inpcb should never be proceed further. > > Slawa, as you are the only one to reproduce this issue currently, could > test this patch? (And remove the temporary patch I did provided to you > before). > > I will wait for your tests results before pushing further. OK, I am will try it tomorrow Thanks! > Thanks! > > diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c > index c72f01f..37f27e0 100644 > --- a/sys/netinet/tcp_input.c > +++ b/sys/netinet/tcp_input.c > @@ -921,6 +921,16 @@ findpcb: > goto dropwithreset; > } > INP_WLOCK_ASSERT(inp); > + /* > +* While waiting for inp lock during the lookup, another thread > +* can have droppedt the inpcb, in which case we need to loop back > +* and try to find a new inpcb to deliver to. 
> +*/ > + if (inp->inp_flags & INP_DROPPED) { > + INP_WUNLOCK(inp); > + inp = NULL; > + goto findpcb; > + } > if ((inp->inp_flowtype == M_HASHTYPE_NONE) && > (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) && > ((inp->inp_socket == NULL) || > @@ -981,6 +991,10 @@ relocked: > if (in_pcbrele_wlocked(inp)) { > inp = NULL; > goto findpcb; > + } else if (inp->inp_flags & INP_DROPPED) { > + INP_WUNLOCK(inp); > + inp = NULL; > + goto findpcb; > } > } else > ti_locked = TI_RLOCKED; > @@ -1040,6 +1054,10 @@ relocked: > if (in_pcbrele_wlocked(inp)) { > inp = NULL; > goto findpcb; > + } else if (inp->inp_flags & INP_DROPPED) { > + INP_WUNLOCK(inp); > + inp = NULL; > + goto findpcb; > } > goto relocked; > } else > > -- > Julien > ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
11.0-RELEASE and mbuf-related trace
Does anybody have a comment on this?

While debugging the TCP-related freeze I caught a strange mbuf-related
freeze (it looks like a recursive lock on the UMA Slabs keg), with this trace:

last pid: 49575;  load averages:  2.00,  2.05,  3.75  up 1+01:12:08    22:13:42
853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
CPU 0:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 8659M Active, 8385M Inact, 107G Wired, 1325M Free
ARC: 99G Total, 88G MFU, 10G MRU, 32K Anon, 167M Header, 529M Other
Swap: 32G Total, 32G Free

# procstat -k -k 1046
  PID    TID COMM             TDNAME           KSTACK
 1046 100686 nginx            -                mi_switch+0xd2 critical_exit+0x7e lapic_handle_timer+0xb1 Xtimerint+0x8c __mtx_lock_sleep+0x168 zone_fetch_slab+0x47 zone_import+0x52 zone_alloc_item+0x36 keg_alloc_slab+0x63 keg_fetch_slab+0x16e zone_fetch_slab+0x6e zone_import+0x52 uma_zalloc_arg+0x36e m_getm2+0x14f m_uiotombuf+0x64 sosend_generic+0x356 soo_write+0x42 dofilewrite+0x87

Some of the information below may be decoded incorrectly.
(kgdb) thread 809 [Switching to thread 809 (Thread 100686)] (kgdb) bt #0 sched_switch (td=0xf8014485f500, newtd=0xf8011422b000, flags=) at /usr/src/sys/kern/sched_ule.c:1973 #1 0x804a8d92 in mi_switch (flags=, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0x804a6bee in critical_exit () at /usr/src/sys/kern/kern_switch.c:218 #3 0x80771701 in lapic_handle_timer (frame=0xfe2021699340) at /usr/src/sys/x86/x86/local_apic.c:1184 #4 #5 0x804de424 in lock_delay (la=) at /usr/src/sys/kern/subr_lock.c:127 #6 0x80484dc8 in __mtx_lock_sleep (c=, tid=18446735283061126400, opts=, file=, line=) at /usr/src/sys/kern/kern_mutex.c:514 #7 0x806a4257 in zone_fetch_slab (zone=0xf8207ffe6000, keg=0xf8207ffe7180, flags=1) at /usr/src/sys/vm/uma_core.c:2371 #8 0x806a4312 in zone_import (zone=, bucket=, max=, flags=) at /usr/src/sys/vm/uma_core.c:2501 #9 0x806a0986 in zone_alloc_item (zone=0xf8207ffe6000, udata=0x0, flags=1) at /usr/src/sys/vm/uma_core.c:2591 #10 0x806a2463 in keg_alloc_slab (keg=0xf8010f9ecd80, zone=0xf80114236000, wait=1) at /usr/src/sys/vm/uma_core.c:964 #11 0x806a48ce in keg_fetch_slab (keg=, zone=, flags=) at /usr/src/sys/vm/uma_core.c:2343 #12 0x806a427e in zone_fetch_slab (zone=, keg=, flags=) at /usr/src/sys/vm/uma_core.c:2375 #13 0x806a4312 in zone_import (zone=, bucket=, max=, flags=) at /usr/src/sys/vm/uma_core.c:2501 #14 0x806a147e in zone_alloc_bucket (flags=2, zone=, udata=) at /usr/src/sys/vm/uma_core.c:2531 #15 uma_zalloc_arg (zone=, udata=0xf8105a700300, flags=2) at /usr/src/sys/vm/uma_core.c:2257 #16 0x8048231f in m_getjcl (how=2, type=1, flags=, size=4096) at /usr/src/sys/kern/kern_mbuf.c:829 #17 m_getm2 (m=, len=, how=, type=, flags=) at /usr/src/sys/kern/kern_mbuf.c:861 #18 0x80516044 in m_uiotombuf (uio=0xf818dcfbaec0, how=60, len=, align=0, flags=0) at /usr/src/sys/kern/uipc_mbuf.c:1535 #19 0x8051ce56 in sosend_generic (so=, addr=0x0, uio=, top=, control=, flags=, td=) at /usr/src/sys/kern/uipc_socket.c:1332 #20 0x804fd872 in soo_write (fp=, 
uio=0xf818dcfbaec0, active_cred=, flags=, td=) at /usr/src/sys/kern/sys_socket.c:146 #21 0x804f5c97 in fo_write (fp=, uio=0xf818dcfbaec0, active_cred=0xc0, flags=0, td=) at /usr/src/sys/sys/file.h:311 #22 dofilewrite (td=0xf8014485f500, fd=1531, fp=0xf80ac09fa960, auio=0xf818dcfbaec0, offset=, flags=0) at /usr/src/sys/kern/sys_generic.c:590 #23 0x804f5978 in kern_writev (td=0xf8014485f500, fd=1531, auio=0xf818dcfbaec0) at /usr/src/sys/kern/sys_generic.c:506 #24 0x804f5be6 in sys_writev (td=0xf8014485f500, uap=0xfe2021699a40) at /usr/src/sys/kern/sys_generic.c:491 #25 0x806e4051 in syscallenter (td=0xf8014485f500, sa=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135 #26 amd64_syscall (td=0xf8014485f500, traced=0) at /usr/src/sys/amd64/amd64/trap.c:942 # vmstat -M /var/crash/vmcore.1 -z| grep -i mbuf mbuf_packet:256,
Re: 11.0 stuck on high network load
On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:

>  Thanks again to Slawa, for his numerous debug reports and always
> questioning my explanations.  His last question directly led to this
> finding.  He is testing a quick workaround patch to check if there is more.

Thanks very much! You were very helpful, explaining the details of the
FreeBSD TCP code, and you put a lot of work into this issue. I appreciate
all your help!
Thanks again!
Re: 11.0 stuck on high network load
On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:

> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the
> process continues and calls INP_WUNLOCK() here:
>
> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568

Look also at sys/netinet/tcp_timewait.c:488

And check the other locks from r160549.
Re: 11.0 stuck on high network load
On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote: > > Tracing command intr pid 12 tid 100026 td 0xf8011424b500 > > sched_switch() at 0x804c956d = sched_switch+0x6ad/frame > > 0xfe3876f0 > > mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe387720 > > critical_exit() at 0x804a6bee = critical_exit+0x7e/frame > > 0xfe387740 > > ipi_bitmap_handler() at 0x80775629 = ipi_bitmap_handler+0x79/frame > > 0xfe387780 > > Xipi_intr_bitmap_handler() at 0x806cc15e = > > Xipi_intr_bitmap_handler+0x8e/frame 0xfe387780 > > --- interrupt, rip = 0x80484c1f, rsp = 0xfe387850, rbp = > > 0xfe387850 --- > > __mtx_lock_flags() at 0x80484c1f = __mtx_lock_flags+0x2f/frame > > 0xfe387850 > > sodealloc() at 0x8051b992 = sodealloc+0x32/frame 0xfe387890 > > tcp_close() at 0x80618150 = tcp_close+0xd0/frame 0xfe3878c0 > > tcp_timer_2msl() at 0x8061dda3 = tcp_timer_2msl+0x1f3/frame > > 0xfe3878f0 > > softclock_call_cc() at 0x804b4ca9 = softclock_call_cc+0x179/frame > > 0xfe3879c0 > > softclock() at 0x804b5034 = softclock+0x44/frame 0xfe3879e0 > > intr_event_execute_handlers() at 0x8046c605 = > > intr_event_execute_handlers+0x95/frame 0xfe387a20 > > ithread_loop() at 0x8046cc26 = ithread_loop+0xa6/frame > > 0xfe387a70 > > fork_exit() at 0x8046a211 = fork_exit+0x71/frame 0xfe387ab0 > > fork_trampoline() at 0x806cb50e = fork_trampoline+0xe/frame > > 0xfe387ab0 > > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Nice stack traces, all threads are blocked in sodealloc() or soalloc() > and if you look at how mtx_lock(_global_mtx) and > mtx_unlock(_global_mtx) are used, it is hard to think about a > scenario that can lead to this state. > > I am still trying to reproduce your issue, without success so far. May be some hardware-related (low-speed CPU?). Yesternight I am collect new stack traces and kernel dump. May be I can see something? 
db> ps pid ppid pgrp uid state wmesg wchancmd 12 0 0 0 RL (threaded) [intr] 100023 RunQ[swi4: clock (8)] 100107 Run CPU 8 [irq291: ix0:q2] 11 0 0 0 RL (threaded) [idle] 100011 CanRun [idle: cpu8] cpuid= 8 dynamic pcpu = 0xfe201d69cf00 curthread= 0xf8012508d500: pid 12 "irq291: ix0:q2" curpcb = 0xfe2020ebcb80 fpcurthread = none idlethread = 0xf8011422c500: tid 100011 "idle: cpu8" curpmap = 0x80d49998 tssp = 0x80d7fcd0 commontssp = 0x80d7fcd0 rsp0 = 0xfe2020ebcb80 gs32p= 0x80d86528 ldt = 0x80d86568 tss = 0x80d86558 Tracing command nginx pid 1061 tid 101747 td 0xf8014b35b500 sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 0xfe2021b70330 mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe2021b70360 turnstile_wait() at 0x804ef177 = turnstile_wait+0x2a7/frame 0xfe2021b703a0 __rw_wlock_hard() at 0x8049c314 = __rw_wlock_hard+0x94/frame 0xfe2021b70430 in_lltable_lookup() at 0x80594823 = in_lltable_lookup+0x83/frame 0xfe2021b70450 arpresolve() at 0x8058d2aa = arpresolve+0x9a/frame 0xfe2021b704b0 ether_output() at 0x805755e2 = ether_output+0x2f2/frame 0xfe2021b70550 ip_output() at 0x805a4200 = ip_output+0x1390/frame 0xfe2021b706b0 tcp_output() at 0x806149d5 = tcp_output+0x17a5/frame 0xfe2021b70850 tcp_usr_disconnect() at 0x80620094 = tcp_usr_disconnect+0x74/frame 0xfe2021b70880 soclose() at 0x8051c238 = soclose+0x38/frame 0xfe2021b708b0 _fdrop() at 0x8045639a = _fdrop+0x1a/frame 0xfe2021b708d0 closef() at 0x80458a53 = closef+0x1e3/frame 0xfe2021b70960 closefp() at 0x804567ad = closefp+0x7d/frame 0xfe2021b709a0 amd64_syscall() at 0x806e4051 = amd64_syscall+0x2c1/frame 0xfe2021b70ab0 Xfast_syscall() at 0x806cb2bb = Xfast_syscall+0xfb/frame 0xfe2021b70ab0 --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8019dbeaa, rsp = 0x7fffe6a8, rbp = 0x7fffe6c0 --- Tracing command nginx pid 1060 tid 101749 td 0xf80126a53a00 sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 0xfe2021b7a240 mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe2021b7a270 turnstile_wait() at 
0x804ef177 = turnstile_wait+0x2a7/frame 0xfe2021b7a2b0 __rw_wlock_hard() at 0x8049c314 = __rw_wlock_hard+0x94/frame 0xfe2021b7a340 in_lltable_lookup() at 0x80594823 = in_lltable_lookup+0x83/frame
Re: 11.0 stuck on high network load
On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote: > > On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote: > >>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > >>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > >>>>> You can also use Dtrace and lockstat (especially with the lockstat -s > >>>>> option): > >>>>> > >>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > >>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > >>>>> > >>>>> But I am less familiar with Dtrace/lockstat tools. > >>>> > >>>> I am still use old kernel and got lockdown again. > >>>> Try using lockstat (I am save more output), interesting may be next: > >>>> > >>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 > >>>> events/sec) > >>>> > >>>> --- > >>>> Count indv cuml rcnt nsec Lock Caller > >>>> > >>>> 140839 74% 74% 0.0024659 tcpinp > >>>> tcp_tw_2msl_scan+0xc6 > >>>> > >>>> nsec -- Time Distribution -- count Stack > >>>> > >>>> 4096 | 913 tcp_twstart+0xa3 > >>>> > >>>> 8192 | 58191 > >>>> tcp_do_segment+0x201f > >>>> 16384 |@@ 29594 tcp_input+0xe1c > >>>> > >>>> 32768 | 23447 ip_input+0x15f > >>>> > >>>> 65536 |@@@16197 > >>>> 131072 |@ 8674 > >>>> 262144 | 3358 > >>>> 524288 | 456 > >>>>1048576 | 9 > >>>> --- > >>>> Count indv cuml rcnt nsec Lock Caller > >>>> > >>>> 49180 26% 100% 0.0015929 tcpinp > >>>> tcp_tw_2msl_scan+0xc6 > >>>> > >>>> nsec -- Time Distribution -- count Stack > >>>> > >>>> 4096 | 157 pfslowtimo+0x54 > >>>> > >>>> 8192 |@@@24796 > >>>> softclock_call_cc+0x179 > >>>> 16384 |@@ 11223 softclock+0x44 > >>>> > >>>> 32768 | 7426 > >>>> intr_event_execute_handlers+0x95 > >>>> 65536 |@@ 3918 > >>>> 131072 | 1363 > >>>> 262144 | 278 > >>>> 524288 | 19 > >>>> --- > >>> > >>> This is interesting, it seems that you have two call paths competing > >>> for INP locks here: > >>> > >>> - 
pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and > >>> > >>> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) > >> > >> My current hypothesis: > >> > >> nginx do write() (or may be close()?) to socket, kernel lock > >> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same > >> CPU core and tcp_tw_2msl_scan infinity locked on same inp. > >> > >> In this case you modification can't help, before next try we need some > >> like yeld(). > > > > Or may be locks leaks. > > Or both. > > You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is > infinitely blocked on INP_WLOCK() by "something" (that could be related > to write()). > > As I reached my limit of debugging without WITNESS, could you share
Re: nginx and FreeBSD11
On Mon, Sep 26, 2016 at 06:20:42PM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 12:33:55PM +0300, Slawa Olhovchenkov wrote:
> > OK, try this patch.
>
> Was the patch tested ?

No more AIO-related issues or nginx core dumps.
I can't get a long uptime, though, because of other issues (TCP locks and
mbuf related).
Re: 11.0 stuck on high network load
On Mon, Sep 26, 2016 at 01:57:03PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote: > > On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote: > >> On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > >>> > >>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > >>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > >>>>> You can also use Dtrace and lockstat (especially with the lockstat -s > >>>>> option): > >>>>> > >>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > >>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > >>>>> > >>>>> But I am less familiar with Dtrace/lockstat tools. > >>>> > >>>> I am still use old kernel and got lockdown again. > >>>> Try using lockstat (I am save more output), interesting may be next: > >>>> > >>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 > >>>> events/sec) > >>>> > >>>> --- > >>>> Count indv cuml rcnt nsec Lock Caller > >>>> > >>>> 140839 74% 74% 0.0024659 tcpinp > >>>> tcp_tw_2msl_scan+0xc6 > >>>> > >>>> nsec -- Time Distribution -- count Stack > >>>> > >>>> 4096 | 913 tcp_twstart+0xa3 > >>>> > >>>> 8192 | 58191 > >>>> tcp_do_segment+0x201f > >>>> 16384 |@@ 29594 tcp_input+0xe1c > >>>> > >>>> 32768 | 23447 ip_input+0x15f > >>>> > >>>> 65536 |@@@16197 > >>>> 131072 |@ 8674 > >>>> 262144 | 3358 > >>>> 524288 | 456 > >>>>1048576 | 9 > >>>> --- > >>>> Count indv cuml rcnt nsec Lock Caller > >>>> > >>>> 49180 26% 100% 0.0015929 tcpinp > >>>> tcp_tw_2msl_scan+0xc6 > >>>> > >>>> nsec -- Time Distribution -- count Stack > >>>> > >>>> 4096 | 157 pfslowtimo+0x54 > >>>> > >>>> 8192 |@@@24796 > >>>> softclock_call_cc+0x179 > >>>> 16384 |@@ 11223 softclock+0x44 > >>>> > >>>> 32768 | 7426 > >>>> intr_event_execute_handlers+0x95 > >>>> 65536 |@@ 3918 > >>>> 131072 | 1363 > >>>> 262144 | 278 > >>>> 524288 | 19 > >>>> --- > >>> > >>> This is interesting, it seems that you have two call paths 
competing > >>> for INP locks here: > >>> > >>> - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and > >>> > >>> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) > >> > >> My current hypothesis: > >> > >> nginx do write() (or may be close()?) to socket, kernel lock > >> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same > >> CPU core and tcp_tw_2msl_scan infinity locked on same inp. > >> > >> In this case you modification can't help, before next try we need some > >> like yeld(). > > > > Or may be locks leaks. > > Or both. > > Actually one extra debug thing you can do is launching lockstat with > below extra options: > > -H For Hold lock stats > -P To get the overall time > -s 20 To get the stackstrace > > To see who is holding the INP lock for so long. Thanks to Hiren for > pointing the -H option. At time of this graph I am collect output from `lockstat -PH -s 5 sleep 1` too and don't see any interesting -- I am think lock holded before lockstat run don't detected and don't showed. I still can show collected output, if you need. hundreds of lines. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
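As a small worked check on the lockstat output quoted above, the time-distribution buckets of the first entry can be summed and compared with its event count. The sketch below is only illustrative: `struct bucket`, `bucket_total()` and `pct_up_to()` are names made up here, the bucket values are copied from the quoted output, and the exact bound semantics of lockstat's `nsec` labels are not assumed (the function only compares labels).

```c
#include <stddef.h>

/* One row of a lockstat time distribution: bucket label and event count. */
struct bucket {
    unsigned long nsec;
    unsigned long count;
};

/* Sum of all bucket counts. */
unsigned long bucket_total(const struct bucket *b, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += b[i].count;
    return total;
}

/*
 * Whole-percent share (rounded down) of events that fall in buckets
 * labelled limit_nsec or lower; -1 if there are no events at all.
 */
int pct_up_to(const struct bucket *b, size_t n, unsigned long limit_nsec)
{
    unsigned long total = bucket_total(b, n);
    unsigned long below = 0;

    if (total == 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        if (b[i].nsec <= limit_nsec)
            below += b[i].count;
    return (int)(below * 100 / total);
}
```

Fed with the nine buckets of the tcp_twstart() path (913, 58191, 29594, 23447, 16197, 8674, 3358, 456, 9), the total comes to 140839, which matches the Count column of that entry, and the second entry's buckets sum to 49180 in the same way.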
Re: 11.0 stuck on high network load
On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote:

> >>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
> >>
> >> My current hypothesis:
> >>
> >> nginx do write() (or may be close()?) to socket, kernel lock
> >> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same
> >> CPU core and tcp_tw_2msl_scan infinity locked on same inp.
> >>
> >> In this case you modification can't help, before next try we need some
> >> like yeld().
> >
> > Or may be locks leaks.
> > Or both.
>
>  You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is
> infinitely blocked on INP_WLOCK() by "something" (that could be related
> to write()).
>
>  As I reached my limit of debugging without WITNESS, could you share
> your /etc/sysctl.conf, /boot/loader.conf files?  And any specific
> configuration you have (like having a Nginx workers affinity, Nginx
> special options, etc.).  Like that I can try to reproduce it on releng/11.0.

I am using a dual-socket server, E5-2620, with two Intel 10G NICs whose
IRQs are bound to CPUs 6..11. Nginx workers are bound to CPUs 0..5.
I.e. on CPU 0 there is only an nginx worker, and NIC IRQ handler
activity happens only on CPUs 6..11.

/boot/loader.conf:

kern.geom.label.gptid.enable="0"
zfs_load="YES"
generated by conf.pl # hw.memtest.tests=0
machdep.hyperthreading_allowed=0
kern.geom.label.disk_ident.enable=0
if_igb_load=yes
if_ix_load=yes
hw.ix.num_queues=3
hw.ix.rxd=4096
hw.ix.txd=4096
hw.ix.rx_process_limit=-1
hw.ix.tx_process_limit=-1
if_lagg_load=YES
net.link.lagg.default_use_flowid=0
accf_http_load=yes
aio_load=yes
cc_htcp_load=yes
kern.ipc.nmbclusters=1048576
net.inet.tcp.reass.maxsegments=32768
net.inet.tcp.hostcache.cachelimit=0
net.inet.tcp.hostcache.hashsize=32768
net.inet.tcp.syncache.hashsize=32768
#net.inet.tcp.tcbhashsize=262144
net.inet.tcp.tcbhashsize=65536
net.inet.tcp.maxtcptw=16384
kern.pin_default_swi=1
kern.pin_pcpu_swi=1
kern.hwpmc.nbuffers=131072
hw.cxgbe.qsize_rxq=16384
hw.cxgbe.qsize_txq=16384
hw.cxgbe.nrxq10g=3
kernel="kernel.VSTREAM"
kernels="kernel"
hw.mps.max_chains=3072
### hw.vga.textmode=1
uhci_load=yes
ohci_load=yes
ehci_load=yes
xhci_load=yes
ukbd_load=yes
umass_load=yes
### boot_multicons="YES"
boot_serial="YES"
comconsole_speed="115200"
comconsole_port=760
#console="comconsole,vidconsole"
console="vidconsole,comconsole"
hint.uart.0.flags="0x00"
hint.uart.1.flags="0x10"

/etc/sysctl.conf:

kern.random.sys.harvest.ethernet=0
kern.threads.max_threads_per_proc=2
net.inet.ip.maxfragpackets=32768
net.inet.ip.fastforwarding=1
kern.ipc.somaxconn=4096
kern.ipc.nmbjumbop=2097152
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.sendspace=2097152
#net.inet.tcp.maxtcptw=444800
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.msl=1000
net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.per_cpu_timers=1
#net.inet.tcp.syncookies=0
net.inet6.ip6.auto_linklocal=0
kern.maxfiles=30
kern.maxfilesperproc=8
#hw.intr_storm_threshold=9000
vfs.zfs.prefetch_disable=1
vfs.zfs.vdev.max_pending=1000
vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_norw=0
vfs.zfs.l2arc_write_boost=134217728
vfs.zfs.l2arc_write_max=33554432
vfs.aio.max_aio_procs=512
vfs.aio.max_aio_queue_per_proc=8192
vfs.aio.max_aio_per_proc=8192
vfs.aio.max_aio_queue=65536
net.inet.tcp.finwait2_timeout=5000
kern.corefile=/tmp/%N.%P.core
kern.sugid_coredump=1

Now (after this nginx lockout) I am using your patch with a modification:
return NULL at the write lock. Since then I see only mbuf-related issues.
Re: 11.0 stuck on high network load
On Mon, Sep 26, 2016 at 10:51:07AM +0200, Julien Charbon wrote: > > 1049 kqread- I 145:58.35 nginx: worker process (nginx) > > 1050 kqread- I 136:33.36 nginx: worker process (nginx) > > 1051 kqread- I 140:59.73 nginx: worker process (nginx) > > 1052 kqread- I 137:18.12 nginx: worker process (nginx) > > > > pid 1046 is nginx running on CPU0 (affinity mask set). > > > > # procstat -k -k 1046 > > PIDTID COMM TDNAME KSTACK > > 1046 100686 nginx-mi_switch+0xd2 > > critical_exit+0x7e lapic_handle_timer+0xb1 Xtimerint+0x8c > > __mtx_lock_sleep+0x168 zone_fetch_slab+0x47 zone_import+0x52 > > zone_alloc_item+0x36 keg_alloc_slab+0x63 keg_fetch_slab+0x16e > > zone_fetch_slab+0x6e zone_import+0x52 uma_zalloc_arg+0x36e m_getm2+0x14f > > m_uiotombuf+0x64 sosend_generic+0x356 soo_write+0x42 dofilewrite+0x87 > > > > Tracing command nginx pid 1046 tid 100686 td 0xf8014485f500 > > sched_switch() at 0x804c956d = sched_switch+0x6ad/frame > > 0xfe20216992a0 /usr/src/sys/kern/sched_ule.c:1973 > > mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe20216992d0 > > /usr/src/sys/kern/kern_synch.c:465 > > critical_exit() at 0x804a6bee = critical_exit+0x7e/frame > > 0xfe20216992f0 /usr/src/sys/kern/kern_switch.c:219 > > lapic_handle_timer() at 0x80771701 = lapic_handle_timer+0xb1/frame > > 0xfe2021699330 /usr/src/sys/x86/x86/local_apic.c:1185 > > Xtimerint() at 0x806cbbcc = Xtimerint+0x8c/frame 0xfe2021699330 > > /usr/src/sys/amd64/amd64/apic_vector.S:135 > > --- interrupt, rip = 0x804de424, rsp = 0xfe2021699400, rbp = > > 0xfe2021699420 --- > > lock_delay() at 0x804de424 = lock_delay+0x54/frame > > 0xfe2021699420 /usr/src/sys/kern/subr_lock.c:127 > > __mtx_lock_sleep() at 0x80484dc8 = __mtx_lock_sleep+0x168/frame > > 0xfe20216994a0 /usr/src/sys/kern/kern_mutex.c:512 > > zone_fetch_slab() at 0x806a4257 = zone_fetch_slab+0x47/frame > > 0xfe20216994e0 /usr/src/sys/vm/uma_core.c:2378 > > zone_import() at 0x806a4312 = zone_import+0x52/frame > > 0xfe2021699530 /usr/src/sys/vm/uma_core.c:2501 > > 
zone_alloc_item() at 0x806a0986 = zone_alloc_item+0x36/frame > > 0xfe2021699570 /usr/src/sys/vm/uma_core.c:2591 > > keg_alloc_slab() at 0x806a2463 = keg_alloc_slab+0x63/frame > > 0xfe20216995d0 /usr/src/sys/vm/uma_core.c:965 > > keg_fetch_slab() at 0x806a48ce = keg_fetch_slab+0x16e/frame > > 0xfe2021699620 /usr/src/sys/vm/uma_core.c:2349 > > zone_fetch_slab() at 0x806a427e = zone_fetch_slab+0x6e/frame > > 0xfe2021699660 /usr/src/sys/vm/uma_core.c:2375 > > zone_import() at 0x806a4312 = zone_import+0x52/frame > > 0xfe20216996b0 /usr/src/sys/vm/uma_core.c:2501 > > uma_zalloc_arg() at 0x806a147e = uma_zalloc_arg+0x36e/frame > > 0xfe2021699720 /usr/src/sys/vm/uma_core.c:2531 > > m_getm2() at 0x8048231f = m_getm2+0x14f/frame 0xfe2021699790 > > /usr/src/sys/kern/kern_mbuf.c:830 > > m_uiotombuf() at 0x80516044 = m_uiotombuf+0x64/frame > > 0xfe20216997e0 /usr/src/sys/kern/uipc_mbuf.c:1535 > > sosend_generic() at 0x8051ce56 = sosend_generic+0x356/frame > > 0xfe20216998a0 > > soo_write() at 0x804fd872 = soo_write+0x42/frame 0xfe20216998d0 > > dofilewrite() at 0x804f5c97 = dofilewrite+0x87/frame > > 0xfe2021699920 > > kern_writev() at 0x804f5978 = kern_writev+0x68/frame > > 0xfe2021699970 > > sys_writev() at 0x804f5be6 = sys_writev+0x36/frame > > 0xfe20216999a0 > > amd64_syscall() at 0x806e4051 = amd64_syscall+0x2c1/frame > > 0xfe2021699ab0 > > Xfast_syscall() at 0x806cb2bb = Xfast_syscall+0xfb/frame > > 0xfe2021699ab0 > > --- syscall (121, FreeBSD ELF64, sys_writev), rip = 0x8019cc6ba, rsp = > > 0x7fffd688, rbp = 0x7fffd6c0 --- > > This call stack is quite interesting: > 1: A process is calling writev() > 2: Kernel calls sosend_generic() that starts allocating memory > 3: This allocation is then interrupted by the timer interrupt handler > [that could actually trigger tcp_tw_2msl_scan(reuse=0)] > 4: The timer interrupt handler seems to wait on sched_switch() No, this is more interesting: double call (recuersion) to zone_import()! 
>  And fun fact:  When sosend_generic() calls m_uiotombuf() it does not
> hold INP_WLOCK yet...

Yes, this is not INP_WLOCK related; it looks like a separate error.
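The "double call to zone_import()" observation can be checked mechanically on a `procstat -k -k` style stack string. The helper below is a hedged sketch: `count_frames()` is a name invented here, and it assumes frames are written as `name+0xoffset` separated by single spaces, as in the procstat output quoted in this thread.

```c
#include <string.h>

/*
 * Count how many frames of function `fn` appear in a kernel stack
 * string whose frames look like "name+0xoff" and are separated by
 * spaces.  Only whole function names are counted, so "fetch_slab"
 * does not match "zone_fetch_slab+0x47".
 */
int count_frames(const char *kstack, const char *fn)
{
    size_t len = strlen(fn);
    const char *p = kstack;
    int count = 0;

    if (len == 0)
        return 0;
    while ((p = strstr(p, fn)) != NULL) {
        int at_frame_start = (p == kstack) || (p[-1] == ' ');

        if (at_frame_start && p[len] == '+')
            count++;
        p += len;
    }
    return count;
}
```

On the nginx stack quoted above this reports two frames each for `zone_import` and `zone_fetch_slab`. A repeated name alone does not show that the same zone was re-entered; for that, the zone and keg arguments of the frames in a debugger backtrace have to be compared.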
Re: 11.0 stuck on high network load
On Fri, Sep 23, 2016 at 10:16:56PM +0300, Slawa Olhovchenkov wrote: > On Thu, Sep 22, 2016 at 01:20:45PM +0300, Slawa Olhovchenkov wrote: > > > On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote: > > > > > >> These paths can indeed compete for the same INP lock, as both > > > >> tcp_tw_2msl_scan() calls always start with the first inp found in > > > >> twq_2msl list. But in both cases, this first inp should be quickly > > > >> used > > > >> and its lock released anyway, thus that could explain your situation it > > > >> that the TCP stack is doing that all the time, for example: > > > >> > > > >> - Let say that you are running out completely and constantly of tcptw, > > > >> and then all connections transitioning to TIME_WAIT state are competing > > > >> with the TIME_WAIT timeout scan that tries to free all the expired > > > >> tcptw. If the stack is doing that all the time, it can appear like > > > >> "live" locked. > > > >> > > > >> This is just an hypothesis and as usual might be a red herring. > > > >> Anyway, could you run: > > > >> > > > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock' > > > > > > > > ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP > > > > > > > > socket: 864, 4192664, 18604, 25348,49276158, 0, > > > > 0 > > > > tcp_inpcb: 464, 4192664, 34226, 18702,49250593, 0, > > > > 0 > > > > tcpcb: 1040, 4192665, 18424, 18953,49250593, 0, > > > > 0 > > > > tcptw: 88, 16425, 15802, 623,14526919, 8, 0 > > > > tcpreass:40, 32800, 15,2285, 632381, 0, 0 > > > > > > > > In normal case tcptw is about 16425/600/900 > > > > > > > > And after `sysctl -a | grep tcp` system stuck on serial console and I > > > > am reset it. > > > > > > > >> Ideally, once when everything is ok, and once when you have the issue > > > >> to see the differences (if any). 
> > > >> > > > >> If it appears your are quite low in tcptw, and if you have enough > > > >> memory, could you try increase the tcptw limit using sysctl > > > > > > > > I think this is not eliminate stuck, just may do it less frequency > > > > > > You are right, it would just be a big hint that the tcp_tw_2msl_scan() > > > contention hypothesis is the right one. As I see you have plenty of > > > memory on your server, thus could you try with: > > > > > > net.inet.tcp.maxtcptw=4192665 > > > > > > And see what happen. Just to validate this hypothesis. > > > > This is bad way for validate, with maxtcptw=16384 happened is random > > and can be waited for month. After maxtcptw=4192665 I am don't know > > how long need to wait for verification this hypothesis. > > > > More frequency (may be 3-5 times per day) happening less traffic drops > > (not to zero for minutes). May be this caused also by contention in > > tcp_tw_2msl_scan, but fast resolved (stochastic process). By eating > > CPU power nginx can't service connection and clients closed > > connections and need more TIME_WAIT and can trigered > > tcp_tw_2msl_scan(reuse=1). After this we can got live lock. > > > > May be after I learning to catch and dignostic this validation is more > > accurately. > > Some more bits: > > socket: 864, 4192664, 30806, 790,28524160, 0, 0 > ipq: 56, 32802, 0,1278,1022, 0, 0 > udp_inpcb: 464, 4192664, 44, 364, 14066, 0, 0 > udpcb: 32, 4192750, 44,3081, 14066, 0, 0 > tcp_inpcb: 464, 4192664, 38558, 378,28476709, 0, 0 > tcpcb: 1040, 4192665, 30690, 738,28476709, 0, 0 > tcptw: 88, 32805,7868, 772, 8412249, 0, 0 > > last pid: 49575; load averages: 2.00, 2.05, 3.75up 1+01:12:08 > 22:13:42 > 853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock > CPU 0: 0.0% user, 0.0% nice, 0.0% system, 100% interrupt, 0.0% idle > CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle > CPU 2: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle > CPU 3: 0.0
Re: 11.0 stuck on high network load
On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote: > On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > > > > > Hi Slawa, > > > > On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > > > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > > >> You can also use Dtrace and lockstat (especially with the lockstat -s > > >> option): > > >> > > >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > > >> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > > >> > > >> But I am less familiar with Dtrace/lockstat tools. > > > > > > I am still use old kernel and got lockdown again. > > > Try using lockstat (I am save more output), interesting may be next: > > > > > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 > > > events/sec) > > > > > > --- > > > Count indv cuml rcnt nsec Lock Caller > > > > > > 140839 74% 74% 0.0024659 tcpinp > > > tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > > > 4096 | 913 tcp_twstart+0xa3 > > > > > > 8192 | 58191 > > > tcp_do_segment+0x201f > > > 16384 |@@ 29594 tcp_input+0xe1c > > > > > > 32768 | 23447 ip_input+0x15f > > > > > > 65536 |@@@16197 > > > 131072 |@ 8674 > > > 262144 | 3358 > > > 524288 | 456 > > >1048576 | 9 > > > --- > > > Count indv cuml rcnt nsec Lock Caller > > > > > > 49180 26% 100% 0.0015929 tcpinp > > > tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > > > 4096 | 157 pfslowtimo+0x54 > > > > > > 8192 |@@@24796 > > > softclock_call_cc+0x179 > > > 16384 |@@ 11223 softclock+0x44 > > > > > > 32768 | 7426 > > > intr_event_execute_handlers+0x95 > > > 65536 |@@ 3918 > > > 131072 | 1363 > > > 262144 | 278 > > > 524288 | 19 > > > --- > > > > This is interesting, it seems that you have two call paths competing > > for INP locks here: > > > > - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and > > > > - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) > > My current hypothesis: > > nginx do 
write() (or may be close()?) to socket, kernel lock > first inp in V_twq_2msl, happen callout for pfslowtimo() on the same > CPU core and tcp_tw_2msl_scan infinity locked on same inp. > > In this case you modification can't help, before next try we need some > like yeld(). Or may be locks leaks. Or both. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0 stuck on high network load
On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > >> You can also use Dtrace and lockstat (especially with the lockstat -s > >> option): > >> > >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > >> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > >> > >> But I am less familiar with Dtrace/lockstat tools. > > > > I am still use old kernel and got lockdown again. > > Try using lockstat (I am save more output), interesting may be next: > > > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 > > events/sec) > > > > --- > > Count indv cuml rcnt nsec Lock Caller > > > > 140839 74% 74% 0.0024659 tcpinp tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > 4096 | 913 tcp_twstart+0xa3 > > > > 8192 | 58191 tcp_do_segment+0x201f > > > > 16384 |@@ 29594 tcp_input+0xe1c > > > > 32768 | 23447 ip_input+0x15f > > > > 65536 |@@@16197 > > 131072 |@ 8674 > > 262144 | 3358 > > 524288 | 456 > >1048576 | 9 > > --- > > Count indv cuml rcnt nsec Lock Caller > > > > 49180 26% 100% 0.0015929 tcpinp tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > 4096 | 157 pfslowtimo+0x54 > > > > 8192 |@@@24796 > > softclock_call_cc+0x179 > > 16384 |@@ 11223 softclock+0x44 > > > > 32768 | 7426 > > intr_event_execute_handlers+0x95 > > 65536 |@@ 3918 > > 131072 | 1363 > > 262144 | 278 > > 524288 | 19 > > --- > > This is interesting, it seems that you have two call paths competing > for INP locks here: > > - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and > > - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) My current hypothesis: nginx do write() (or may be close()?) to socket, kernel lock first inp in V_twq_2msl, happen callout for pfslowtimo() on the same CPU core and tcp_tw_2msl_scan infinity locked on same inp. 
In this case your modification can't help; before the next try we need something like yield().
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: 11.0 stuck on high network load
On Thu, Sep 22, 2016 at 01:20:45PM +0300, Slawa Olhovchenkov wrote:

> On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:
>
> > >> These paths can indeed compete for the same INP lock, as both
> > >> tcp_tw_2msl_scan() calls always start with the first inp found in
> > >> twq_2msl list. But in both cases, this first inp should be quickly used
> > >> and its lock released anyway, thus that could explain your situation it
> > >> that the TCP stack is doing that all the time, for example:
> > >>
> > >> - Let say that you are running out completely and constantly of tcptw,
> > >> and then all connections transitioning to TIME_WAIT state are competing
> > >> with the TIME_WAIT timeout scan that tries to free all the expired
> > >> tcptw. If the stack is doing that all the time, it can appear like
> > >> "live" locked.
> > >>
> > >> This is just an hypothesis and as usual might be a red herring.
> > >> Anyway, could you run:
> > >>
> > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> > >
> > > ITEM          SIZE    LIMIT     USED     FREE       REQ  FAIL  SLEEP
> > >
> > > socket:        864, 4192664,   18604,   25348, 49276158,    0,    0
> > > tcp_inpcb:     464, 4192664,   34226,   18702, 49250593,    0,    0
> > > tcpcb:        1040, 4192665,   18424,   18953, 49250593,    0,    0
> > > tcptw:          88,   16425,   15802,     623, 14526919,    8,    0
> > > tcpreass:       40,   32800,      15,    2285,   632381,    0,    0
> > >
> > > In normal case tcptw is about 16425/600/900
> > >
> > > And after `sysctl -a | grep tcp` system stuck on serial console and I am
> > > reset it.
> > >
> > >> Ideally, once when everything is ok, and once when you have the issue
> > >> to see the differences (if any).
> > >>
> > >> If it appears your are quite low in tcptw, and if you have enough
> > >> memory, could you try increase the tcptw limit using sysctl
> > >
> > > I think this is not eliminate stuck, just may do it less frequency
> >
> > You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> > contention hypothesis is the right one. As I see you have plenty of
> > memory on your server, thus could you try with:
> >
> > net.inet.tcp.maxtcptw=4192665
> >
> > And see what happen. Just to validate this hypothesis.
>
> This is bad way for validate, with maxtcptw=16384 happened is random
> and can be waited for month. After maxtcptw=4192665 I am don't know
> how long need to wait for verification this hypothesis.
>
> More frequency (may be 3-5 times per day) happening less traffic drops
> (not to zero for minutes). May be this caused also by contention in
> tcp_tw_2msl_scan, but fast resolved (stochastic process). By eating
> CPU power nginx can't service connection and clients closed
> connections and need more TIME_WAIT and can trigered
> tcp_tw_2msl_scan(reuse=1). After this we can got live lock.
>
> May be after I learning to catch and dignostic this validation is more
> accurately.

Some more bits:

socket:        864, 4192664,   30806,     790, 28524160,    0,    0
ipq:            56,   32802,       0,    1278,     1022,    0,    0
udp_inpcb:     464, 4192664,      44,     364,    14066,    0,    0
udpcb:          32, 4192750,      44,    3081,    14066,    0,    0
tcp_inpcb:     464, 4192664,   38558,     378, 28476709,    0,    0
tcpcb:        1040, 4192665,   30690,     738, 28476709,    0,    0
tcptw:          88,   32805,    7868,     772,  8412249,    0,    0

last pid: 49575;  load averages:  2.00,  2.05,  3.75    up 1+01:12:08  22:13:42
853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
CPU 0:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system,  0.0% interr
Re: zvol clone diffs
On Thu, Sep 22, 2016 at 04:56:53PM +0500, Eugene M. Zheganin wrote:

> Hi.
>
> I should mention from the start that this is a question about an
> engineering task, not a question about FreeBSD issue.
>
> I have a set of zvol clones that I redistribute over iSCSI. Several
> Windows VMs use these clones as disks via their embedded iSCSI
> initiators (each clone represents a disk with an NTFS partition, is
> imported as a "foreign" disk and functions just fine). From my opinion,
> they should not have any need to do additional writes on these clones
> (each VM should only read data, from my point of view). But zfs shows
> they do, and sometimes they write a lot of data, so clearly facts and
> expectations differ a lot - obviously I didn't take something into
> account.

Maybe it is the NTFS equivalent of atime (last-access time updates)?
http://serverfault.com/questions/33932/how-do-you-disable-the-last-accessed-attribute-on-ntfs-windows
Re: 11.0 stuck on high network load
On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:

> >> These paths can indeed compete for the same INP lock, as both
> >> tcp_tw_2msl_scan() calls always start with the first inp found in
> >> twq_2msl list. But in both cases, this first inp should be quickly used
> >> and its lock released anyway, thus that could explain your situation it
> >> that the TCP stack is doing that all the time, for example:
> >>
> >> - Let say that you are running out completely and constantly of tcptw,
> >> and then all connections transitioning to TIME_WAIT state are competing
> >> with the TIME_WAIT timeout scan that tries to free all the expired
> >> tcptw. If the stack is doing that all the time, it can appear like
> >> "live" locked.
> >>
> >> This is just an hypothesis and as usual might be a red herring.
> >> Anyway, could you run:
> >>
> >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> >
> > ITEM          SIZE    LIMIT     USED     FREE       REQ  FAIL  SLEEP
> >
> > socket:        864, 4192664,   18604,   25348, 49276158,    0,    0
> > tcp_inpcb:     464, 4192664,   34226,   18702, 49250593,    0,    0
> > tcpcb:        1040, 4192665,   18424,   18953, 49250593,    0,    0
> > tcptw:          88,   16425,   15802,     623, 14526919,    8,    0
> > tcpreass:       40,   32800,      15,    2285,   632381,    0,    0
> >
> > In normal case tcptw is about 16425/600/900
> >
> > And after `sysctl -a | grep tcp` system stuck on serial console and I am
> > reset it.
> >
> >> Ideally, once when everything is ok, and once when you have the issue
> >> to see the differences (if any).
> >>
> >> If it appears your are quite low in tcptw, and if you have enough
> >> memory, could you try increase the tcptw limit using sysctl
> >
> > I think this is not eliminate stuck, just may do it less frequency
>
> You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> contention hypothesis is the right one. As I see you have plenty of
> memory on your server, thus could you try with:
>
> net.inet.tcp.maxtcptw=4192665
>
> And see what happen. Just to validate this hypothesis.

This is a poor way to validate it: with maxtcptw=16384 the hang occurs
at random and can take a month to appear. With maxtcptw=4192665 I don't
know how long I would need to wait to verify the hypothesis.

Lesser traffic drops (not to zero for minutes) happen more often, maybe
3-5 times per day. Maybe these are also caused by contention in
tcp_tw_2msl_scan, but resolve quickly (a stochastic process). When the
contention eats CPU, nginx can't service connections, clients close
their connections, more TIME_WAIT entries are needed, and that can
trigger tcp_tw_2msl_scan(reuse=1). After this we can get a livelock.

Maybe once I learn how to catch and diagnose this, the validation will
be more accurate.
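For reference, how close the tcptw zone is to its limit can be read straight off the `vmstat -z` row quoted above (USED divided by LIMIT). The following is only a minimal sketch: the struct and function names (`zone_row`, `zone_row_parse`, `zone_row_used_pct`) are made up for illustration and are not part of FreeBSD, and it assumes the row layout shown in this thread ("name: size, limit, used, free, req, fail, sleep").

```c
#include <assert.h>
#include <stdio.h>

/* One row of `vmstat -z` output; fields follow the header quoted above.
 * Hypothetical helper type, not a FreeBSD structure. */
struct zone_row {
    char name[32];
    long size, limit, used, free_items, req, fail, sleep_cnt;
};

/* Parse "name: size, limit, used, free, req, fail, sleep".
 * Returns 0 on success, -1 if the line does not have all eight fields. */
int zone_row_parse(const char *line, struct zone_row *z)
{
    assert(line != NULL && z != NULL);
    int n = sscanf(line, " %31[^:]: %ld , %ld , %ld , %ld , %ld , %ld , %ld",
                   z->name, &z->size, &z->limit, &z->used,
                   &z->free_items, &z->req, &z->fail, &z->sleep_cnt);
    return n == 8 ? 0 : -1;
}

/* Percentage of the limit currently in use, rounded down.
 * Returns -1 when the zone reports no limit (LIMIT of 0). */
long zone_row_used_pct(const struct zone_row *z)
{
    if (z->limit <= 0)
        return -1;
    return z->used * 100 / z->limit;
}
```

Fed the tcptw row from the stuck system (LIMIT 16425, USED 15802, FAIL 8), this reports the zone as roughly 96% used, with a non-zero FAIL count, which is the situation the exhaustion hypothesis describes.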
Re: 11.0 stuck on high network load
On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: > > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > >> You can also use Dtrace and lockstat (especially with the lockstat -s > >> option): > >> > >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > >> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > >> > >> But I am less familiar with Dtrace/lockstat tools. > > > > I am still use old kernel and got lockdown again. > > Try using lockstat (I am save more output), interesting may be next: > > > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 > > events/sec) > > > > --- > > Count indv cuml rcnt nsec Lock Caller > > > > 140839 74% 74% 0.0024659 tcpinp tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > 4096 | 913 tcp_twstart+0xa3 > > > > 8192 | 58191 tcp_do_segment+0x201f > > > > 16384 |@@ 29594 tcp_input+0xe1c > > > > 32768 | 23447 ip_input+0x15f > > > > 65536 |@@@16197 > > 131072 |@ 8674 > > 262144 | 3358 > > 524288 | 456 > >1048576 | 9 > > --- > > Count indv cuml rcnt nsec Lock Caller > > > > 49180 26% 100% 0.0015929 tcpinp tcp_tw_2msl_scan+0xc6 > > > > > > nsec -- Time Distribution -- count Stack > > > > 4096 | 157 pfslowtimo+0x54 > > > > 8192 |@@@24796 > > softclock_call_cc+0x179 > > 16384 |@@ 11223 softclock+0x44 > > > > 32768 | 7426 > > intr_event_execute_handlers+0x95 > > 65536 |@@ 3918 > > 131072 | 1363 > > 262144 | 278 > > 524288 | 19 > > --- > > This is interesting, it seems that you have two call paths competing > for INP locks here: > > - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and > > - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) I think same. > These paths can indeed compete for the same INP lock, as both > tcp_tw_2msl_scan() calls always start with the first inp found in > twq_2msl list. 
But in both cases, this first inp should be quickly used > and its lock released anyway, thus that could explain your situation it > that the TCP stack is doing that all the time, for example: > > - Let say that you are running out completely and constantly of tcptw, > and then all connections transitioning to TIME_WAIT state are competing > with the TIME_WAIT timeout scan that tries to free all the expired > tcptw. If the stack is doing that all the time, it can appear like > "live" locked. > > This is just an hypothesis and as usual might be a red herring. > Anyway, could you run: > > $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock' ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP socket: 864, 4192664, 18604, 25348,49276158, 0, 0 tcp_inpcb: 464, 4192664, 34226, 18702,49250593, 0, 0 tcpcb: 1040, 4192665, 18424, 18953,49250593, 0, 0 tcptw: 88, 16425, 15802, 623,14526919, 8, 0 tcpreass:40, 32800, 15,2285, 632381, 0, 0 In normal case tcptw is about 16425/600/900 And after `sysctl -a | grep tcp` system stuck on serial console and I am reset it. > Ideally, once when everything is ok, and once when you have the issue > to see the differences (if any). > > If it appears your are quite low in tcptw, and if you have enough >
Re: 11.0 stuck on high network load
On Thu, Sep 22, 2016 at 11:28:38AM +0200, Julien Charbon wrote:

> >>> What purpose to not skip locked tcptw in this loop?
> >>
> >> If I understand your question correctly: According to your pmcstat
> >> result, tcp_tw_2msl_scan() currently struggles with a write lock
> >> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
> >> INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently.
> >
> > As I see in code, tcp_tw_2msl_scan got first node from V_twq_2msl and
> > need got RW lock on inp w/o alternates. Can tcp_tw_2msl_scan skip
> > current node and go to next node in V_twq_2msl list if current node
> > locked by some reasson?
>
> Interesting question indeed: It is not optimal that all simultaneous
> calls to tcp_tw_2msl_scan() compete for the same oldest tcptw. The next
> tcptws in the list are certainly old enough also.
>
> Let me see if I can make a simple change that makes kernel threads
> calling tcp_tw_2msl_scan() at same time to work on a different old
> enough tcptws. So far, I found only solutions quite complex to implement.

A simple solution, if I understand you correctly, is for each thread to
skip as many elements as its current CPU number at the start, and then
to advance ncpu elements at a time.
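The two ideas in this message (skip an entry whose lock is busy instead of spinning on it, and let each CPU start at its own offset and step by ncpu) can be illustrated in userspace. This is only a sketch under stated assumptions and not the kernel code: `struct tw_entry`, `tw_init`, `tw_trylock`, `tw_unlock` and `tw_scan` are hypothetical names, a C11 `atomic_flag` stands in for the inp write lock, and the queue is modelled as an array ordered oldest-first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one TIME_WAIT entry; NOT the kernel's struct tcptw. */
struct tw_entry {
    atomic_flag lock;    /* stands in for the per-inp write lock */
    int         expires; /* tick at which the entry may be freed; set once */
    bool        freed;   /* only read or written while holding lock */
};

void tw_init(struct tw_entry *e, int expires)
{
    atomic_flag_clear(&e->lock);
    e->expires = expires;
    e->freed = false;
}

/* Non-blocking acquire: returns true only if the lock was free. */
bool tw_trylock(struct tw_entry *e)
{
    return !atomic_flag_test_and_set(&e->lock);
}

void tw_unlock(struct tw_entry *e)
{
    atomic_flag_clear(&e->lock);
}

/*
 * Scan a queue ordered oldest-first and free expired entries.
 * An entry whose lock is held elsewhere is skipped, not waited for.
 * start/stride let each caller work on its own slice of the queue:
 * (0, 1) visits every entry, (cpu, ncpu) visits every ncpu-th entry.
 * Returns the number of entries freed by this call.
 */
size_t tw_scan(struct tw_entry *q, size_t n, int now, size_t start, size_t stride)
{
    size_t freed = 0;

    assert(stride > 0);
    for (size_t i = start; i < n; i += stride) {
        struct tw_entry *e = &q[i];

        if (e->expires > now)
            break;          /* ordered queue: everything after is younger */
        if (!tw_trylock(e))
            continue;       /* busy elsewhere: skip instead of spinning */
        if (!e->freed) {
            e->freed = true;
            freed++;
        }
        tw_unlock(e);
    }
    return freed;
}
```

With `tw_scan(q, n, now, 0, 1)` a scanner simply walks past a busy head entry; with `tw_scan(q, n, now, cpu, ncpu)` concurrent scanners never start on the same entry at all. Whether either variant is safe against the real twq_2msl list and its locking order is exactly the open question in the thread.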
Re: nginx and FreeBSD11
On Thu, Sep 22, 2016 at 11:53:20AM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 11:34:24AM +0300, Slawa Olhovchenkov wrote:
> > On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote:
> >
> > > On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote:
> > > > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote:
> > > > > Below is, I believe, the committable fix, of course supposing that
> > > > > the patch above worked. If you want to retest it on stable/11, ignore
> > > > > efirt.c chunks.
> > > >
> > > > and remove patch w/ spinlock?
> > > Yes.
> >
> > What you prefer now -- I am test spinlock patch or this patch?
> > For success in any case need wait 2-3 days.
>
> If you already run previous (spinlock) version for 1 day, then finish
> with it. I am confident that spinlock version results are indicative for
> the refined patch as well.
>
> If you did not applied the spinlock variant at all, there is no reason to
> spend efforts on it, use the patch I sent today.

No, I did not apply the spinlock variant at all.
OK, I will try this patch.
Do you still need the first 100 lines from the verbose boot?
Re: nginx and FreeBSD11
On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote:
> > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote:
> > > Below is, I believe, the committable fix, of course supposing that
> > > the patch above worked. If you want to retest it on stable/11, ignore
> > > efirt.c chunks.
> >
> > and remove patch w/ spinlock?
> Yes.

Which do you prefer now: should I test the spinlock patch or this patch?
Either way, confirming success needs a wait of 2-3 days.
Re: nginx and FreeBSD11
On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote: > On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > > > > index a23468e..f754652 100644 > > > > --- a/sys/vm/vm_map.c > > > > +++ b/sys/vm/vm_map.c > > > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > if (oldvm == newvm) > > > > return; > > > > > > > > + spinlock_enter(); > > > > /* > > > > * Point to the new address space and refer to it. > > > > */ > > > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > > > > > /* Activate the new mapping. */ > > > > pmap_activate(curthread); > > > > + spinlock_exit(); > > > > > > > > /* Remove the daemon's reference to the old address space. */ > > > > KASSERT(oldvm->vm_refcnt > 1, > Did you tested the patch ? I am now installed it. For success test need 2-3 days. If test failed result may be quickly. > Below is, I believe, the committable fix, of course supposing that > the patch above worked. If you want to retest it on stable/11, ignore > efirt.c chunks. and remove patch w/ spinlock? > diff --git a/sys/amd64/amd64/efirt.c b/sys/amd64/amd64/efirt.c > index f1d67f7..c883af8 100644 > --- a/sys/amd64/amd64/efirt.c > +++ b/sys/amd64/amd64/efirt.c > @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include > #include > #include > #include > @@ -301,6 +302,17 @@ efi_enter(void) > PMAP_UNLOCK(curpmap); > return (error); > } > + > + /* > + * IPI TLB shootdown handler invltlb_pcid_handler() reloads > + * %cr3 from the curpmap->pm_cr3, which would disable runtime > + * segments mappings. Block the handler's action by setting > + * curpmap to impossible value. See also comment in > + * pmap.c:pmap_activate_sw(). > + */ > + if (pmap_pcid_enabled && !invpcid_works) > + PCPU_SET(curpmap, NULL); > + > load_cr3(VM_PAGE_TO_PHYS(efi_pml4_page) | (pmap_pcid_enabled ? 
> curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); > /* > @@ -317,7 +329,9 @@ efi_leave(void) > { > pmap_t curpmap; > > - curpmap = PCPU_GET(curpmap); > + curpmap = >p_vmspace->vm_pmap; > + if (pmap_pcid_enabled && !invpcid_works) > + PCPU_SET(curpmap, curpmap); > load_cr3(curpmap->pm_cr3 | (pmap_pcid_enabled ? > curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); > if (!pmap_pcid_enabled) > diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > index 63042e4..59e1b67 100644 > --- a/sys/amd64/amd64/pmap.c > +++ b/sys/amd64/amd64/pmap.c > @@ -6842,6 +6842,7 @@ pmap_activate_sw(struct thread *td) > { > pmap_t oldpmap, pmap; > uint64_t cached, cr3; > + register_t rflags; > u_int cpuid; > > oldpmap = PCPU_GET(curpmap); > @@ -6865,16 +6866,43 @@ pmap_activate_sw(struct thread *td) > pmap == kernel_pmap, > ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x", > td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); > + > + /* > + * If the INVPCID instruction is not available, > + * invltlb_pcid_handler() is used for handle > + * invalidate_all IPI, which checks for curpmap == > + * smp_tlb_pmap. Below operations sequence has a > + * window where %CR3 is loaded with the new pmap's > + * PML4 address, but curpmap value is not yet updated. > + * This causes invltlb IPI handler, called between the > + * updates, to execute as NOP, which leaves stale TLB > + * entries. > + * > + * Note that the most typical use of > + * pmap_activate_sw(), from the context switch, is > + * immune to this race, because interrupts are > + * disabled (while the thread lock is owned), and IPI > + * happends after curpmap is updated. Protect other > + * callers in a similar way, by disabling interrupts > + * around the %cr3 register reload and curpmap > + * assignment. 
> + */ > + if (!invpcid_works) > + rflags = intr_disable(); > + > if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) { > load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid | > cached); > if (cached) > PCPU_INC(pm_save_cnt); > } > + PCPU_SET(curpmap, pmap); > + if (!invpcid_works) > + intr_restore(rflags); > } else if (cr3 != pmap->pm_cr3) { > load_cr3(pmap->pm_cr3); > + PCPU_SET(curpmap, pmap); > } > -
Re: 11.0 stuck on high network load
On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > > You can also use Dtrace and lockstat (especially with the lockstat -s > option): > > https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks > https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE > > But I am less familiar with Dtrace/lockstat tools. I am still use old kernel and got lockdown again. Try using lockstat (I am save more output), interesting may be next: R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec) --- Count indv cuml rcnt nsec Lock Caller 140839 74% 74% 0.0024659 tcpinp tcp_tw_2msl_scan+0xc6 nsec -- Time Distribution -- count Stack 4096 | 913 tcp_twstart+0xa3 8192 | 58191 tcp_do_segment+0x201f 16384 |@@ 29594 tcp_input+0xe1c 32768 | 23447 ip_input+0x15f 65536 |@@@16197 131072 |@ 8674 262144 | 3358 524288 | 456 1048576 | 9 --- Count indv cuml rcnt nsec Lock Caller 49180 26% 100% 0.0015929 tcpinp tcp_tw_2msl_scan+0xc6 nsec -- Time Distribution -- count Stack 4096 | 157 pfslowtimo+0x54 8192 |@@@24796 softclock_call_cc+0x179 16384 |@@ 11223 softclock+0x44 32768 | 7426 intr_event_execute_handlers+0x95 65536 |@@ 3918 131072 | 1363 262144 | 278 524288 | 19 --- > >> #1. Try above kernel options at least once, and see what you can get. > > > > OK, I am try this after some time. > > > >> #2. If #1 is a total failure try below patch: It won't solve anything, > >> it just makes tcp_tw_2msl_scan() less greedy when there is contention on > >> the INP write lock. If it makes the debugging more feasible, continue > >> to #3. > > > > OK, thanks. > > What purpose to not skip locked tcptw in this loop? > > If I understand your question correctly: According to your pmcstat > result, tcp_tw_2msl_scan() currently struggles with a write lock > (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is > INP_WLOCK. No sign of contention on TW_RLOCK(V_tw_lock) currently. 
>
> 51.86%  [2413083]    lock_delay @ /boot/kernel.VSTREAM/kernel
>  100.0%  [2413083]     __rw_wlock_hard
>   100.0%  [2413083]      tcp_tw_2msl_scan
>
> --
> Julien
Re: 11.0 stuck on high network load
On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 9/20/16 10:26 PM, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote: > >> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote: > >>> On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote: > >>>> > >>>>> @ CPU_CLK_UNHALTED_CORE [4653445 samples] > >>>>> > >>>>> 51.86% [2413083] lock_delay @ /boot/kernel.VSTREAM/kernel > >>>>> 100.0% [2413083] __rw_wlock_hard > >>>>> 100.0% [2413083]tcp_tw_2msl_scan > >>>>>99.99% [2412958] pfslowtimo > >>>>> 100.0% [2412958] softclock_call_cc > >>>>> 100.0% [2412958] softclock > >>>>> 100.0% [2412958]intr_event_execute_handlers > >>>>>100.0% [2412958] ithread_loop > >>>>> 100.0% [2412958] fork_exit > >>>>>00.01% [125] tcp_twstart > >>>>> 100.0% [125] tcp_do_segment > >>>>> 100.0% [125] tcp_input > >>>>> 100.0% [125]ip_input > >>>>>100.0% [125] swi_net > >>>>> 100.0% [125] intr_event_execute_handlers > >>>>> 100.0% [125] ithread_loop > >>>>> 100.0% [125]fork_exit > >>>> > >>>> The only write lock tcp_tw_2msl_scan() tries to get is a > >>>> INP_WLOCK(inp). Thus here, tcp_tw_2msl_scan() seems to be stuck > >>>> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls > >>>> tcp_tw_2msl_scan() at high rate but this will be quite unexpected). > >>>> > >>>> Thus my hypothesis is that something is holding the INP_WLOCK and not > >>>> releasing it, and tcp_tw_2msl_scan() is spinning on it. > >>>> > >>>> If you can, could you compile the kernel with below options: > >>>> > >>>> optionsDDB # Support DDB. 
> >>>> optionsDEADLKRES # Enable the deadlock resolver > >>>> optionsINVARIANTS # Enable calls of extra sanity > >>>> checking > >>>> optionsINVARIANT_SUPPORT # Extra sanity checks of internal > >>>> structures, required by INVARIANTS > >>>> optionsWITNESS # Enable checks to detect > >>>> deadlocks and cycles > >>>> optionsWITNESS_SKIPSPIN# Don't run witness on spinlocks > >>>> for speed > >>> > >>> Currently this host run with 100% CPU load (on all cores), i.e. > >>> enabling WITNESS will be significant drop performance. > >>> Can I use only some subset of options? > >>> > >>> Also, I can some troubles to DDB enter in this case. > >>> May be kgdb will be success (not tryed yet)? > >> > >> If these kernel options will certainly slow down your kernel, they also > >> might found the root cause of your issue before reaching the point where > >> you have 100% cpu load on all cores (thanks to INVARIANTS). I would > >> suggest: > > > > Hmmm, may be I am not clarified. > > This host run at peak hours with 100% CPU load as normal operation, > > this is for servering 2x10G, this is CPU load not result of lock > > issuse, this is not us case. And this is because I am fear to enable > > WITNESS -- I am fear drop performance. > > > > This lock issuse happen irregulary and may be caused by other issuse > > (nginx crashed). In this case about 1/3 cores have 100% cpu load, > > perhaps by this lock -- I am can trace only from one core and need > > more then hour for this (may be on other cores different trace, I > > can't guaranted anything). > > I see, especially if you are running in production WITNESS might indeed > be not practical for you. 
In this case, I would suggest before doing > WITNESS and still get more information to: > > #0: Do a lock profiling: > > https://www.freebsd.org/cgi/man.cgi?query=LOCK_PROFILING > > options LOCK_PROFILING > > Example of usage: > > # Run > $ sudo sysctl debug.lock.prof.enable=1 > $ sleep 10 > $ sudo sysctl debug.lock.prof.enable=0 > > # Get results > $ sysctl debug.lock.
Re: nginx and FreeBSD11
On Tue, Sep 20, 2016 at 04:00:10PM -0600, Warner Losh wrote: > >> > > Is this sandy bridge ? > >> > > >> > Sandy Bridge EP > >> > > >> > > Show me first 100 lines of the verbose dmesg, > >> > > >> > After day or two, after end of this test run -- I am need to enable > >> > verbose. > >> > > >> > > I want to see cpu features lines. In particular, does you CPU support > >> > > the INVPCID feature. > >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7 > >> > > >> > Features=0xbfebfbff> >> > > >> > Features2=0x1fbee3ff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x1 > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID > >> > TSC: P-state invariant, performance statistics > >> > > >> > I am don't see this feature before E5v3: > >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 > >> > > >> > Features=0xbfebfbff > >> > > >> > Features2=0x7fbee3ff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x1 > >> > Structured Extended Features=0x281 > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > >> > TSC: P-state invariant, performance statistics > >> > > >> > (don't run 11.0 on this CPU) > >> Ok. 
> >> > >> > > >> > CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU) > >> > Origin="GenuineIntel" Id=0x306f2 Family=0x6 Model=0x3f Stepping=2 > >> > > >> > Features=0xbfebfbff > >> > > >> > Features2=0x7ffefbff > >> > AMD Features=0x2c100800 > >> > AMD Features2=0x21 > >> > Structured Extended > >> > Features=0x37ab > >> > XSAVE Features=0x1 > >> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > >> > TSC: P-state invariant, performance statistics > >> > > >> > (11.0 run w/o this issuse) > >> Do you mean that similarly configured nginx+aio do not demonstrate the > >> corruption on this machine ? > > > > Yes. > > But different storage configuration and different pattern load. > > > > Also 11.0 run w/o this issuse on > > > > CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x406f1 Family=0x6 Model=0x4f Stepping=1 > > > > Features=0xbfebfbff > > > > Features2=0x7ffefbff > > AMD Features=0x2c100800 > > AMD Features2=0x121 > > Structured Extended > > Features=0x21cbfbb > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr > > TSC: P-state invariant, performance statistics > > > > PS: all systems is dual-cpu. > > Does this mean 2 cores or two sockets? We've seen a similar hang with > the following CPU: two sockets. not sure how this impotant, just for record. you system also w/o INVPCID feature (as kib question). may be you case also will be resolved by vm.pmap.pcid_enabled=0? > CPU: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2700.06-MHz K8-class CPU) > Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4 > > Features=0xbfebfbff > > Features2=0x7fbee3ff > AMD Features=0x2c100800 > AMD Features2=0x1 > Structured Extended
Re: nginx and FreeBSD11
On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > On Tue, Sep 20, 2016 at 11:38:54PM +0300, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: > > > > > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > > > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > > > > > > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > > > > > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > > > > > some reason. > > > > > > > > > > > > > > I am try using next DTrace script: > > > > > > > > > > > > > > #pragma D option dynvarsize=64m > > > > > > > > > > > > > > int req[struct vmspace *, void *]; > > > > > > > self int trace; > > > > > > > > > > > > > > syscall:freebsd:aio_read:entry > > > > > > > { > > > > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct > > > > > > > aiocb)); > > > > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = > > > > > > > curthread->td_proc->p_pid; > > > > > > > } > > > > > > > > > > > > > > fbt:kernel:aio_process_rw:entry > > > > > > > { > > > > > > > self->job = args[0]; > > > > > > > self->trace = 1; > > > > > > > } > > > > > > > > > > > > > > fbt:kernel:aio_process_rw:return > > > > > > > /self->trace/ > > > > > > > { > > > > > > > req[self->job->userproc->p_vmspace, > > > > > > > self->job->uaiocb.aio_buf] = 0; > > > > > > > self->job = 0; > > > > > > > self->trace = 0; > > > > > > > } > > > > > > > > > > > > > > fbt:kernel:vn_io_fault:entry > > > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, > > > > > > > args[1]->uio_iov[0].iov_base]/ > > > > > > > { > > > > > > > this->buf = args[1]->uio_iov[0].iov_base; > > > > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, > > > > > > > curthread->td_proc->p_vmspace, this->buf, > > > > > > > req[curthread->td_proc->p_vmspace, this->buf]); > > > > > > > } > > > > > > > === > > > > > > > > > > > > > > 
And don't got any messages near nginx core dump. > > > > > > > What I can check next? > > > > > > > May be check context/address space switch for kernel process? > > > > > > > > > > > > Which CPU are you using? > > > > > > > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class > > > > > CPU) > > > Is this sandy bridge ? > > > > Sandy Bridge EP > > > > > Show me first 100 lines of the verbose dmesg, > > > > After day or two, after end of this test run -- I am need to enable verbose. > > > > > I want to see cpu features lines. In particular, does you CPU support > > > the INVPCID feature. > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU) > > Origin="GenuineIntel" Id=0x206d7 Family=0x6 Model=0x2d Stepping=7 > > > > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> > > > > Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX> > > AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM> > > AMD Features2=0x1 > > XSAVE Features=0x1 > > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID > > TSC: P-state invariant, performance statistics > > > > I am don't see this feature before E5v3: > > > > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU) > >
Re: nginx and FreeBSD11
On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote: > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote: > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote: > > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote: > > > > > > > > > If this panics, then vmspace_switch_aio() is not working for > > > > > > some reason. > > > > > > > > > > I am try using next DTrace script: > > > > > > > > > > #pragma D option dynvarsize=64m > > > > > > > > > > int req[struct vmspace *, void *]; > > > > > self int trace; > > > > > > > > > > syscall:freebsd:aio_read:entry > > > > > { > > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct > > > > > aiocb)); > > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = > > > > > curthread->td_proc->p_pid; > > > > > } > > > > > > > > > > fbt:kernel:aio_process_rw:entry > > > > > { > > > > > self->job = args[0]; > > > > > self->trace = 1; > > > > > } > > > > > > > > > > fbt:kernel:aio_process_rw:return > > > > > /self->trace/ > > > > > { > > > > > req[self->job->userproc->p_vmspace, > > > > > self->job->uaiocb.aio_buf] = 0; > > > > > self->job = 0; > > > > > self->trace = 0; > > > > > } > > > > > > > > > > fbt:kernel:vn_io_fault:entry > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, > > > > > args[1]->uio_iov[0].iov_base]/ > > > > > { > > > > > this->buf = args[1]->uio_iov[0].iov_base; > > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, > > > > > curthread->td_proc->p_vmspace, this->buf, > > > > > req[curthread->td_proc->p_vmspace, this->buf]); > > > > > } > > > > > === > > > > > > > > > > And don't got any messages near nginx core dump. > > > > > What I can check next? > > > > > May be check context/address space switch for kernel process? > > > > > > > > Which CPU are you using? > > > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU) > Is this sandy bridge ? 

Sandy Bridge EP

> Show me first 100 lines of the verbose dmesg,

In a day or two, once this test run has finished; I need to enable verbose boot first.

> I want to see cpu features lines.  In particular, does you CPU support
> the INVPCID feature.

CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

I don't see this feature on anything before E5v3:

CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

(I don't run 11.0 on this CPU)

CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
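
As a cross-check of the feature lists above, the two bits in question can be tested directly against the hex words that dmesg prints. The sketch below assumes the usual FreeBSD mapping, where "Features2" is CPUID leaf 1 ECX (PCID is bit 17) and "Structured Extended Features" is CPUID leaf 7 EBX (INVPCID is bit 10). The helper names has_pcid() and has_invpcid() are made up for illustration and are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as documented for the CPUID instruction (assumed mapping
 * to the dmesg words: "Features2" = leaf 1 ECX, "Structured Extended
 * Features" = leaf 7 EBX). */
#define CPUID_1_ECX_PCID     (UINT32_C(1) << 17)
#define CPUID_7_EBX_INVPCID  (UINT32_C(1) << 10)

/* Hypothetical helper: true if the "Features2" word advertises PCID. */
static bool
has_pcid(uint32_t features2)
{
	return ((features2 & CPUID_1_ECX_PCID) != 0);
}

/* Hypothetical helper: true if the "Structured Extended Features" word
 * advertises INVPCID.  Pass 0 when dmesg prints no such line. */
static bool
has_invpcid(uint32_t stdext_features)
{
	return ((stdext_features & CPUID_7_EBX_INVPCID) != 0);
}
```

With the values quoted above, has_pcid(0x1fbee3ff) and has_pcid(0x7fbee3ff) are both true, while has_invpcid(0x281) is false. That agrees with the E5-2650 v2 listing: PCID is present, INVPCID is not.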
Re: 11.0 stuck on high network load
On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote:

>
>  Hi Slawa,
>
> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote:
> > On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote:
> >>
> >>> @ CPU_CLK_UNHALTED_CORE [4653445 samples]
> >>>
> >>> 51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
> >>>  100.0%  [2413083]   __rw_wlock_hard
> >>>   100.0%  [2413083]    tcp_tw_2msl_scan
> >>>    99.99%  [2412958]     pfslowtimo
> >>>     100.0%  [2412958]      softclock_call_cc
> >>>      100.0%  [2412958]       softclock
> >>>       100.0%  [2412958]        intr_event_execute_handlers
> >>>        100.0%  [2412958]         ithread_loop
> >>>         100.0%  [2412958]          fork_exit
> >>>    00.01%  [125]         tcp_twstart
> >>>     100.0%  [125]          tcp_do_segment
> >>>      100.0%  [125]           tcp_input
> >>>       100.0%  [125]            ip_input
> >>>        100.0%  [125]             swi_net
> >>>         100.0%  [125]              intr_event_execute_handlers
> >>>          100.0%  [125]               ithread_loop
> >>>           100.0%  [125]                fork_exit
> >>
> >>  The only write lock tcp_tw_2msl_scan() tries to get is a
> >> INP_WLOCK(inp).  Thus here, tcp_tw_2msl_scan() seems to be stuck
> >> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls
> >> tcp_tw_2msl_scan() at high rate but this will be quite unexpected).
> >>
> >>  Thus my hypothesis is that something is holding the INP_WLOCK and not
> >> releasing it, and tcp_tw_2msl_scan() is spinning on it.
> >>
> >>  If you can, could you compile the kernel with below options:
> >>
> >> options        DDB                # Support DDB.
> >> options        DEADLKRES          # Enable the deadlock resolver
> >> options        INVARIANTS         # Enable calls of extra sanity checking
> >> options        INVARIANT_SUPPORT  # Extra sanity checks of internal structures, required by INVARIANTS
> >> options        WITNESS            # Enable checks to detect deadlocks and cycles
> >> options        WITNESS_SKIPSPIN   # Don't run witness on spinlocks for speed
> >
> > Currently this host run with 100% CPU load (on all cores), i.e.
> > enabling WITNESS will be significant drop performance.
> > Can I use only some subset of options?
> >
> > Also, I can some troubles to DDB enter in this case.
> > May be kgdb will be success (not tryed yet)?
>
>  If these kernel options will certainly slow down your kernel, they also
> might found the root cause of your issue before reaching the point where
> you have 100% cpu load on all cores (thanks to INVARIANTS).  I would
> suggest:

Hmm, maybe I was not clear. This host runs at 100% CPU load during peak
hours as normal operation (it is serving 2x10G). That CPU load is not a
result of the lock issue; that is not our case. This is why I am afraid to
enable WITNESS: I fear the performance drop.

The lock issue happens irregularly and may be caused by another issue
(nginx crashing). When it happens, about 1/3 of the cores are at 100% CPU
load, perhaps because of this lock. I can trace only one core, and that
takes more than an hour (the trace may be different on other cores, I
can't guarantee anything).

>  #1. Try above kernel options at least once, and see what you can get.

OK, I will try this after some time.

>  #2. If #1 is a total failure try below patch:  It won't solve anything,
> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> the INP write lock.  If it makes the debugging more feasible, continue
> to #3.

OK, thanks.
What is the purpose of not skipping a locked tcptw in this loop?

> diff --git a/sys/netinet/tcp_timewait.c b/sys/netinet/tcp_timewait.c
> index a8b78f9..4206ea3 100644
> --- a/sys/netinet/tcp_timewait.c
> +++ b/sys/netinet/tcp_timewait.c
> @@ -701,34 +701,42 @@ tcp_tw_2msl_scan(int reuse)
>                 in_pcbref(inp);
>                 TW_RUNLOCK(V_tw_lock);
>
> +retry:
>                 if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
>
> -                       INP_WLOCK(inp);
> -                       tw = intotw(inp);
> -                       if (in_pcbrele_wlocked(inp)) {
> -                               KASSERT(tw == NULL, ("%s: held last inp "
> -                                   "reference but tw not NULL", __func__));
> -                               INP_INFO_RUNLOCK(&V_tcbinfo);
> -
Re: nginx and FreeBSD11
On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote:

> On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote:
>
> > > > If this panics, then vmspace_switch_aio() is not working for
> > > > some reason.
> > >
> > > I am try using next DTrace script:
> > >
> > > #pragma D option dynvarsize=64m
> > >
> > > int req[struct vmspace *, void *];
> > > self int trace;
> > >
> > > syscall:freebsd:aio_read:entry
> > > {
> > >         this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > >         req[curthread->td_proc->p_vmspace, this->aio.aio_buf] =
> > >             curthread->td_proc->p_pid;
> > > }
> > >
> > > fbt:kernel:aio_process_rw:entry
> > > {
> > >         self->job = args[0];
> > >         self->trace = 1;
> > > }
> > >
> > > fbt:kernel:aio_process_rw:return
> > > /self->trace/
> > > {
> > >         req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
> > >         self->job = 0;
> > >         self->trace = 0;
> > > }
> > >
> > > fbt:kernel:vn_io_fault:entry
> > > /self->trace && !req[curthread->td_proc->p_vmspace,
> > >     args[1]->uio_iov[0].iov_base]/
> > > {
> > >         this->buf = args[1]->uio_iov[0].iov_base;
> > >         printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp,
> > >             curthread->td_proc->p_vmspace, this->buf,
> > >             req[curthread->td_proc->p_vmspace, this->buf]);
> > > }
> > > ===
> > >
> > > And don't got any messages near nginx core dump.
> > > What I can check next?
> > > May be check context/address space switch for kernel process?
> >
> > Which CPU are you using?
>
> CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
>
> > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > loader prompt or loader.conf)?  (Wondering if pmap_activate() is somehow
> > not switching)

I need some more time to test (a day or two), but for now this looks like a
workaround/solution: 12h of runtime, including peak hour, without an nginx
crash (vm.pmap.pcid_enabled=0 in loader.conf).