from:"Slawa Olhovchenkov"

Re: Suspected mbuf leak with Nginx + sendfile + TLS in 12.2-STABLE

2021-02-06 Thread Slawa Olhovchenkov

On Fri, Feb 05, 2021 at 11:54:07AM +0100, GomoR wrote:

> On 2021-02-05 09:11, GomoR wrote:
> >> The first step I would do if possible would be to bisect between the
> >> last known working version and the version that is known to be broken
> >> to determine which commit introduced the problem.  One thing that could
> >> help here is to see if you can reproduce the problem using a 12.2 kernel
> >> on a 12.1 world + ports.  If you can, then you can limit your bisecting
> >> to just building new kernels which will make that process quicker.
> 
> We have reinstalled our system from scratch with FreeBSD 12.1-RELEASE.
> We then installed just enough of our software stack to reproduce the issue.
> 
> There is no problem with a stock 12.1-RELEASE kernel, but the problem arises
> after installkernel with the latest 12.2-STABLE. We then turned off all our
> customizations, including some specific sysctl.conf values. The bug was no
> longer triggered.
> 
> After dissecting our sysctl values, the faulty one has been identified:
> 
> kern.ipc.maxsockbuf=157286400
> 
> This value is 75 times the default value (2097152). Restoring the default
> value fixes the issue. After some tests, the bug is triggered starting at
> around 64 times the default value (134217728).
> 
> There was no issue with this setting in 12.1-RELEASE, but there is in
> 12.2-RELEASE.
> 
> Do you have some insight into why it causes these mbuf problems? In the
> meantime, we have our solution, but we are willing to help identify whether
> this is a kernel bug or just a really bad idea to set maxsockbuf to such a
> high value.
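
The figures quoted above can be cross-checked with a little arithmetic. This is only a sketch: the constant names are illustrative, and the default of 2097152 is taken from the message rather than verified against any particular kernel.

```python
# Values quoted in the message above (names are illustrative).
DEFAULT_MAXSOCKBUF = 2_097_152   # default kern.ipc.maxsockbuf, per the message
THRESHOLD = 134_217_728          # value at which the bug reportedly starts
CONFIGURED = 157_286_400         # value from the reporter's sysctl.conf

MIB = 1024 * 1024


def describe(value):
    """Express a byte count in MiB and as a multiple of the default."""
    multiple = value / DEFAULT_MAXSOCKBUF
    return f"{value} bytes = {value // MIB} MiB = {multiple:g}x default"


for v in (DEFAULT_MAXSOCKBUF, THRESHOLD, CONFIGURED):
    print(describe(v))
```

In other words, the setting in question is 150 MiB and the reported threshold is 128 MiB, against a 2 MiB default.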

===
> Each time a user downloads a file, mbuf and mbuf_cluster usage rises to
> reach the maximum limit in a matter of seconds. Those values are
> reported by 'netstat -m' as follows:
>
> Normal situation:
>
> mbuf:   256, 26031105,   16767,5974,428087938,   0,   0
> mbuf_cluster:  2048, 8135232,   18408,2704,101644203,   0,   0
>
> Warning situation:
>
> mbuf:   256, 26031105, 2981516,  151205,1109483561,   0,   0
> mbuf_cluster:  2048, 8135232, 2983155,4201,319714617,   0,   0
===

Can you clarify what the problem is?
Under load the system uses more resources, and that by itself is not a bug.
Do you see more resource usage than the load would explain?
Or are the resources not freed after the load drops?
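
To compare the two snapshots quoted above, something like the following sketch could be used. The column names are an assumption: the lines look like `vmstat -z` output (ITEM: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP), and the helper names are made up for illustration.

```python
# Column names are an assumption based on the `vmstat -z` layout:
# ITEM: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP
FIELDS = ("size", "limit", "used", "free", "req", "fail", "sleep")

# Counter lines copied from the quoted report.
NORMAL = [
    "mbuf:   256, 26031105,   16767,5974,428087938,   0,   0",
    "mbuf_cluster:  2048, 8135232,   18408,2704,101644203,   0,   0",
]
WARNING = [
    "mbuf:   256, 26031105, 2981516,  151205,1109483561,   0,   0",
    "mbuf_cluster:  2048, 8135232, 2983155,4201,319714617,   0,   0",
]


def parse_zone(line):
    """Return (zone_name, {field: int}) for one counter line."""
    name, _, rest = line.partition(":")
    values = [int(v) for v in rest.split(",")]
    return name.strip(), dict(zip(FIELDS, values))


def parse_snapshot(lines):
    """Parse several counter lines into {zone_name: fields}."""
    return dict(parse_zone(line) for line in lines)


normal = parse_snapshot(NORMAL)
warning = parse_snapshot(WARNING)

for zone in normal:
    before, after = normal[zone], warning[zone]
    pct = 100.0 * after["used"] / after["limit"]
    print(f"{zone}: used {before['used']} -> {after['used']} "
          f"({pct:.1f}% of limit {after['limit']})")
```

Taking the same snapshot again some time after the load has stopped would show whether the USED column drops back, which is the question asked above.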
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: ses device over T-SGPIO

2021-01-29 Thread Slawa Olhovchenkov

On Fri, Jan 29, 2021 at 10:12:03AM -0700, Alan Somers wrote:

> What does "camcontrol devlist" show?

Only two disks: the USB flash drive and da1 (connected via isci).
(Right now I have just booted from the 12.2 install.)

> On Fri, Jan 29, 2021 at 10:06 AM Slawa Olhovchenkov  wrote:
> 
> > On Fri, Jan 29, 2021 at 09:51:33AM -0700, Alan Somers wrote:
> >
> > > I've never used any tool with SGPIO.  The hardware simply isn't powerful
> > > enough to be useful.  sesutil works, in theory, to control the LEDs.  But
> > > it's of limited usefulness since there's no way to tell which drives are
> > > installed in which slots.
> >
> > For me sesutil failed w/ "No SES device found"
> >
> > >
> > > > On Fri, Jan 29, 2021 at 9:44 AM Slawa Olhovchenkov wrote:
> > > >
> > > > > On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:
> > > > >
> > > > > > The short story is: SGPIO sucks.  It doesn't detect drive presence,
> > > > > > much less provide physical path information.  The only thing you
> > > > > > can do with it is control the fault LEDs.  But doing that usefully
> > > > > > requires you to have some extra source of information about what
> > > > > > drives are installed in what slots.  Basically, you need to track
> > > > > > that kind of information offline.  sesutil ought to be able to
> > > > > > control the LEDs, at least, but I've never personally used it with
> > > > > > SGPIO.
> > > > >
> > > > > What tool do you use with SGPIO?
> > > > > What additional drivers are needed?
> > > > >
> > > > > > On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov wrote:
> > > > > >
> > > > > > > I have a Supermicro X9DBU-iF motherboard connected to a
> > > > > > > BPN-SAS-825TQ backplane by T-SGPIO cables. sesutil doesn't find
> > > > > > > any SES device.
> > > > > > >
> > > > > > > Is it possible to control this backplane?

Re: ses device over T-SGPIO

2021-01-29 Thread Slawa Olhovchenkov

On Fri, Jan 29, 2021 at 09:51:33AM -0700, Alan Somers wrote:

> I've never used any tool with SGPIO.  The hardware simply isn't powerful
> enough to be useful.  sesutil works, in theory, to control the LEDs.  But
> it's of limited usefulness since there's no way to tell which drives are
> installed in which slots.

For me, sesutil fails with "No SES device found".

> 
> On Fri, Jan 29, 2021 at 9:44 AM Slawa Olhovchenkov  wrote:
> 
> > On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:
> >
> > > The short story is: SGPIO sucks.  It doesn't detect drive presence, much
> > > less provide physical path information.  The only thing you can do with
> > > it is control the fault LEDs.  But doing that usefully requires you to
> > > have some extra source of information about what drives are installed in
> > > what slots.  Basically, you need to track that kind of information
> > > offline.  sesutil ought to be able to control the LEDs, at least, but
> > > I've never personally used it with SGPIO.
> >
> > What tool do you use with SGPIO?
> > What additional drivers are needed?
> >
> > > On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov wrote:
> > >
> > > > I have a Supermicro X9DBU-iF motherboard connected to a BPN-SAS-825TQ
> > > > backplane by T-SGPIO cables. sesutil doesn't find any SES device.
> > > >
> > > > Is it possible to control this backplane?

Re: ses device over T-SGPIO

2021-01-29 Thread Slawa Olhovchenkov

On Fri, Jan 29, 2021 at 09:22:47AM -0700, Alan Somers wrote:

> The short story is: SGPIO sucks.  It doesn't detect drive presence, much
> less provide physical path information.  The only thing you can do with it
> is control the fault LEDs.  But doing that usefully requires you to have
> some extra source of information about what drives are installed in what
> slots.  Basically, you need to track that kind of information offline.
> sesutil ought to be able to control the LEDs, at least, but I've never
> personally used it with SGPIO.

What tool do you use with SGPIO?
What additional drivers are needed?

> On Fri, Jan 29, 2021 at 9:02 AM Slawa Olhovchenkov  wrote:
> 
> > I have a Supermicro X9DBU-iF motherboard connected to a BPN-SAS-825TQ
> > backplane by T-SGPIO cables. sesutil doesn't find any SES device.
> >
> > Is it possible to control this backplane?

ses device over T-SGPIO

2021-01-29 Thread Slawa Olhovchenkov

I have a Supermicro X9DBU-iF motherboard connected to a BPN-SAS-825TQ
backplane by T-SGPIO cables. sesutil doesn't find any SES device.

Is it possible to control this backplane?

ncurses in 12-stable break emacs tramp mode

2020-04-19 Thread Slawa Olhovchenkov

Before the ncurses update, Emacs TRAMP mode received the following echo string:

_echo^H ^H^H ^H^H ^H^H ^H^H ^Hstty

After the ncurses update, the echo string is different:

_echo^M#$ _ech ^H^M#$ _ec ^H^M#$ _e ^H^M#$ _ ^H^M#$  ^Hstty icanon erase ^H 
cols 32767_echo

That is, on a `dumb` terminal ncurses still redraws the whole line from the
beginning of the string, including the prompt. This completely breaks Emacs
TRAMP mode to a FreeBSD host.

Is it possible to fix this?
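
To make the difference visible, here is a small sketch that replays both byte streams (only up to the word "stty") on a toy one-line terminal that understands backspace (^H) and carriage return (^M). The `render` helper is purely illustrative and does not model how TRAMP itself parses the stream.

```python
def render(stream):
    """Apply backspace and carriage return to a single line of output."""
    cells = []
    col = 0
    for ch in stream:
        if ch == "\b":
            # Backspace moves the cursor left without erasing anything.
            col = max(col - 1, 0)
        elif ch == "\r":
            # Carriage return moves the cursor to the start of the line.
            col = 0
        else:
            if col == len(cells):
                cells.append(ch)
            else:
                cells[col] = ch
            col += 1
    return "".join(cells).rstrip()


# Old behaviour: erase "_echo" one character at a time, then print "stty".
BEFORE = "_echo" + "\b \b" * 5 + "stty"

# New behaviour: after each erase the whole line, prompt included, is redrawn.
AFTER = ("_echo"
         + "\r#$ _ech \b"
         + "\r#$ _ec \b"
         + "\r#$ _e \b"
         + "\r#$ _ \b"
         + "\r#$  \b"
         + "stty")

print(repr(render(BEFORE)))
print(repr(render(AFTER)))
```

On a real screen the second stream ends up looking like a prompt followed by the command, but a program reading the raw bytes receives the prompt text repeated inside the echo.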

Re: make kernel ignore broken SATA disk

2020-04-12 Thread Slawa Olhovchenkov

On Sun, Apr 12, 2020 at 07:08:06PM +0200, Stefan Bethke wrote:

> Am 12.04.2020 um 19:03 schrieb Slawa Olhovchenkov :
> > 
> > On Sun, Apr 12, 2020 at 06:38:10PM +0200, Stefan Bethke wrote:
> > 
> >> 
> >> 
> >>> Am 12.04.2020 um 18:31 schrieb Slawa Olhovchenkov :
> >>> 
> >>> On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:
> >>> 
> >>>> Am 12.04.2020 um 17:43 schrieb Slawa Olhovchenkov :
> >>>>> 
> >>>>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
> >>>>> 
> >>>>>> I have a server I don't have physical access to right now, which has a 
> >>>>>> broken SATA disk that produces mostly errors (but not entirely).
> >>>>>> 
> >>>>>> The disk has two partitions that are part of a zpool each. I can't 
> >>>>>> bring the system up with this disk being online, because ZFS is trying 
> >>>>>> its darndest to use it.
> >>>>>> 
> >>>>>> I already renamed the GPT partitions in the hope that ZFS would not 
> >>>>>> find them anymore, but it does.
> >>>>>> 
> >>>>>> I can't gpart destroy -f ada1 because "device busy".
> >>>>>> 
> >>>>>> Is there a way, ideally in the loader, to tell the kernel to ignore 
> >>>>>> ada1 and/or ahcich5? Or can I force ZFS some other way to ignore the 
> >>>>>> disk? I do have a spare disk I can use to replace the failed one, but 
> >>>>>> I can't get the machine into a state where I could even issue the 
> >>>>>> zpool replace command.
> >>>>> 
> >>>>> `zpool offline pool device` if you have enough redundancy?
> >>>> 
> >>>> I do, but the command doesn't return. Instead, I'm getting loads of SATA 
> >>>> error messages.
> >>> 
> >>> What is your zpool configuration?
> >> 
> >> This is from the working system. The identifiers are slightly different, 
> >> but the structure is identical.
> > 
> > What about `zpool detach pool device`?
> 
> Now I can't boot into single user mode anymore, ZFS just waits forever, and 
> the kernel is printing an endless chain of SATA error messages.
> 
> I really need a way to remove the broken disk before ZFS tries to access it, 
> or a way to stop ZFS from try to access the disk.

Is this disk only part of the mirror? Is the ZIL OK?

Re: make kernel ignore broken SATA disk

2020-04-12 Thread Slawa Olhovchenkov

On Sun, Apr 12, 2020 at 06:38:10PM +0200, Stefan Bethke wrote:

> 
> 
> > Am 12.04.2020 um 18:31 schrieb Slawa Olhovchenkov :
> > 
> > On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:
> > 
> >> Am 12.04.2020 um 17:43 schrieb Slawa Olhovchenkov :
> >>> 
> >>> On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
> >>> 
> >>>> I have a server I don't have physical access to right now, which has a 
> >>>> broken SATA disk that produces mostly errors (but not entirely).
> >>>> 
> >>>> The disk has two partitions that are part of a zpool each. I can't bring 
> >>>> the system up with this disk being online, because ZFS is trying its 
> >>>> darndest to use it.
> >>>> 
> >>>> I already renamed the GPT partitions in the hope that ZFS would not find 
> >>>> them anymore, but it does.
> >>>> 
> >>>> I can't gpart destroy -f ada1 because "device busy".
> >>>> 
> >>>> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
> >>>> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I 
> >>>> do have a spare disk I can use to replace the failed one, but I can't 
> >>>> get the machine into a state where I could even issue the zpool replace 
> >>>> command.
> >>> 
> >>> `zpool offline pool device` if you have enough redundancy?
> >> 
> >> I do, but the command doesn't return. Instead, I'm getting loads of SATA 
> >> error messages.
> > 
> > What is your zpool configuration?
> 
> This is from the working system. The identifiers are slightly different, but 
> the structure is identical.

What about `zpool detach pool device`?

> # zpool status
>   pool: data
>  state: ONLINE
> status: Some supported features are not enabled on the pool. The pool can
>   still be used, but some features are unavailable.
> action: Enable all features using 'zpool upgrade'. Once this is done,
>   the pool may no longer be accessible by software that does not support
>   the features. See zpool-features(7) for details.
>   scan: resilvered 176K in 0 days 00:01:28 with 0 errors on Sun May 26 21:24:54 2019
> config:
> 
>   NAME                STATE     READ WRITE CKSUM
>   data                ONLINE       0     0     0
>     mirror-0          ONLINE       0     0     0
>       gpt/ls0data     ONLINE       0     0     0
>       gpt/ls1data     ONLINE       0     0     0
>   logs
>     gpt/data0log      ONLINE       0     0     0
>   cache
>     gpt/data0cache    ONLINE       0     0     0
> 
> errors: No known data errors
> 
>   pool: ls-host
>  state: ONLINE
> status: Some supported features are not enabled on the pool. The pool can
>   still be used, but some features are unavailable.
> action: Enable all features using 'zpool upgrade'. Once this is done,
>   the pool may no longer be accessible by software that does not support
>   the features. See zpool-features(7) for details.
>   scan: scrub repaired 0 in 0 days 00:06:33 with 0 errors on Sun Apr 12 11:46:25 2020
> config:
> 
>   NAME                STATE     READ WRITE CKSUM
>   ls-host             ONLINE       0     0     0
>     mirror-0          ONLINE       0     0     0
>       gpt/ls0host     ONLINE       0     0     0
>       gpt/ls1host     ONLINE       0     0     0
>   logs
>     gpt/host0log      ONLINE       0     0     0
>   cache
>     gpt/host0cache    ONLINE       0     0     0
> 
> errors: No known data errors
> 
> 
> --
> Stefan BethkeFon +49 151 14070811
> 



Re: make kernel ignore broken SATA disk

2020-04-12 Thread Slawa Olhovchenkov

On Sun, Apr 12, 2020 at 06:24:09PM +0200, Stefan Bethke wrote:

> Am 12.04.2020 um 17:43 schrieb Slawa Olhovchenkov :
> > 
> > On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:
> > 
> >> I have a server I don't have physical access to right now, which has a 
> >> broken SATA disk that produces mostly errors (but not entirely).
> >> 
> >> The disk has two partitions that are part of a zpool each. I can't bring 
> >> the system up with this disk being online, because ZFS is trying its 
> >> darndest to use it.
> >> 
> >> I already renamed the GPT partitions in the hope that ZFS would not find 
> >> them anymore, but it does.
> >> 
> >> I can't gpart destroy -f ada1 because "device busy".
> >> 
> >> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
> >> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
> >> have a spare disk I can use to replace the failed one, but I can't get the 
> >> machine into a state where I could even issue the zpool replace command.
> > 
> > `zpool offline pool device` if you have enough redundancy?
> 
> I do, but the command doesn't return. Instead, I'm getting loads of sata 
> error message.

What is your zpool configuration?



Re: make kernel ignore broken SATA disk

2020-04-12 Thread Slawa Olhovchenkov

On Sun, Apr 12, 2020 at 04:37:06PM +0200, Stefan Bethke wrote:

> I have a server I don't have physical access to right now, which has a broken 
> SATA disk that produces mostly errors (but not entirely).
> 
> The disk has two partitions that are part of a zpool each. I can't bring the 
> system up with this disk being online, because ZFS is trying its darndest to 
> use it.
> 
> I already renamed the GPT partitions in the hope that ZFS would not find them 
> anymore, but it does.
> 
> I can't gpart destroy -f ada1 because "device busy".
> 
> Is there a way, ideally in the loader, to tell the kernel to ignore ada1 
> and/or ahcich5? Or can I force ZFS some other way to ignore the disk? I do 
> have a spare disk I can use to replace the failed one, but I can't get the 
> machine into a state where I could even issue the zpool replace command.

`zpool offline pool device` if you have enough redundancy?

Re: Access to NETMAP from c++ program

2019-12-23 Thread Slawa Olhovchenkov

On Mon, Nov 25, 2019 at 03:36:21PM -0500, Ryan Stone wrote:

> Remove "using namespace std;" from your program.

I don't have "using namespace std;".

Example:

===
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
===

Yes, only includes.

c++ -c tatomic.cc
In file included from tatomic.cc:11:
In file included from /usr/include/net/netmap_user.h:104:
In file included from /usr/include/net/netmap.h:816:
/usr/include/stdatomic.h:186:17: error: unknown type name '_Bool'
typedef _Atomic(_Bool)  atomic_bool;
^
/usr/include/stdatomic.h:186:26: error: C++ requires a type specifier for all 
declarations
typedef _Atomic(_Bool)  atomic_bool;
~~ ^
/usr/include/stdatomic.h:379:17: error: unknown type name '_Bool'
static __inline _Bool
^
/usr/include/stdatomic.h:383:10: error: address argument to atomic operation 
must be a pointer to _Atomic type ('volatile atomic_bool *' (aka 'volatile int 
*') invalid)
return (atomic_exchange_explicit(&__object->__flag, 1, __order));
^~
/usr/include/stdatomic.h:242:2: note: expanded from macro 
'atomic_exchange_explicit'
__c11_atomic_exchange(object, desired, order)
^ ~~
/usr/include/stdatomic.h:390:2: error: address argument to atomic operation 
must be a pointer to _Atomic type ('volatile atomic_bool *' (aka 'volatile int 
*') invalid)
atomic_store_explicit(&__object->__flag, 0, __order);
^ ~
/usr/include/stdatomic.h:256:2: note: expanded from macro 
'atomic_store_explicit'
__c11_atomic_store(object, desired, order)
^  ~~
/usr/include/stdatomic.h:394:17: error: unknown type name '_Bool'
static __inline _Bool
^
6 errors generated.

OK, let's try an ugly hack for _Bool:

===
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
typedef int _Bool;
#include 
===

No errors. Now try:

===
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
typedef int _Bool;
#include 
#include 
#include 
===

Many errors:
In file included from tatomic.cc:13:
In file included from /usr/include/c++/v1/memory:668:
/usr/include/c++/v1/atomic:1166:49: error: expected ')'
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/c++/v1/atomic:1166:1: note: to match this '('
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/stdatomic.h:176:32: note: expanded from macro 'atomic_is_lock_free'
__atomic_is_lock_free(sizeof(*(obj)), obj)
  ^
In file included from tatomic.cc:13:
In file included from /usr/include/c++/v1/memory:668:
/usr/include/c++/v1/atomic:1166:1: error: expected expression
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/stdatomic.h:176:37: note: expanded from macro 'atomic_is_lock_free'
__atomic_is_lock_free(sizeof(*(obj)), obj)
   ^
In file included from tatomic.cc:13:
In file included from /usr/include/c++/v1/memory:668:
/usr/include/c++/v1/atomic:1166:21: error: expected expression
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/c++/v1/atomic:1166:53: error: expected ';' at end of declaration
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/c++/v1/atomic:1166:54: error: expected unqualified-id
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
 ^
/usr/include/c++/v1/__config:839:21: note: expanded from macro '_NOEXCEPT'
#  define _NOEXCEPT noexcept
^
In file included from tatomic.cc:13:
In file included from /usr/include/c++/v1/memory:668:
/usr/include/c++/v1/atomic:1174:1: error: redefinition of 
'__atomic_is_lock_free'
atomic_is_lock_free(const atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/stdatomic.h:176:2: note: expanded from macro 'atomic_is_lock_free'
__atomic_is_lock_free(sizeof(*(obj)), obj)
^
/usr/include/c++/v1/atomic:1166:1: note: previous definition is here
atomic_is_lock_free(const volatile atomic<_Tp>* __o) _NOEXCEPT
^
/usr/include/stdatomic.h:176:2: note: expanded from macro 'atomic_is_lock_free'
__atomic_is_lock_free(sizeof(*(obj)), obj)
^
In file included from tatomic.cc:13:
In file included from /usr/include/c++/v1/memory:668:
/usr/include/c++/v1/atomic:1174:40: error: expected ')'
atomic_is_lock_free(const atomic<_Tp>* __o) _NOEXCEPT
   ^
/usr/include/c++/v1/atomic:1174:1: note: to match this '('

Access to NETMAP from c++ program

2019-11-19 Thread Slawa Olhovchenkov

Is it possible (now) to access NETMAP from C++?
I see a header conflict:

In file included from /usr/include/net/netmap_user.h:104:
In file included from /usr/include/net/netmap.h:812:
/usr/include/stdatomic.h:141:21: error: reference to 'memory_order' is ambiguous
atomic_thread_fence(memory_order __order __unused)
^
/usr/include/stdatomic.h:134:3: note: candidate found by name lookup is 
'memory_order'
} memory_order;
  ^
/usr/include/c++/v1/atomic:585:3: note: candidate found by name lookup is 
'std::__1::memory_order'
} memory_order;
  ^

Yes, I need  in a C++ program.

Including  before  also doesn't work, with a different error.

Re: haproxy syslog comptible

2019-06-24 Thread Slawa Olhovchenkov

On Mon, Jun 24, 2019 at 05:42:39PM +0300, Slawa Olhovchenkov wrote:

> On Mon, Jun 24, 2019 at 10:35:03AM -0400, Paul Mather wrote:
> 
> > On Jun 24, 2019, at 10:17 AM, Slawa Olhovchenkov  wrote:
> > 
> > > I use haproxy logging to syslog and get log lines like this:
> > >
> > > Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625  
> > > [24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504  
> > > 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"
> > >
> > > Is it possible to teach syslogd to use millisecond timestamps?
> > 
> > 
> > Run syslogd with "-O syslog" to get timestamps logged with microsecond  
> > precision (as well as time zones).  You can add that to your  
> > "syslogd_flags" setting in /etc/rc.conf.  (See man syslogd for details.)
> > 
> > Note that the format of syslog entries changes with "-O syslog".  You get  
> > logs like this:
> > 
> > <38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - -  
> > Received signal 15; terminating.
> > <38>1 2019-04-12T10:48:05.058693-04:00 x.x.net sshd 1238 - - Server  
> > listening on :: port 22.
> > 
> > 
> > (Note that the precision also depends upon the client application logging  
> > to syslog.)
> 
> I think you are talking about a different syslogd, not the one from FreeBSD:
> 
> syslogd: illegal option -- O
> usage: syslogd [-468ACcdFknosTuv] [-a allowed_peer]
>[-b bind_address] [-f config_file]
>[-l [mode:]path] [-m mark_interval]
>[-P pid_file] [-p log_socket]

Ah, I see: I need the syslogd from FreeBSD 12, thanks.
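
For reference, a line in the "-O syslog" format quoted above can be taken apart roughly like this. It is a minimal sketch: the `parse_rfc5424` helper is made up for illustration, the sample is the quoted line with its wrapping removed, and structured data is assumed to be the single "-" seen in the sample (real structured data may contain spaces).

```python
from datetime import datetime

# Sample taken from the quoted reply, joined onto one line.
SAMPLE = ("<38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - - "
          "Received signal 15; terminating.")


def parse_rfc5424(line):
    """Split a simple RFC 5424 line into its header fields and message."""
    head, timestamp, host, app, procid, msgid, sdata, msg = line.split(" ", 7)
    pri, _, version = head.lstrip("<").partition(">")
    return {
        "facility": int(pri) // 8,   # PRI = facility * 8 + severity
        "severity": int(pri) % 8,
        "version": int(version),
        "timestamp": datetime.fromisoformat(timestamp),
        "host": host,
        "app": app,
        "procid": procid,
        "msgid": msgid,
        "sdata": sdata,
        "msg": msg,
    }


record = parse_rfc5424(SAMPLE)
print(record["timestamp"].isoformat(), record["timestamp"].microsecond)
```

The timestamp carries six fractional digits, so millisecond resolution is available to anything reading the log.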

Re: haproxy syslog comptible

2019-06-24 Thread Slawa Olhovchenkov

On Mon, Jun 24, 2019 at 10:35:03AM -0400, Paul Mather wrote:

> On Jun 24, 2019, at 10:17 AM, Slawa Olhovchenkov  wrote:
> 
> > I use haproxy logging to syslog and get log lines like this:
> >
> > Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625  
> > [24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504  
> > 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"
> >
> > Is it possible to teach syslogd to use millisecond timestamps?
> 
> 
> Run syslogd with "-O syslog" to get timestamps logged with microsecond  
> precision (as well as time zones).  You can add that to your  
> "syslogd_flags" setting in /etc/rc.conf.  (See man syslogd for details.)
> 
> Note that the format of syslog entries changes with "-O syslog".  You get  
> logs like this:
> 
> <38>1 2019-04-12T10:43:56.525458-04:00 x.x.net sshd 1253 - -  
> Received signal 15; terminating.
> <38>1 2019-04-12T10:48:05.058693-04:00 x.x.net sshd 1238 - - Server  
> listening on :: port 22.
> 
> 
> (Note that the precision also depends upon the client application logging  
> to syslog.)

I think you are talking about a different syslogd, not the one from FreeBSD:

syslogd: illegal option -- O
usage: syslogd [-468ACcdFknosTuv] [-a allowed_peer]
   [-b bind_address] [-f config_file]
   [-l [mode:]path] [-m mark_interval]
   [-P pid_file] [-p log_socket]

haproxy syslog comptible

2019-06-24 Thread Slawa Olhovchenkov

I use haproxy logging to syslog and get log lines like this:

Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625 
[24/Jun/2019:17:04:23.277] balancer~ default-pool/main 0/0/0/-1/2012 504 194 - 
- sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"

Is it possible to teach syslogd to use millisecond timestamps?
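
Note that the bracketed timestamp written by haproxy itself already has milliseconds; only the syslogd prefix is limited to whole seconds. A sketch of pulling it out of the line above (the regular expression and the `haproxy_time` name are illustrative, and `%b` assumes an English or C locale):

```python
import re
from datetime import datetime

# The log line from above, joined onto one line.
LINE = ('Jun 24 17:04:25 ha01 haproxy[32508]: 193.34.87.146:57625 '
        '[24/Jun/2019:17:04:23.277] balancer~ default-pool/main '
        '0/0/0/-1/2012 504 194 - - sH-- 888/888/4/4/0 0/0 "POST /vs HTTP/1.1"')

# Matches a bracketed timestamp such as [24/Jun/2019:17:04:23.277].
BRACKET_TS = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}\.\d{3})\]")


def haproxy_time(line):
    """Return haproxy's own millisecond timestamp, or None if absent."""
    match = BRACKET_TS.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S.%f")


ts = haproxy_time(LINE)
print(ts.isoformat(timespec="milliseconds"))
```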

Re: FreeBSD 11.3-BETA3 Now Available

2019-06-10 Thread Slawa Olhovchenkov

On Mon, Jun 10, 2019 at 03:13:31PM +, Glen Barber wrote:

> On Sat, Jun 08, 2019 at 01:39:49PM +0300, Slawa Olhovchenkov wrote:
> > On Fri, Jun 07, 2019 at 10:26:34PM +, Glen Barber wrote:
> > 
> > > The third BETA build of the 11.3-RELEASE release cycle is now available.
> > 
> > Can someone from re@ MFC r348772 to 11.3-RELEASE before the release?
> > This is an important fix.
> 
> The MFC timer for the change in question is 2 weeks, presumably to allow
> time to detect any issues in 13-CURRENT before the merge is done to
> stable/12 and stable/11.  The change in question was committed three
> days ago.
> 
> I have CC'd the original committer, nonetheless.

I am asking about including the MFCed commit in the RELEASE image, not in
the stable branch.

Re: FreeBSD 11.3-BETA3 Now Available

2019-06-08 Thread Slawa Olhovchenkov

On Fri, Jun 07, 2019 at 10:26:34PM +, Glen Barber wrote:

> The third BETA build of the 11.3-RELEASE release cycle is now available.

Can someone from re@ MFC r348772 to 11.3-RELEASE before the release?
This is an important fix.
Thanks

Re: FreeBSD-11: Fatal trap 9: general protection fault while in kernel mode (in key_addref())

2019-03-03 Thread Slawa Olhovchenkov

On Wed, Feb 27, 2019 at 11:54:20PM +0300, Slawa Olhovchenkov wrote:

> Is this a known issue?
> 
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 13; apic id = 2a
> instruction pointer = 0x20:0x806b6a94
> stack pointer   = 0x28:0xfe2026e274f0
> frame pointer   = 0x28:0xfe2026e274f0
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 12 (irq295: t5nex0:0a5)
> trap number = 9
> panic: general protection fault
> cpuid = 13
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0x8032667b = 
> db_trace_self_wrapper+0x2b/frame 0xfe2026e27130
> vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0
> panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210
> trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260
> trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420
> calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420
> --- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = 
> 0xfe2026e274f0 ---
> key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0
> ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame 
> 0xfe2026e27530
> ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame 
> 0xfe2026e275d0
> ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame 
> 0xfe2026e27600
> tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740
> ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0
> netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 
> 0xfe2026e277f0
> ether_demux() at 0x805b43ff = ether_demux+0x13f/frame 
> 0xfe2026e27820
> ether_nh_input() at 0x805b506b = ether_nh_input+0x31b/frame 
> 0xfe2026e27880
> netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 
> 0xfe2026e278d0
> ether_input() at 0x805b4676 = ether_input+0x26/frame 
> 0xfe2026e278f0
> t4_eth_rx() at 0x816403b3 = t4_eth_rx+0x103/frame 0xfe2026e27910
> service_iq() at 0x81644886 = service_iq+0x4a6/frame 0xfe2026e279c0
> t4_intr() at 0x81644b3e = t4_intr+0x2e/frame 0xfe2026e279e0
> intr_event_execute_handlers() at 0x804871ac = 
> intr_event_execute_handlers+0xec/frame 0xfe2026e27a20
> ithread_loop() at 0x80487846 = ithread_loop+0xd6/frame 
> 0xfe2026e27a70
> fork_exit() at 0x80484805 = fork_exit+0x85/frame 0xfe2026e27ab0
> fork_trampoline() at 0x80735cae = fork_trampoline+0xe/frame 
> 0xfe2026e27ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 657d14h33m52s

kgdb decode:

Unread portion of the kernel message buffer:


Fatal trap 9: general protection fault while in kernel mode
cpuid = 13; apic id = 2a
instruction pointer = 0x20:0x806b6a94
stack pointer   = 0x28:0xfe2026e274f0
frame pointer   = 0x28:0xfe2026e274f0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (irq295: t5nex0:0a5)
trap number = 9
panic: general protection fault
cpuid = 13
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032667b = 
db_trace_self_wrapper+0x2b/frame 0xfe2026e27130
vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0
panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210
trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260
trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420
calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420
--- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = 
0xfe2026e274f0 ---
key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0
ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame 
0xfe2026e27530
ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame 
0xfe2026e275d0
ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame 
0xfe2026e27600
tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740
ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0
netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 
0xfe2026e277f0
ether_demux() at 0x805b43ff = ether_demux+0x13f/frame 0xfe2026e27820
ether_nh_input() at 0x8

FreeBSD-11: Fatal trap 9: general protection fault while in kernel mode (in key_addref())

2019-02-27 Thread Slawa Olhovchenkov

Is this a known issue?

Fatal trap 9: general protection fault while in kernel mode
cpuid = 13; apic id = 2a
instruction pointer = 0x20:0x806b6a94
stack pointer   = 0x28:0xfe2026e274f0
frame pointer   = 0x28:0xfe2026e274f0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (irq295: t5nex0:0a5)
trap number = 9
panic: general protection fault
cpuid = 13
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032667b = 
db_trace_self_wrapper+0x2b/frame 0xfe2026e27130
vpanic() at 0x804c2006 = vpanic+0x186/frame 0xfe2026e271b0
panic() at 0x804c1e73 = panic+0x43/frame 0xfe2026e27210
trap_fatal() at 0x807503f2 = trap_fatal+0x322/frame 0xfe2026e27260
trap() at 0x8074fa5e = trap+0x5e/frame 0xfe2026e27420
calltrap() at 0x80735771 = calltrap+0x8/frame 0xfe2026e27420
--- trap 0x9, rip = 0x806b6a94, rsp = 0xfe2026e274f0, rbp = 
0xfe2026e274f0 ---
key_addref() at 0x806b6a94 = key_addref+0x4/frame 0xfe2026e274f0
ipsec_getpcbpolicy() at 0x806b20b9 = ipsec_getpcbpolicy+0x49/frame 
0xfe2026e27530
ipsec4_getpolicy() at 0x806b10a5 = ipsec4_getpolicy+0x25/frame 
0xfe2026e275d0
ipsec4_in_reject() at 0x806b138b = ipsec4_in_reject+0x1b/frame 
0xfe2026e27600
tcp_input() at 0x8066127c = tcp_input+0x97c/frame 0xfe2026e27740
ip_input() at 0x805e447f = ip_input+0x10f/frame 0xfe2026e277a0
netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 
0xfe2026e277f0
ether_demux() at 0x805b43ff = ether_demux+0x13f/frame 0xfe2026e27820
ether_nh_input() at 0x805b506b = ether_nh_input+0x31b/frame 
0xfe2026e27880
netisr_dispatch_src() at 0x805c4750 = netisr_dispatch_src+0xa0/frame 
0xfe2026e278d0
ether_input() at 0x805b4676 = ether_input+0x26/frame 0xfe2026e278f0
t4_eth_rx() at 0x816403b3 = t4_eth_rx+0x103/frame 0xfe2026e27910
service_iq() at 0x81644886 = service_iq+0x4a6/frame 0xfe2026e279c0
t4_intr() at 0x81644b3e = t4_intr+0x2e/frame 0xfe2026e279e0
intr_event_execute_handlers() at 0x804871ac = 
intr_event_execute_handlers+0xec/frame 0xfe2026e27a20
ithread_loop() at 0x80487846 = ithread_loop+0xd6/frame 
0xfe2026e27a70
fork_exit() at 0x80484805 = fork_exit+0x85/frame 0xfe2026e27ab0
fork_trampoline() at 0x80735cae = fork_trampoline+0xe/frame 
0xfe2026e27ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 657d14h33m52s
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

ZFS boot code regression

2019-02-27 Thread Slawa Olhovchenkov

1. gptzfsboot from 12 is incompatible with the loader from 11 ("kernel not found")
2. The loader from 12 is incompatible with the kernel from 11 (ZFS file system unknown)

FreeBSD-12 build question

2018-12-06 Thread Slawa Olhovchenkov

1. How can I build release media of FreeBSD-12 on a FreeBSD-11 system?
Currently the process fails with 'Abort trap'.

585191  121 -rw---1 root wheel  8962048 Dec  5 18:58 ./ldconfig.core
585199  121 -rw---1 root wheel  8953856 Dec  5 18:58 ./usr/obj/usr/src/amd64.amd64/mktemp.core
585200  249 -rw---1 root wheel  9641984 Dec  5 18:58 ./usr/obj/usr/src/amd64.amd64/make.core

2. How can I update FreeBSD-11 (ZFS on root) to FreeBSD-12 from source?
With the new kernel, ZFS is not mounted and /bin/sh crashes.

Re: iostat busy value calculation

2018-06-22 Thread Slawa Olhovchenkov

On Wed, Jun 20, 2018 at 07:37:20PM +0200, Miroslav Lachman wrote:

> > %busy comes from the devstat layer. It's defined as the percent of the 
> > time over the polling interval in which at least one transaction was 
> > awaiting completion by the lower layers. It's an imperfect measure of 
> > how busy the drives are (in ye-olden days, before tagged queuing and 
> > NCQ, it was OK because you had THE transaction pending and it was a good 
> > measure of how utilized things were. Now with concurrent I/O in flash 
> > devices, it's only an imperfect approximation).
> 
> Yes, I am aware of this issue. This percentage is just  "is it slightly 
> loaded or heavily loaded" indicator.

To judge "heavily loaded", use the average transaction time and the
average queue length instead.
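Why those two numbers say more than %busy can be sketched with Little's law (mean outstanding requests = completion rate x mean time per request). The figures below are made up for illustration and do not come from this thread; `mean_outstanding` is a hypothetical helper, not something iostat provides.

```python
def mean_outstanding(transfers_per_s, avg_ms_per_transfer):
    """Little's law: average number of requests in flight.

    transfers_per_s      completed transactions per second
    avg_ms_per_transfer  average time per transaction, in milliseconds
    """
    return transfers_per_s * avg_ms_per_transfer / 1000.0


# Two hypothetical devices that would both show close to 100% busy,
# because each has at least one request pending almost all the time.
spinning_disk = mean_outstanding(150, 6.5)   # about one request in flight
flash_device = mean_outstanding(20000, 0.4)  # several requests in flight

print(f"disk:  {spinning_disk:.2f} requests outstanding on average")
print(f"flash: {flash_device:.2f} requests outstanding on average")
```

A device is getting overloaded when the time per transaction and the queue length both keep growing, whatever %busy shows.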

Re: ZFS+find(1) wiring all RAM

2018-06-07 Thread Slawa Olhovchenkov

On Thu, Jun 07, 2018 at 07:04:29PM +0930, Shane Ambler wrote:

> On 07/06/2018 16:09, Peter Jeremy wrote:
> > I've noticed that 11-stable/amd64 has been wiring seemingly excessive
> > amounts of RAM for some time (the problem goes back at least 6 months).
> > This extends to getting ENOMEM errors from g_io_deliver() and out-of-swap
> > errors killing processes on a low-memory system.  I'm not sure when it
> > started but it seems to have gotten worse between r331535 and r334494.
> 
> Don't know if this will help you at all --
> 
> I have seen excess wired for a few years, since 10.1, I now run
> 11-stable, my experience has seen the severity varying over time.
> 
> I first reported this 28/10/2014 related to heavy disk use on a zpool.
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194654
> 
> The inability to get solid repeatable steps to reproduce has prevented
> me from chasing this more.

Can you try https://reviews.freebsd.org/D7538 ?
In this patch I try to resolve the problem of memory that is wired
but left unused by ARC.

vmstat -m stranges

2018-04-28 Thread Slawa Olhovchenkov

# vmstat -m|grep temp
 Type InUse MemUse             HighUse Requests  Size(s)
 temp    60 18014398509481829K       - 32350974  16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536

Is this normal?

SVN rev: r328463
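
One possible reading, assuming vmstat prints an unsigned 64-bit byte counter converted to kilobytes (an assumption, not something stated in this message): the MemUse value sits just below 2^64 bytes, which is what a small negative total looks like after wrapping around. A minimal sketch of that arithmetic:

```python
KIB = 1024
WRAP_KIB = 2**64 // KIB  # 2^64 bytes expressed in kilobytes

reported_kib = 18014398509481829  # MemUse column from the output above

# Distance below the wrap point, i.e. how far "negative" the counter would be.
deficit_kib = WRAP_KIB - reported_kib

print(f"2^64 bytes     = {WRAP_KIB}K")
print(f"reported       = {reported_kib}K")
print(f"below the wrap = {deficit_kib}K")
```

If the assumption holds, the accounting for the `temp` type has gone about 155K below zero, which would point at a statistics problem rather than at real memory use.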

Re: KBI unexpexted change in stable/11 ?

2018-03-29 Thread Slawa Olhovchenkov

On Wed, Mar 28, 2018 at 11:25:08PM -0700, Kevin Oberman wrote:

> > > > r325665 is the previous point and is good.
> > > > r331615 crashes.
> > > > Is there a script I can use for bisecting?
> > >
> > > I'm not aware of a script for this.  The only tool I've used is "git
> > > bisect", which is very handy if you're already familiar with git.
> >
> > You may want to try devel/p5-App-SVN-Bisect.  Never used it, so
> > no idea if it's functional or helpful, just found it doing a quick
> > search
> 
> It would be nice if this could be fixed, but it is the case.

r328475 bad (tzdata)
r328469 in progress (kib, sys/vm)
r328463 in progress (don't touch kernel)
r328462 good

I think r328469 breaks the KBI.
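
The narrowing shown above is a binary search over the revision range. A minimal sketch, assuming that every revision before the breaking commit is good and every later one is bad; `is_bad` stands for a hypothetical "build, install and load the modules" test, replaced here by a stand-in that treats r328469 as the first bad revision, as suspected in this message.

```python
def first_bad_revision(good, bad, is_bad):
    """Binary search for the first bad revision in the range (good, bad].

    good    a revision known to work
    bad     a later revision known to fail
    is_bad  callable taking a revision number, True if that revision fails
    Returns (first_bad, tested), where tested lists the revisions tried.
    """
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(mid)
        if is_bad(mid):
            bad = mid    # failure is at mid or earlier
        else:
            good = mid   # failure was introduced after mid
    return bad, tested


# Stand-in for the real test, which would be a kernel build plus kldload.
def suspected(rev):
    return rev >= 328469


culprit, tested = first_bad_revision(325665, 331615, suspected)
print(f"first bad: r{culprit} after {len(tested)} test builds")
```

With the end points from this thread (r325665 good, r331615 bad) this needs at most 13 test builds. Not every number in the range is a commit on stable/11, so a real script would have to step to the nearest existing revision.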

Re: KBI unexpexted change in stable/11 ?

2018-03-28 Thread Slawa Olhovchenkov

On Wed, Mar 28, 2018 at 10:29:10AM -0500, Eric van Gyzen wrote:

> On 03/28/2018 08:09, Slawa Olhovchenkov wrote:
> > I upgraded my system to the latest -STABLE and now see a kernel crash when:
> > 
> > - loading virtualbox modules built on 11.1-RELEASE-p6
> > - loading the nvidia module built on 11.1-RELEASE-p6 and starting xdm
> > 
> > Is this expected? I mean loading modules built on
> > 11.1-RELEASE on any 11.1-STABLE.
> 
> This is not expected.  Can you bisect to find the stable/11 commit that
> broke this?
> 
> If you can roll back to 11.1-RELEASE, you could probably just
> buildkernel and installkernel from various points along stable/11.  That
> would save a lot of time by avoiding buildworld.

r325665 is the previous point and is good.
r331615 crashes.
Is there a script I can use for bisecting?

Re: KBI unexpexted change in stable/11 ?

2018-03-28 Thread Slawa Olhovchenkov

On Wed, Mar 28, 2018 at 05:13:48PM +0200, Gregory Byshenk wrote:

> On Wed, Mar 28, 2018 at 05:35:51PM +0300, Slawa Olhovchenkov wrote:
> > On Wed, Mar 28, 2018 at 03:39:46PM +0200, Gregory Byshenk wrote:
> > > 
> > > Did you rebuild your virtualbox and nvidia modules for your new
> > > kernel? If you build a new kernel, then you need to rebuild any
> > > modules that were installed from ports for the new version.
> > 
> > Only for -CURRENT, not for -STABLE:
> > 
> > https://lists.freebsd.org/pipermail/svn-src-all/2018-March/159649.html
> > 
> > John Baldwin: "In theory we try to not break existing kernel
> > modules on a stable branch.  That is, one should be able to kldload an
> > if_iwn.ko built on 11.0 on a 11-stable kernel."
> 
> I understand that this is true for standard kernel modules, but
> I have frequently run into problems loading _ports_ modules on
> a new kernel - including in STABLE.

This is a general _kernel_ rule for loading all modules, and it matters
mostly for third-party modules.


Re: KBI unexpexted change in stable/11 ?

2018-03-28 Thread Slawa Olhovchenkov

On Wed, Mar 28, 2018 at 03:39:46PM +0200, Gregory Byshenk wrote:

> On Wed, Mar 28, 2018 at 04:09:04PM +0300, Slawa Olhovchenkov wrote:
> > I upgraded my system to the latest -STABLE and now see a kernel crash when:
> > 
> > - loading virtualbox modules built on 11.1-RELEASE-p6
> > - loading the nvidia module built on 11.1-RELEASE-p6 and starting xdm
> > 
> > Is this expected? I mean loading modules built on
> > 11.1-RELEASE on any 11.1-STABLE.
> 
> Did you rebuild your virtualbox and nvidia modules for your new
> kernel? If you build a new kernel, then you need to rebuild any
> modules that were installed from ports for the new version.

Only for -CURRENT, not for -STABLE:

https://lists.freebsd.org/pipermail/svn-src-all/2018-March/159649.html

John Baldwin: "In theory we try to not break existing kernel
modules on a stable branch.  That is, one should be able to kldload an
if_iwn.ko built on 11.0 on a 11-stable kernel."


KBI unexpexted change in stable/11 ?

2018-03-28 Thread Slawa Olhovchenkov

I upgraded my system to the latest -STABLE and now see a kernel crash when:

- loading virtualbox modules built on 11.1-RELEASE-p6
- loading the nvidia module built on 11.1-RELEASE-p6 and starting xdm

Is this expected? I mean loading modules built on
11.1-RELEASE on any 11.1-STABLE.

Re: FreeBSD 11.1 ixl(4) interface does not negotiate at 100 Mbit/s

2018-03-19 Thread Slawa Olhovchenkov

On Mon, Mar 19, 2018 at 03:53:03PM +0100, Patrick M. Hausen wrote:

> Hi all,
> 
> any ideas why a current RELENG_11_1 system with ixl(4)
> onboard interfaces might not negotiate with a switch that
> has only fast ethernet?
> 
> status: no carrier                      on the host
> line protocol is down (notconnect)      on the switch
> 
> dmesg:
> https://imgur.com/9ri9is8

"Intel® Ethernet Controller X710/XXV710/XL710 Feature Support Matrix"
does not show 100 Mb link support in any software release for X710/XL710.

"Intel® Ethernet Connection X722 Feature Support Matrix" does not
contain the string "100" at all.


Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-08-08 Thread Slawa Olhovchenkov

On Tue, Aug 08, 2017 at 01:49:08PM +0200, Hans Petter Selasky wrote:

> On 08/08/17 13:33, Slawa Olhovchenkov wrote:
> > TW_RUNLOCK(V_tw_lock);
> > and
> > if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {
> > 
> > `inp` can be invalidated or freed, so this pointer may be invalid?
> 
> If you look one line up there is a pcbref ??

Yes.
Can a different thread take this inp and free it?
Maybe the timer thread?

Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)

2017-08-08 Thread Slawa Olhovchenkov

On Tue, Aug 08, 2017 at 10:31:33AM +0200, Hans Petter Selasky wrote:

> Here is the conclusion:
> 
> The following code is going in an infinite loop:
> 
> 
> > for (;;) {
> >     TW_RLOCK(V_tw_lock);
> >     tw = TAILQ_FIRST(&V_twq_2msl);
> >     if (tw == NULL || (!reuse && (tw->tw_time - ticks) > 0)) {
> >         TW_RUNLOCK(V_tw_lock);
> >         break;
> >     }
> >     KASSERT(tw->tw_inpcb != NULL, ("%s: tw->tw_inpcb == NULL",
> >         __func__));
> > 
> >     inp = tw->tw_inpcb;
> >     in_pcbref(inp);
> >     TW_RUNLOCK(V_tw_lock);
> > 
> >     if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
> > 
> >         INP_WLOCK(inp);
> >         tw = intotw(inp);
> >         if (in_pcbrele_wlocked(inp)) {
> 
> in_pcbrele_wlocked() returns (1) because INP_FREED (16) is set in 
> inp->inp_flags2. I guess you have invariants disabled, because the 
> KASSERT() below should have caused a panic.
> 
> >             KASSERT(tw == NULL, ("%s: held last inp "
> >                 "reference but tw not NULL", __func__));
> >             INP_INFO_RUNLOCK(&V_tcbinfo);
> >             continue;
> >         }
> 
> This is a regression issue after:
> 
> > commit 5630210a7f1dbbd903b77b2aef939cd47c63da58
> > Author: jch 
> > Date:   Thu Oct 30 08:53:56 2014 +
> > 
> > Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and
> > tcp_tw_2msl_scan().  This race condition drives unplanned timewait
> > timeout cancellation.  Also simplify implementation by holding inpcb
> > reference and removing tcptw reference counting.
> 
> Suggested fix attached.

Hmm, I am not sure. IMHO, between

TW_RUNLOCK(V_tw_lock);
and
if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {

`inp` can be invalidated or freed, so this pointer may be invalid?


> Index: sys/netinet/tcp_timewait.c
> ===================================================================
> --- sys/netinet/tcp_timewait.c    (revision 321981)
> +++ sys/netinet/tcp_timewait.c    (working copy)
> @@ -709,10 +709,11 @@
>              INP_WLOCK(inp);
>              tw = intotw(inp);
>              if (in_pcbrele_wlocked(inp)) {
> -                KASSERT(tw == NULL, ("%s: held last inp "
> -                    "reference but tw not NULL", __func__));
>                  INP_INFO_RUNLOCK(&V_tcbinfo);
> -                continue;
> +                if (tw == NULL)
> +                    continue;
> +                else
> +                    break;  /* INP_FREED flag is set */
>              }
>  
>              if (tw == NULL) {
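
The loop behaviour described in this thread can be illustrated with a toy model. This is a simplified sketch in Python, not the kernel code: `scan_timewait`, the `expired` and `inp_freed` fields and the `max_passes` cap are all made up for illustration, and locking is left out entirely. It only shows why a `continue` that leaves the same entry at the head of the queue never terminates, and how the `break` in the suggested fix ends the scan.

```python
def scan_timewait(queue, patched, max_passes=100):
    """Toy model of the timewait scan loop discussed above.

    queue       list of dicts with boolean keys 'expired' and 'inp_freed'
    patched     True models the suggested fix (break), False the old code
    max_passes  safety cap so that the model itself always terminates
    Returns (passes, gave_up); gave_up is True when the cap was reached.
    """
    passes = 0
    while queue:
        if passes >= max_passes:
            return passes, True
        passes += 1
        head = queue[0]
        if not head["expired"]:
            break                 # nothing more to clean up
        if head["inp_freed"]:
            # The release reports "freed", but the entry is still queued.
            if patched:
                break             # suggested fix: stop scanning
            continue              # old code: retry the same head again
        queue.pop(0)              # normal path: entry is cleaned up
    return passes, False


stuck = {"expired": True, "inp_freed": True}
print(scan_timewait([dict(stuck)], patched=False))  # hits the cap
print(scan_timewait([dict(stuck)], patched=True))   # stops after one pass
```

Whether breaking out is safe with respect to the lifetime of `inp` is a separate question, and this model says nothing about it.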

> ___
> freebsd-...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Mega ZFS MFCs

2017-07-27 Thread Slawa Olhovchenkov

On Thu, Jul 27, 2017 at 04:29:52PM +0300, Alexander Motin wrote:

> Hi Mike,
> 
> On 27.07.2017 16:21, Mike Tancsa wrote:
> > I noticed quite a few MFCs to RELENG_11 around zfs yesterday and today.
> > First off, thank you for all these fixes/enhancements! Of the some 60
> > MFCs, are there any particular ones to be more aware of when updating
> > servers ? 
> 
> The most complicated and invasive to me looks r321610 "8021 ARC buf data
> scatter-ization".  It took 5 fix commits to make it behave in head, but
> Andriy told me it should be good now, and I run it on my systems too.
> 
> > Are there any more to come, or is now a good time to test things out ?
> 
> I've merged all we had in head (except couple gptzfsboot commits
> significantly increasing its size, that could break POLA).  Next round
> will any way go to head first, so stable/11 should probably be idle for
> a month at least and should be good for testing now.

Any chance for PR218043 and D7538?

Re: Lock contention in AIO

2017-04-13 Thread Slawa Olhovchenkov

On Wed, Apr 12, 2017 at 05:21:02PM -0700, Adrian Chadd wrote:

> It's the same pages, right?

Perhaps.

> Is it just the refcounting locking that's
> causing it?

Don't know.

> I think the biggest thing here is to figure out how to have pages have
> a lifecycle where the refcount can be inc/dec (obviously >1, ie not in
> a state where you can dec to 0) via atomics, without grabbing a lock.
> That'll make this particular use case much faster.
> 
> (dfbsd does this.)

I can try your patch.

> -a
> 
> 
> On 21 March 2017 at 09:42, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> > I see lock contention caused by aio reads (the same file segment read from
> > multiple processes simultaneously):
> >
> > 07.74%  [26756]  lock_delay @ /boot/kernel/kernel
> >  92.21%  [24671]  __mtx_lock_sleep
> >   52.14%  [12864]  vm_page_enqueue
> >    100.0%  [12864]  vm_fault_hold
> >     87.71%  [11283]  vm_fault_quick_hold_pages
> >      100.0%  [11283]  vn_io_fault1
> >       100.0%  [11283]  vn_io_fault
> >        99.88%  [11270]  aio_process_rw
> >         100.0%  [11270]  aio_daemon
> >          100.0%  [11270]  fork_exit
> >        00.12%  [13]  dofileread
> >         100.0%  [13]  kern_readv
> >
> > Is this a known problem?

/dev/dri registration

2017-04-03 Thread Slawa Olhovchenkov

I have a strange issue on stable/10:

# devinfo -v
nexus0
  apic0
  ram0
  acpi0
[...]
    pcib0 pnpinfo _HID=PNP0A08 _UID=0 at handle=\_SB_.PCI0
      pci0
        hostb0 pnpinfo vendor=0x8086 device=0xd130 subvendor=0x1014 subdevice=0x03ce class=0x06 at slot=0 function=0 dbsf=pci0:0:0:0
        pcib1 pnpinfo vendor=0x8086 device=0xd138 subvendor=0x1014 subdevice=0x03ce class=0x060400 at slot=3 function=0 dbsf=pci0:0:3:0 handle=\_SB_.PCI0.P0P2
          pci1
            vgapci0 pnpinfo vendor=0x10de device=0x0a20 subvendor=0x1458 subdevice=0x34d6 class=0x03 at slot=0 function=0 dbsf=pci0:1:0:0
              drm0
              drmn0
              nvidia0

But /dev/dri doesn't exist!

# kldstat 
Id Refs Address    Size     Name
 1   80 0x8020     17e87f8  kernel
 2    1 0x819e9000 309780   zfs.ko
 3    2 0x81cf3000 6040     opensolaris.ko
 4    1 0x81cfa000 7aa58    if_em.ko
 5    1 0x81d75000 29bd0    drm.ko
 6    1 0x81d9f000 82898    drm2.ko
 7    2 0x81e22000 6298     iicbus.ko
 8    1 0x81e29000 1c650    uart.ko
 9    1 0x82011000 56f3     fdescfs.ko
10    1 0x82017000 a681     linprocfs.ko
11    3 0x82022000 7522     linux_common.ko
12    1 0x8202a000 5673     linsysfs.ko
13    1 0x8203     364c     ums.ko
14    1 0x82034000 10226    snd_uaudio.ko
15    1 0x82045000 2ba8     uhid.ko
16    3 0x82048000 4e626    vboxdrv.ko
17    2 0x82097000 2b82     vboxnetflt.ko
18    2 0x8209a000 ba2f     netgraph.ko
19    1 0x820a6000 414f     ng_ether.ko
20    1 0x820ab000 3fd4     vboxnetadp.ko
21    2 0x820af000 3d5da    linux.ko
22    1 0x820ed000 964496   nvidia.ko

What is wrong in my setup?

Lock contention in AIO

2017-03-21 Thread Slawa Olhovchenkov

I see lock contention caused by aio reads (the same file segment read from
multiple processes simultaneously):

07.74%  [26756]  lock_delay @ /boot/kernel/kernel
 92.21%  [24671]  __mtx_lock_sleep
  52.14%  [12864]  vm_page_enqueue
   100.0%  [12864]  vm_fault_hold
    87.71%  [11283]  vm_fault_quick_hold_pages
     100.0%  [11283]  vn_io_fault1
      100.0%  [11283]  vn_io_fault
       99.88%  [11270]  aio_process_rw
        100.0%  [11270]  aio_daemon
         100.0%  [11270]  fork_exit
       00.12%  [13]  dofileread
        100.0%  [13]  kern_readv

Is this a known problem?

Re: about that DFBSD performance test

2017-03-08 Thread Slawa Olhovchenkov

On Wed, Mar 08, 2017 at 09:00:34AM +0500, Eugene M. Zheganin wrote:

> Hi.
> 
> Some have probably seen this already - 
> http://lists.dragonflybsd.org/pipermail/users/2017-March/313254.html
> 
> So, could anyone explain why FreeBSD was owned that much. Test is split  
> into two parts, one is nginx part, and the other is the IPv4 forwarding 

Three: there is also the UFS part. And multiple simultaneous accesses to the
same file/block can cause page lock contention.

> part. I understand that nginx ownage was due to SO_REUSEPORT feature, 
> which we do formally have, but in DFBSD and Linux it does provide a 
> kernel socket multiplexor, which eliminates locking, and ours does not. 
> I have only found traces of discussion that DFBSD implementation is too 
> hackish. Well, hackish or not, but it's 4 times faster, as it turns out. 
> The IPv4 forwarding loss is pure defeat though.
> 
> Please note that although they use HEAD in these tests, they also mention 
> that this is the GENERIC-NODEBUG kernel which means this isn't related 
> to the WITNESS stuff.
> 
> Please also don't consider this trolling, I'm a big FreeBSD fan through 
> the years, so I'm asking because I'm kind of concerned.

Re: slow machine, swap in use, but more than 5GB of RAM inactive

2017-03-06 Thread Slawa Olhovchenkov

On Tue, Mar 07, 2017 at 10:19:35AM +0800, Erich Dollansky wrote:

> Hi,
> 
> I wonder about the slow speed of my machine while top shows ample
> inactive memory:
> 
> last pid: 85287;  load averages:  2.56,  2.44,  1.68   up 6+10:24:45  10:13:36
> 191 processes: 5 running, 186 sleeping
> CPU 0: 47.1% user,  0.0% nice, 51.4% system,  0.0% interrupt,  1.6% idle
> CPU 1: 38.4% user,  0.0% nice, 60.4% system,  0.0% interrupt,  1.2% idle
> CPU 2: 38.8% user,  0.0% nice, 59.2% system,  0.0% interrupt,  2.0% idle
> CPU 3: 45.5% user,  0.0% nice, 51.0% system,  0.4% interrupt,  3.1% idle
> Mem: 677M Active, 5600M Inact, 1083M Wired, 178M Cache, 816M Buf, 301M Free
> Swap: 16G Total, 1352M Used, 15G Free, 8% Inuse
> 
> The swap space in use can be explained by large compilations done
> recently. Why is the inactive memory not put to use.
> 
> I do not want to restart the machine. So, if I could help find the
> source of the problem, I would do.

Inactive is not 'unused' memory.
These are just pages that have not been touched in the last 10(?) seconds,
but all of them are allocated (e.g. via malloc, mmap, sendfile) to
applications (userland programs).

Re: Is it known problem, that zfs.ko could not be built with system compiler (clang 3.9.1) without optimization?

2017-02-22 Thread Slawa Olhovchenkov

On Wed, Feb 22, 2017 at 11:47:42PM +0300, Lev Serebryakov wrote:

> Hello Freebsd-stable,
> 
>Now if you build zfs.ko with -O0 it panics on boot.
> 
>If you use default optimization level, a lot of fbt DTrace probes are
>   missing.

Is this related to http://llvm.org/bugs/show_bug.cgi?id=18420 ?


Re: FreeBSD 11.0-STABLE #0 r310265 amd64 seems to be cpi-ing garbage to mounted FAT32 fs after 10-20 GB.

2017-02-01 Thread Slawa Olhovchenkov

On Wed, Feb 01, 2017 at 07:25:18AM -0700, Jakub Lach wrote:

> I would think so, if only I would not clone the disk/system via the same USB
> port mere weeks ago.
> Moreover, sysutils/f3 fully writes and validates (checksums) 30G+ memory
> cards via the same port without problems.

In my case the controller was not always broken, only from some point in time.
Data corruption over my USB depends on the data access pattern.

Re: FreeBSD 11.0-STABLE #0 r310265 amd64 seems to be cpi-ing garbage to mounted FAT32 fs after 10-20 GB.

2017-02-01 Thread Slawa Olhovchenkov

On Wed, Feb 01, 2017 at 06:52:01AM -0700, Jakub Lach wrote:

> Yes, HDD and card reader was USB mounted.
> 
> This time, I've copied about 12G from 38G from internal SSD (UFS2) to 
> HDD via USB (FAT32), then system panicked with CAM errors.

I have a similar issue on a laptop with a broken USB controller.

LACP: Fatal trap 18: integer divide fault while in kernel mode

2017-01-28 Thread Slawa Olhovchenkov

I got a panic on recent stable:

Fatal trap 18: integer divide fault while in kernel mode
cpuid = 3; apic id = 06
instruction pointer = 0x20:0x81453230
stack pointer   = 0x28:0xfe3e56f46480
frame pointer   = 0x28:0xfe3e56f464a0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (3))
trap number = 18
panic: integer divide fault
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at 0x8032b3eb = 
db_trace_self_wrapper+0x2b/frame 0xfe3e56f460c0
vpanic() at 0x804e33a6 = vpanic+0x186/frame 0xfe3e56f46140
panic() at 0x804e3213 = panic+0x43/frame 0xfe3e56f461a0
trap_fatal() at 0x807b07c2 = trap_fatal+0x322/frame 0xfe3e56f461f0
trap() at 0x807b0475 = trap+0x6b5/frame 0xfe3e56f463b0
calltrap() at 0x807946b1 = calltrap+0x8/frame 0xfe3e56f463b0
--- trap 0x12, rip = 0x81453230, rsp = 0xfe3e56f46480, rbp = 
0xfe3e56f464a0 ---
lacp_select_tx_port() at 0x81453230 = lacp_select_tx_port+0x70/frame 
0xfe3e56f464a0
lagg_lacp_start() at 0x814504ae = lagg_lacp_start+0xe/frame 
0xfe3e56f464c0
lagg_transmit() at 0x8144e73f = lagg_transmit+0xbf/frame 
0xfe3e56f46530
ether_output() at 0x805f30bc = ether_output+0x71c/frame 
0xfe3e56f465d0
ip_output() at 0x80629935 = ip_output+0x1585/frame 0xfe3e56f46720
tcp_output() at 0x806b9e16 = tcp_output+0x1876/frame 0xfe3e56f468c0
tcp_timer_rexmt() at 0x806c572f = tcp_timer_rexmt+0x4df/frame 
0xfe3e56f46900
softclock_call_cc() at 0x804fd1b6 = softclock_call_cc+0x156/frame 
0xfe3e56f469b0
softclock() at 0x804fd754 = softclock+0x94/frame 0xfe3e56f469e0
intr_event_execute_handlers() at 0x8049d15f = 
intr_event_execute_handlers+0x20f/frame 0xfe3e56f46a20
ithread_loop() at 0x8049d766 = ithread_loop+0xc6/frame 
0xfe3e56f46a70
fork_exit() at 0x80499e25 = fork_exit+0x85/frame 0xfe3e56f46ab0
fork_trampoline() at 0x80794bee = fork_trampoline+0xe/frame 
0xfe3e56f46ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

(kgdb) info line *0x81453230
Line 848 of "/usr/src/sys/modules/if_lagg/../../net/ieee8023ad_lacp.c" starts 
at address 0x8145322e  and ends at 
0x81453233 .

Do I need to create a PR?
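
An integer divide fault in a port-selection routine is what one would expect if a flow hash is reduced modulo a port count that has dropped to zero, for example while the aggregate has no active ports. That is an assumption about the cause: line 848 itself is not quoted in this message. A minimal sketch of the failure mode and of a guard, where `select_tx_port` and the port names are made up for illustration:

```python
def select_tx_port(flow_hash, active_ports):
    """Pick an output port for a flow, or None if no port is active.

    Without the emptiness check, `flow_hash % len(active_ports)` divides
    by zero when the list is empty, which is the Python counterpart of an
    integer divide fault.
    """
    if not active_ports:
        return None
    return active_ports[flow_hash % len(active_ports)]


print(select_tx_port(7, ["port0", "port1"]))  # normal case
print(select_tx_port(7, []))                  # no active ports: None
```

If the kernel code has this shape, the window would be between a port leaving the aggregate and the transmit path noticing it.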

Re: decent 40G network adapters

2017-01-18 Thread Slawa Olhovchenkov

On Wed, Jan 18, 2017 at 02:48:19PM +0500, Eugene M. Zheganin wrote:

> Hi.
> 
> Could someone recommend a decent 40Gbit adapter that are proven to be
> working under FreeBSD ? The intended purpose - iSCSI traffic, not much
> pps, but rates definitely above 10G. I've tried Supermicro-manufactured
> Intel XL710 ones (two boards, different servers - same sad story:
> packets loss, server unresponsive, spikes), seems like they have a
> problem in a driver (or firmware), and though Intel support states this
> is because the Supermicro tampered with the adapter, I'm still
> suspicious about ixl(4). I've also seen in the ML a guy reported the
> exact same problem with ixl(4) as I have found.
> 
> So, what would you say ? Chelsio ?

I use Chelsio and Solarflare.
Not sure about your workload: I have 40K+ TCP connections, so your
workload needs different tuning.
Do you plan to utilise both ports?
For that case you need a PCIe x16 card. These are the Chelsio T6 and
the Solarflare 9200.
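
The x16 requirement follows from simple arithmetic. A rough sketch, assuming PCIe 3.0 signalling (8 GT/s per lane, 128b/130b encoding); the PCIe generation is an assumption, as this message does not state it, and packet overhead is ignored:

```python
def pcie3_gbit_per_s(lanes):
    """Raw PCIe 3.0 bandwidth per direction, in Gbit/s.

    8 GT/s per lane with 128b/130b encoding. Protocol overhead is
    ignored, so the usable figure is somewhat lower.
    """
    return lanes * 8.0 * 128 / 130


both_ports = 2 * 40  # Gbit/s needed to run two 40G ports at line rate

for lanes in (8, 16):
    bw = pcie3_gbit_per_s(lanes)
    verdict = "enough" if bw >= both_ports else "not enough"
    print(f"x{lanes}: {bw:.1f} Gbit/s, {verdict} for {both_ports} Gbit/s")
```

An x8 slot gives about 63 Gbit/s, enough for one 40G port but not for two at full rate.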

Re: buildworld build times 10-stable vs. 11-stable

2017-01-15 Thread Slawa Olhovchenkov

On Sun, Jan 15, 2017 at 10:40:42AM -0600, Dan Mack wrote:

> I have a system which builds world, kernel, install, boot, installworld, 
> reboot several times per week.   I just noticed that my build times 
> increased from about (just cherry picking a couple build logs):
> 
>Starting build of FreeBSD SVN [309852]  10.3-STABLE
>Kernel will be GENERIC
>  building world ...   90:35 0
> 
> 
>Starting build of FreeBSD SVN [312099]  11.0-STABLE
>Kernel will be GENERIC
>  building world ...   146:23 0
> 
> before I start bisecting the log files, is there something obvious 
> introduced in 11 that I missed that would explain the roughly 50 minute 
> difference in my build times?   clang?  additional subsystems?

lldb/clang and related.

dev.cpu.0.freq/dev.cpu.0.freq_levels support on E5v4

2017-01-14 Thread Slawa Olhovchenkov

I have stable/11 on an E5v4.
I don't see cpufreq support in sysctl:

# sysctl dev.cpu.0
dev.cpu.0.cx_method: C1/hlt
dev.cpu.0.cx_usage_counters: 61755
dev.cpu.0.cx_usage: 100.00% last 1us
dev.cpu.0.cx_lowest: C2
dev.cpu.0.cx_supported: C1/1/1
dev.cpu.0.%parent: acpi0
dev.cpu.0.%pnpinfo: _HID=ACPI0007 _UID=0
dev.cpu.0.%location: handle=\_SB_.SCK0.CP00 _PXM=0
dev.cpu.0.%driver: cpu
dev.cpu.0.%desc: ACPI CPU

# grep -i freq /var/run/dmesg.boot
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
Event timer "HPET3" frequency 14318180 Hz quality 340
Event timer "HPET4" frequency 14318180 Hz quality 340
Event timer "HPET5" frequency 14318180 Hz quality 340
Event timer "HPET6" frequency 14318180 Hz quality 340
Event timer "HPET7" frequency 14318180 Hz quality 340
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
est0:  on cpu0
est1:  on cpu1
est2:  on cpu2
est3:  on cpu3
est4:  on cpu4
est5:  on cpu5
est6:  on cpu6
est7:  on cpu7
est8:  on cpu8
est9:  on cpu9
est10:  on cpu10
est11:  on cpu11
est12:  on cpu12
est13:  on cpu13
est14:  on cpu14
est15:  on cpu15
est16:  on cpu16
est17:  on cpu17
est18:  on cpu18
est19:  on cpu19
est20:  on cpu20
est21:  on cpu21
est22:  on cpu22
est23:  on cpu23
Timecounter "TSC-low" frequency 1100023294 Hz quality 1000

How do I enable cpufreq support for powerd?

Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)

2016-12-17 Thread Slawa Olhovchenkov

On Sat, Dec 17, 2016 at 05:12:13PM +1100, Ian Smith wrote:

> On Fri, 16 Dec 2016 18:08:34 +0100, Fernando Herrero Carrón wrote:
>  > Hi everyone,
> 
> Hi,
> 
> you've had plenty of helpful responses, but nobody has commented on:
> 
>  > My only reason for wanting to boot with UEFI is faster boot, 
>  > everything is working fine otherwise.
> 
> I'm skeptical that UEFI boot would be any or noticeably faster than via 
> BIOS, but am interested in hearing of any experiences regarding that.

Some BIOSes start with a very long UEFI boot attempt and try legacy
boot only after all of that fails.

I.e. this does not speed up the FreeBSD boot itself; it speeds up the
_start_ of the FreeBSD boot on some BIOSes.

Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)

2016-12-16 Thread Slawa Olhovchenkov

On Fri, Dec 16, 2016 at 11:43:18AM -0600, Eric van Gyzen wrote:

> On 12/16/2016 11:39, Slawa Olhovchenkov wrote:
> > On Fri, Dec 16, 2016 at 06:08:34PM +0100, Fernando Herrero Carrón wrote:
> > 
> >> Hi everyone,
> >>
> >> A few months ago I got myself a new box and I have been happily running
> >> FreeBSD on it ever since. I noticed that the boot was not as fast as I had
> >> expected and I've realized that, while my disk is GPT partitioned, the boot
> >> process is still BIOS based:
> >>
> >> % gpart show
> >> =>   34  976773101  ada0  GPT  (466G)
> >>  34  6- free -  (3.0K)
> >>  40   1024 1  freebsd-boot  (512K)
> >>1064984- free -  (492K)
> >>2048   67108864 2  freebsd-swap  (32G)
> >>67110912  909662208 3  freebsd-zfs  (434G)
> >>   976773120 15- free -  (7.5K)
> >>
> >> I am reading uefi(8) and it looks like FreeBSD 11 should be able to boot
> >> using UEFI straight into ZFS, so I am thinking of converting that
> >> freebsd-boot partition to an EFI partition, creating a FAT filesystem and
> >> copying /boot/boot.efi there.
> >>
> >> How good of an idea is that? Would it really be that simple or am I missing
> >> something? My only reason for wanting to boot with UEFI is faster boot,
> >> everything is working fine otherwise.
> >>
> >> Thanks in advance for your help.
> > 
> > I am also interesting by this case.
> > I think expand freebsd-boot to about 1M (size of /boot/boot1.efifat),
> > dding /boot/boot1.efifat and set to type to 'efi' may be enough. I am
> > never tried this.
> 
> I expect that would work.  It's slightly risky, though, since it doesn't let 
> you
> fall back to BIOS boot if EFI doesn't work.

A live CD/USB can be the fallback in this case.

Re: Upgrading boot from GPT(BIOS) to GPT(UEFI)

2016-12-16 Thread Slawa Olhovchenkov

On Fri, Dec 16, 2016 at 06:08:34PM +0100, Fernando Herrero Carrón wrote:

> Hi everyone,
> 
> A few months ago I got myself a new box and I have been happily running
> FreeBSD on it ever since. I noticed that the boot was not as fast as I had
> expected and I've realized that, while my disk is GPT partitioned, the boot
> process is still BIOS based:
> 
> % gpart show
> =>   34  976773101  ada0  GPT  (466G)
>  34  6- free -  (3.0K)
>  40   1024 1  freebsd-boot  (512K)
>1064984- free -  (492K)
>2048   67108864 2  freebsd-swap  (32G)
>67110912  909662208 3  freebsd-zfs  (434G)
>   976773120 15- free -  (7.5K)
> 
> I am reading uefi(8) and it looks like FreeBSD 11 should be able to boot
> using UEFI straight into ZFS, so I am thinking of converting that
> freebsd-boot partition to an EFI partition, creating a FAT filesystem and
> copying /boot/boot.efi there.
> 
> How good of an idea is that? Would it really be that simple or am I missing
> something? My only reason for wanting to boot with UEFI is faster boot,
> everything is working fine otherwise.
> 
> Thanks in advance for your help.

I am also interested in this case.
I think expanding freebsd-boot to about 1M (the size of /boot/boot1.efifat),
dd'ing /boot/boot1.efifat onto it and setting its type to 'efi' may be
enough. I have never tried this.

Re: gdb broken on stable/11 and current?

2016-12-08 Thread Slawa Olhovchenkov

On Thu, Dec 08, 2016 at 07:56:03PM +0200, Andriy Gapon wrote:

> On 08/12/2016 18:57, Slawa Olhovchenkov wrote:
> > kgdb7111 don't find .debug under /usr/lib/debug/
> > gdb found it.
> 
> $ gdb7111 bhyve /var/coredumps/bhyve.0.0.core
> GNU gdb (GDB) 7.11.1 [GDB v7.11.1 for FreeBSD]
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd11.0".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from bhyve...Reading symbols from
> /usr/lib/debug//usr/sbin/bhyve.debug...done.
> 
> Something is wrong in your environment.

My information may be outdated; my last unsuccessful try was around August.
It works OK now.
Thanks for the pointer.

Re: gdb broken on stable/11 and current?

2016-12-08 Thread Slawa Olhovchenkov

On Thu, Dec 08, 2016 at 04:52:35PM +, K. Macy wrote:

> kgdb7111 is what you use for kernel. It works fine for me.

kgdb7111 doesn't find the .debug files under /usr/lib/debug/;
gdb found them.


Re: gdb broken on stable/11 and current?

2016-12-08 Thread Slawa Olhovchenkov

On Thu, Dec 08, 2016 at 04:01:04PM +, K. Macy wrote:

> In tree gdb doesn't work for much of anything these days. It can't even
> consistently give a complete kernel backtrace. Jhb is graciously
> maintaining gdb in ports. It will be installed as the awkwardly named
> gdb7111 IIRC.

1. gdb7111 is badly integrated with 11 and up (it doesn't see kernel debug
symbols).
2. Nothing included in the base system should be dumping core.


gdb broken on stable/11 and current?

2016-12-08 Thread Slawa Olhovchenkov

% gdb ./edge_stat
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) break main
Segmentation fault (core dumped)

% gdb /usr/bin/gdb /tmp/gdb.13573.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `gdb'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libncursesw.so.8...(no debugging symbols found)...done.
Loaded symbols for /lib/libncursesw.so.8
Reading symbols from /usr/lib/libgnuregex.so.5...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libgnuregex.so.5
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/lib/libthread_db.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libthread_db.so
Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x005da00b in cplus_demangle_v3_callback ()
(gdb) bt
#0  0x005da00b in cplus_demangle_v3_callback ()
#1  0x005d9f9c in cplus_demangle_v3 ()
#2  0x005ca13c in cplus_demangle ()
#3  0x00487454 in class_name_from_physname ()
#4  0x0053a4f3 in dwarf2_read_section ()
#5  0x0053a0cc in dwarf2_read_section ()
#6  0x00539bd9 in dwarf2_read_section ()
#7  0x00537395 in dwarf2_read_section ()
#8  0x00539c21 in dwarf2_read_section ()
#9  0x00539643 in dwarf2_read_section ()
#10 0x00538a6c in dwarf2_read_section ()
#11 0x005352fb in dwarf2_read_section ()
#12 0x00533bfd in dwarf2_read_section ()
#13 0x004cfc46 in psymtab_to_symtab ()
#14 0x004c9cfb in lookup_symbol_global ()
#15 0x00482273 in cp_lookup_symbol_namespace ()
#16 0x00482059 in cp_lookup_symbol_nonlocal ()
#17 0x00481f63 in cp_lookup_symbol_nonlocal ()
#18 0x004c9780 in lookup_symbol ()
#19 0x005035cf in find_imps ()
#20 0x00514df9 in decode_line_1 ()
#21 0x00514589 in decode_line_1 ()
#22 0x0046eff9 in _initialize_breakpoint ()
#23 0x0046f48b in _initialize_breakpoint ()
#24 0x004ab289 in catch_exceptions ()
#25 0x004ab368 in catch_exceptions_with_msg ()
#26 0x0046b169 in break_command ()
#27 0x004ab996 in execute_command ()
#28 0x00465293 in gdb_disable_readline ()
#29 0x00465182 in gdb_setup_readline ()
#30 0x005e266f in rl_callback_read_char ()
#31 0x00464c79 in gdb_setup_readline ()
#32 0x00465a46 in gdb_do_one_event ()
#33 0x004ab289 in catch_exceptions ()
#34 0x004ab42a in catch_errors ()
#35 0x00551168 in _initialize_tui_interp ()
#36 0x00448689 in gdb_main ()
#37 0x004ab289 in catch_exceptions ()
#38 0x004ab42a in catch_errors ()
#39 0x00448526 in gdb_main ()
#40 0x004ab289 in catch_exceptions ()
#41 0x004ab42a in catch_errors ()
#42 0x00447977 in gdb_main ()
#43 0x00447931 in main ()

Re: zfs, a directory that used to hold lot of files and listing pause

2016-10-21 Thread Slawa Olhovchenkov

On Fri, Oct 21, 2016 at 01:47:08PM +0100, Pete French wrote:

> > In bad case metadata of every file will be placed in random place of disk.
> > ls need access to metadata of every file before start of output listing.
> 
> Umm, are we not talkong abut an issue where the directoyr no longer contains
> any files. It used to have lots, now it has none.
> 
> > I.e. in bad case you will be need tens of thousands seeks over disk
> > capable only 72 seeks per seconds.
> 
> Why does it need to seek all over the disc if there are no files (and hence
> no metadata surely) ?
> 
> I am not bothered if a hufge directoyr takes a while to list,
> thats something I am happy to deal with. What I dont like is
> when it is back down to zero that it still takes a long time
> to list. That doesnt make much sense.

OK, this case may be different.
Maybe zdb can help:
ls -li /parent/dir
Take the inode number, then run
zdb - zfs_set inode_number

Also run ls under ktrace and analyse the output of `kdump -E`.
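
The inode number can also be read directly with stat(2) instead of
picking it out of the `ls -li` listing. A small sketch, assuming Python is
available on the box; dir_info is a hypothetical helper name and "." is
only an example path:

```python
import os


def dir_info(path):
    """Return (inode number, number of entries) for a directory."""
    # st_ino is the same number that `ls -li` prints in its first column.
    inode = os.stat(path).st_ino
    # Count the entries without stat()ing each of them.
    with os.scandir(path) as it:
        entries = sum(1 for _ in it)
    return inode, entries


if __name__ == "__main__":
    inode, entries = dir_info(".")
    print("inode:", inode, "entries:", entries)
```

If the entry count is zero and the listing is still slow, the time is going
into reading the directory object itself rather than per-file metadata.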

Re: zfs, a directory that used to hold lot of files and listing pause

2016-10-21 Thread Slawa Olhovchenkov

On Fri, Oct 21, 2016 at 04:51:36PM +0500, Eugene M. Zheganin wrote:

> Hi.
> 
> On 21.10.2016 15:20, Slawa Olhovchenkov wrote:
> >
> > ZFS prefetch affect performance dpeneds of workload (independed of RAM
> > size): for some workloads wins, for some workloads lose (for my
> > workload prefetch is lose and manualy disabled with 128GB RAM).
> >
> > Anyway, this system have only 24MB in ARC by 2.3GB free, this is may
> > be too low for this workload.
> You mean - "for getting a list of a directory with 20 subdirectories" ? 
> Why then does only this directory have this issue with pause, not 
> /usr/ports/..., which has more directories in it ?
> 
> (and yes, /usr/ports/www isn't empty and holds 2410 entities)
> 
> /usr/bin/time -h ls -1 /usr/ports/www
> [...]
> 0.14s real  0.00s user  0.00s sys

You wrote: "(tens of thousands) files".

In the bad case the metadata of every file is placed at a random location
on the disk.
ls needs to access the metadata of every file before it starts to output
the listing.
I.e. in the bad case you will need tens of thousands of seeks on a disk
capable of only 72 seeks per second.
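
As a rough back-of-the-envelope check of these numbers, assuming one seek
per file (an assumption; the real on-disk layout and caching will change
this) and the figure of 72 seeks per second:

```python
SEEKS_PER_SECOND = 72  # figure quoted in the message above


def listing_seconds(files, seeks_per_file=1):
    """Worst-case time to read per-file metadata, one random seek each."""
    return files * seeks_per_file / SEEKS_PER_SECOND


for files in (10_000, 50_000):
    seconds = listing_seconds(files)
    print(f"{files} files: about {seconds:.0f} s ({seconds / 60:.1f} min)")
```

So even the low end of "tens of thousands" is measured in minutes, not
seconds, when nothing is cached.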

Perhaps /usr/ports/www was created all at once, so the metadata of all
entries is placed close together and fewer seeks are needed.

That is, if the zfs properties primarycache/secondarycache are not off.

Re: zfs, a directory that used to hold lot of files and listing pause

2016-10-21 Thread Slawa Olhovchenkov

On Fri, Oct 21, 2016 at 11:02:57AM +0100, Steven Hartland wrote:

> > Mem: 21M Active, 646M Inact, 931M Wired, 2311M Free
> > ARC: 73M Total, 3396K MFU, 21M MRU, 545K Anon, 1292K Header, 47M Other
> > Swap: 4096M Total, 4096M Free
> >
> >   PID USERNAME   PRI NICE   SIZERES STATE   C   TIMEWCPU COMMAND
> >   600 root390 27564K  5072K nanslp  1 295.0H  24.56% monit
> > 0 root   -170 0K  2608K -   1  75:24   0.00% 
> > kernel{zio_write_issue}
> >   767 freeswitch  200   139M 31668K uwait   0  48:29   0.00% 
> > freeswitch{freeswitch}
> >   683 asterisk200   806M   483M uwait   0  41:09   0.00% 
> > asterisk{asterisk}
> > 0 root-80 0K  2608K -   0  37:43   0.00% 
> > kernel{metaslab_group_t}
> > [... others lines are just 0% ...]
> This looks like you only have ~4Gb ram which is pretty low for ZFS I 
> suspect vfs.zfs.prefetch_disable will be 1, which will crash the 
> performance.

How ZFS prefetch affects performance depends on the workload (independent
of RAM size): for some workloads it wins, for others it loses (for my
workload prefetch is a loss and is manually disabled, with 128GB RAM).

Anyway, this system has only 24MB in ARC with 2.3GB free; this may
be too low for this workload.

Re: tcsh is not handled correctly UTF-8 in arguments

2016-10-20 Thread Slawa Olhovchenkov

On Thu, Oct 20, 2016 at 08:54:05AM -0600, Alan Somers wrote:

> On Wed, Oct 19, 2016 at 11:10 AM, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> > tcsh called by sshd for invocation of scp: `tcsh -c scp -f Расписание.pdf`
> > At this time no any LC_* is set.
> > tcsh read .cshrc and set LC_CTYPE=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8.
> > After this invocation of scp will be incorrect:
> >
> > 7ab0  20 2d 66 20 c3 90 c2 a0  c3 90 c2 b0 c3 91 c2 81  | -f 
> > |
> > 7ac0  c3 90 c2 bf c3 90 c2 b8  c3 91 c2 81 c3 90 c2 b0  
> > ||
> > 7ad0  c3 90 c2 bd c3 90 c2 b8  c3 90 c2 b5 5f c3 90 c2  
> > |_...|
> > 7ae0  a2 c3 90 c2 97 c3 90 c2  98 2e 70 64 66 0a|..pdf. 
> >  |
> >
> > Correct invocation must be:
> >
> >    20 2d 66 20  | 
> > -f |
> > 0010  d0 a0 d0 b0 d1 81 d0 bf  d0 b8 d1 81 d0 b0 d0 bd  
> > ||
> > 0020  d0 b8 d0 b5 5f d0 a2 d0  97 d0 98 2e 70 64 66 0a  
> > |_...pdf.|
> >
> > `d0` =>  `c3 90`
> > `a0` =>  `c2 a0`
> >
> > I.e. every byte re-encoded to utf-8: `d0` =>  `c3 90`
> >
> > As result imposible to access files w/ non-ascii names.
> 
> This might be related to PR213013.  Could you please try on head after 
> r306782 ?

I think it is not related. PR213013 is about character classification; my
report is about unnecessary re-encoding of shell arguments.

tcsh is not handled correctly UTF-8 in arguments

2016-10-19 Thread Slawa Olhovchenkov

tcsh is called by sshd to invoke scp: `tcsh -c scp -f Расписание.pdf`
At this time no LC_* variables are set.
tcsh reads .cshrc and sets LC_CTYPE=ru_RU.UTF-8 LC_COLLATE=ru_RU.UTF-8.
After this the invocation of scp is incorrect:

7ab0  20 2d 66 20 c3 90 c2 a0  c3 90 c2 b0 c3 91 c2 81  | -f |
7ac0  c3 90 c2 bf c3 90 c2 b8  c3 91 c2 81 c3 90 c2 b0  ||
7ad0  c3 90 c2 bd c3 90 c2 b8  c3 90 c2 b5 5f c3 90 c2  |_...|
7ae0  a2 c3 90 c2 97 c3 90 c2  98 2e 70 64 66 0a|..pdf.  |

The correct invocation would be:

   20 2d 66 20  | -f |
0010  d0 a0 d0 b0 d1 81 d0 bf  d0 b8 d1 81 d0 b0 d0 bd  ||
0020  d0 b8 d0 b5 5f d0 a2 d0  97 d0 98 2e 70 64 66 0a  |_...pdf.|

`d0` =>  `c3 90`
`a0` =>  `c2 a0`

I.e. every byte is re-encoded to UTF-8: `d0` =>  `c3 90`

As a result it is impossible to access files with non-ASCII names.
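
A minimal sketch that reproduces the byte pattern from the dumps above. It
assumes (this is a guess from the dumps, not something checked in the tcsh
source) that each byte of the UTF-8 argument is treated as a separate
Latin-1 character and then encoded to UTF-8 a second time. The variable
names are only for illustration:

```python
name = "Расписание.pdf"

# What scp should receive: the plain UTF-8 bytes of the file name.
correct = name.encode("utf-8")

# Assumed failure mode: every byte becomes its own character (Latin-1
# maps byte values 0x00..0xff one-to-one), then is encoded to UTF-8 again.
mangled = correct.decode("latin-1").encode("utf-8")

print(correct[:4].hex(" "))
print(mangled[:8].hex(" "))

# Reversing the extra step recovers the original bytes.
recovered = mangled.decode("utf-8").encode("latin-1")
print(recovered == correct)
```

The first two printed lines start with `d0 a0 d0 b0` and
`c3 90 c2 a0 c3 90 c2 b0`, the same sequences as in the correct and
incorrect dumps.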

stable/11: lock contention on zone_fetch_slab

2016-10-16 Thread Slawa Olhovchenkov

@ CPU_CLK_UNHALTED_CORE [271718 samples]

22.48%  [61081]lock_delay @ /boot/kernel/kernel
 99.72%  [60908] __mtx_lock_sleep
  67.69%  [41230]  zone_fetch_slab
   100.0%  [41230]   zone_import
100.0%  [41230]zone_alloc_item
 99.99%  [41226] uma_zalloc_arg
  74.55%  [30732]  m_getm2
   100.0%  [30732]   m_uiotombuf
100.0%  [30732]sosend_generic
 99.93%  [30711] soo_write
  100.0%  [30711]  dofilewrite
   100.0%  [30711]   kern_writev
99.37%  [30518]sys_writev
 100.0%  [30518] amd64_syscall
00.63%  [193]  sys_write
 100.0%  [193]   amd64_syscall
 00.07%  [21]kern_sendit
  100.0%  [21] sendit
   100.0%  [21]  sys_sendto
100.0%  [21]   amd64_syscall
  20.84%  [8590]   m_copym
   100.0%  [8590]tcp_output
96.18%  [8262] tcp_usr_send
 100.0%  [8262]  sosend_generic
  99.67%  [8235]   soo_write
   100.0%  [8235]dofilewrite
100.0%  [8235] kern_writev
 99.38%  [8184]  sys_writev
  100.0%  [8184]   amd64_syscall
 00.62%  [51]sys_write
  100.0%  [51] amd64_syscall
  00.33%  [27] kern_sendit
   100.0%  [27]  sendit
100.0%  [27]   sys_sendto
 100.0%  [27]amd64_syscall
03.82%  [328]  tcp_timer_rexmt
 100.0%  [328]   softclock_call_cc
  100.0%  [328]softclock
   100.0%  [328] intr_event_execute_handlers
100.0%  [328]  ithread_loop
 100.0%  [328]   fork_exit
  04.55%  [1874]   tcp_output
   75.72%  [1419]tcp_usr_send
100.0%  [1419] sosend_generic
 98.38%  [1396]  soo_write
  100.0%  [1396]   dofilewrite
   100.0%  [1396]kern_writev
87.11%  [1216] sys_writev
 100.0%  [1216]  amd64_syscall
12.89%  [180]  sys_write
 100.0%  [180]   amd64_syscall
 01.62%  [23]kern_sendit
  100.0%  [23] sendit
   100.0%  [23]  sys_sendto
100.0%  [23]   amd64_syscall
   13.07%  [245] tcp_timer_rexmt
100.0%  [245]  softclock_call_cc
 100.0%  [245]   softclock
  100.0%  [245]intr_event_execute_handlers
   100.0%  [245] ithread_loop
100.0%  [245]  fork_exit
   06.46%  [121] tcp_timer_delack
100.0%  [121]  softclock_call_cc
 100.0%  [121]   softclock
  100.0%  [121]intr_event_execute_handlers
   100.0%  [121] ithread_loop
100.0%  [121]  fork_exit
   02.99%  [56]  tcp_do_segment
100.0%  [56]   tcp_input
 100.0%  [56]ip_input
  100.0%  [56] swi_net
   100.0%  [56]  intr_event_execute_handlers
100.0%  [56]   ithread_loop
 100.0%  [56]fork_exit
   01.55%  [29]  tcp_usr_disconnect
100.0%  [29]   soclose
 100.0%  [29]_fdrop
  100.0%  [29] closef
   100.0%  [29]  closefp
100.0%  [29]   amd64_syscall
   00.11%  [2]   tcp_timer_persist
100.0%  [2]softclock_call_cc
 100.0%  [2] softclock
  100.0%  [2]  intr_event_execute_handlers
   100.0%  [2]   ithread_loop
100.0%  [2]fork_exit
   00.11%  [2]   tcp_drop
100.0%  [2]tcp_timer_rexmt
 100.0%  [2] softclock_call_cc
  100.0%  [2]  softclock
   100.0%  [2]   intr_event_execute_handlers
100.0%  [2]ithread_loop
 100.0%  [2] fork_exit
  00.07%  [28] syncache_respond
   100.0%  [28]  syncache_timer
100.0%  [28]

Re: 11.0 stuck on high network load

2016-10-14 Thread Slawa Olhovchenkov

On Fri, Oct 14, 2016 at 11:48:38AM +0200, Julien Charbon wrote:

> >>> Also, using dtrace too complex in production (need complex startup
> >>> under screen and capture output) and for many peoples.
> >>> kdb_backtrace() have too less administrative overhead.
> >>
> >>  I still think it is overkill.  The main goal of this change is to fix a
> >> quite tricky and old TCP stack locking issue.  Let's try to do that
> >> first, it is complex enough by itself.
> >>
> >>  Once the fix is validated and pushed, feel free to propose your own
> >> patch/review to add kdb_backtrace(), log(), etc.. to get other devs
> >> point of view.
> >>
> >>  I don't remember who said: "Never ever optimize error cases"...
> > 
> > This is not optimeze error cases, this is error recovery and
> > diagnostic of error cases in other subsystems.
> 
>  Sure, I guess this quote is more geared toward:  "Always spend 50x more
> time on improving the main path than the error path".
> 
> > Currently FreeBSD internals too complex for just always trust on
> > correct of other subsystem or do panic on any incosystency.
> > 
> > INVARIANTS too expensive now (20Gbit drops to 8Gbits).
> 
>  I do agree.  I am not expert enough to see all the side effects of
> calling kdb_backtrace() from the TCP stack, might be way too slow,
> tricky in interruption context, etc.  You can see that  kdb_backtrace()

I have thought about this. The example is taken from netgraph, which is a
similar case (interrupt context, etc.). The occurrence is too rare
(once per day, maybe once per two hours) to cause any overhead.
OK, I see your point: your experience doesn't allow you to add this code,
and it needs a separate review and commit. Right, np.

> is rarely called in the kernel source.  That's why it is better if you
> propose a review on adding this line to get comments from other devs on
> just this question.
> 
> > PS: I am applay patch. Wait till monday.
> > 
> > Thanks very match for this hard work!
> 
>  No problem, thanks for your time.  But it is not over yet:  We have to
> wait for final test.

Currently the system doesn't use Chelsio TOE; after Monday I will update
the system with Chelsio TOE. With Chelsio I see this occurrence very
rarely, once in a few months.

Re: 11.0 stuck on high network load

2016-10-14 Thread Slawa Olhovchenkov

On Thu, Oct 13, 2016 at 06:14:29PM +0200, Julien Charbon wrote:

> On 10/13/16 5:17 PM, Slawa Olhovchenkov wrote:
> > On Thu, Oct 13, 2016 at 05:06:00PM +0200, Julien Charbon wrote:
> > 
> >>>> will give you that trace in the core, and without INVARIANT then it is
> >>>> better to use dtrace:
> >>>>
> >>>> $ cat tcp-twstart-dropped.d
> >>>> fbt::tcp_twstart:entry
> >>>> /args[0]->t_inpcb->inp_flags & 0x0400/
> >>>> {
> >>>>   stack();
> >>>>   printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> >>>> }
> >>>
> >>> Same code may be insert there too, IMHO.
> >>
> >>  Hmm, I don't think so:
> >>
> >>  - If you have INVARIANT, the kernel will panic in tcp_twstart() or
> >> tcp_detach() and you will have everything you need to debug.
> >>  - If you don't, dtrace is the right tool to use in all cases anyway.
> > 
> > dtrace don't executed in may case w/ diagnostic "dtrace: processing
> > aborted: Abort due to systemic unresponsiveness". This is for
> > tcp_close. May be tcp_twstart will be more successuful, may be not.
> 
>  It does and will.
> 
> > Also, using dtrace too complex in production (need complex startup
> > under screen and capture output) and for many peoples.
> > kdb_backtrace() have too less administrative overhead.
> 
>  I still think it is overkill.  The main goal of this change is to fix a
> quite tricky and old TCP stack locking issue.  Let's try to do that
> first, it is complex enough by itself.
> 
>  Once the fix is validated and pushed, feel free to propose your own
> patch/review to add kdb_backtrace(), log(), etc.. to get other devs
> point of view.
> 
>  I don't remember who said: "Never ever optimize error cases"...

This is not optimizing error cases; this is error recovery and
diagnostics for error cases in other subsystems.

Currently FreeBSD internals are too complex to always trust the
correctness of other subsystems, or to panic on any inconsistency.

INVARIANTS is too expensive now (20 Gbit/s drops to 8 Gbit/s).

PS: I am applying the patch. Wait till Monday.

Thanks very much for this hard work!

Re: 11.0 stuck on high network load

2016-10-13 Thread Slawa Olhovchenkov

On Thu, Oct 13, 2016 at 05:06:00PM +0200, Julien Charbon wrote:

> >> will give you that trace in the core, and without INVARIANT then it is
> >> better to use dtrace:
> >>
> >> $ cat tcp-twstart-dropped.d
> >> fbt::tcp_twstart:entry
> >> /args[0]->t_inpcb->inp_flags & 0x0400/
> >> {
> >>   stack();
> >>   printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> >> }
> > 
> > Same code may be insert there too, IMHO.
> 
>  Hmm, I don't think so:
> 
>  - If you have INVARIANT, the kernel will panic in tcp_twstart() or
> tcp_detach() and you will have everything you need to debug.
>  - If you don't, dtrace is the right tool to use in all cases anyway.

In many cases dtrace does not run at all and fails with the diagnostic
"dtrace: processing aborted: Abort due to systemic unresponsiveness".
That was for tcp_close. Maybe tcp_twstart will be more successful,
maybe not.
Also, using dtrace in production is too complex (it needs a complex
startup under screen and output capture), and too complex for many people.
kdb_backtrace() has much less administrative overhead.

Re: 11.0 stuck on high network load

2016-10-13 Thread Slawa Olhovchenkov

On Thu, Oct 13, 2016 at 01:56:21PM +0200, Julien Charbon wrote:

> >>  Something like:
> > 
> > Yes, thanks!
> 
>  Proposed changes added in the review:
> 
> https://reviews.freebsd.org/D8211
> 
>  tell me when you have three days without issue with this change.
> 
> >> tcp_detach() {
> >>
> >>   ...
> >>   if (inp->inp_flags & INP_TIMEWAIT) {
> >>
> >> ...
> >> if (inp->inp_flags & INP_DROPPED) {
> >>
> >>   in_pcbdetach(inp);
> >>   if (__predict_true(tp == NULL)) {
> >>   in_pcbfree(inp);
> >>   } else {
> >> #ifdef INVARIANTS
> >> panic("tcp_detach: tp != NULL, That's not good because 'blah'\n");
> >> #else
> >> log(LOG_ERR, "tcp_detach: tp != NULL, That's not good because
> >> 'blah'\n");
> > 
> > May be some more info in log can help to detect root cause of issuse?
> > I am don't know what info, may be flags or number of references?
> 
>  For this kind of issue, the useful part is the stacktrace.  INVARIANT

Like this?

#ifdef KDB
kdb_backtrace();
#endif

as found in sys/netgraph/ng_base.c

> will give you that trace in the core, and without INVARIANT then it is
> better to use dtrace:
> 
> $ cat tcp-twstart-dropped.d
> fbt::tcp_twstart:entry
> /args[0]->t_inpcb->inp_flags & 0x0400/
> {
>   stack();
>   printf("INP_DROPPED in tcp_twstart: %x", args[0]->t_inpcb->inp_flags);
> }

The same code could be inserted there too, IMHO.

> --
> Julien
> 
> 




Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 05:17:35PM +0200, Julien Charbon wrote:

>   I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
>  TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
>  separate/side TCP stack that is used only with TCP_OFFLOAD option.
> 
>   This TOE TCP stack actually has its own set of detach()/input()
>  functions and seems to check INP_DROPPED flag properly.  I guess @np
>  check fixes in socket TCP stack and decides which one can also impact
>  the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, some
>  are only in TOE TCP stack.
> >>>
> >>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE
> >>> TCP stack and impact this to
> >>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.
> >>
> >>  I see, I expect no problem on this side as tcp_timer_2msl() checks the
> >> INP_TIMEWAIT flag and do not call tcp_close() if set.
> >
> > I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl()
> > don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK()
> > and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set
> > INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected
> > INP_TIMEWAIT in tcp_usr_detach().
> 
>   Sure, basically the same bug that in classic TCP stack.  If you think
>  it can happen, send an email describing that to np@ and he will check
>  and fix that.  He is a TOE TCP stack expert and I am not.  In all cases,
>  if this issue is possible in TOE TCP stack context, the patch will be
>  straightforward:  If the INP_DROPPED flag is set do not call tcp_twstart().

I am emailing np@.

>   The current patch focuses only on the classic TCP stack.
> >>>
> >>> May be current workaround (with logging) in tcp_usr_detach() is good
> >>> solutuion for preventing system lockout by similar bugs?
> >>
> >>  Good question, the quick workaround in tcp_usr_detach() does not handle
> >> all the cases.  If it reduces the number of crashes you can still find
> >> scenarios where it can have unexpected side effect.
> > 
> > This is best then guaranted lockout.
> > 
> >>  Long term solution is to enforce:  If the inp has the INP_DROPPED flag
> >> just stop processing it and return.  If you grep the INP_DROPPED flag in
> >> kernel sources, you can see that this test is already done in almost all
> >> tcp_*() processing functions but tcp_input().
> >>
> >>  I would say that even without this issue tcp_input() should check
> >> INP_DROPPED flags after INP_WLOCK anyway.  Same for the TOE TCP stack,
> >> you are simply not supposed to process a inp with INP_DROPPED flag.
> > 
> > Absolutly acceptant!
> > May point is: more check and good handling of check result is best for
> > stability.
> > 
> > I.e. AND check INP_DROPPED in tcp_input AND workaroud INP_TIMEWAIT in
> > tcp_usr_detach (with logging) and check of some posible cases in XXX TOE.
> > 
> > Current TCP stack too complex and have many corner cases. This is need
> > additional guards where posible (not caused kernel panic).
> 
>  I see your point:  Even if this issue is caught by this assert:
> 
> KASSERT(tp == NULL, ("tcp_detach: INP_TIMEWAIT && "
> "INP_DROPPED && tp != NULL"));
> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_usrreq.c#L213
> 
>  you might not have INVARIANT option, then you will get a lockout quite
> difficult to debug.  Thus what we can do is:
> 
>  - If INVARIANT is set:  kernel panic to get all the details in the core.
>  - If INVARIANT is not set:  Log this error with an explicit kernel
> log(LOG_ERR) describing the issue, and then use the workaround to avoid
> the double-free to let the system to good enough state.
> 
>  Something like:

Yes, thanks!

> tcp_detach() {
> 
>   ...
>   if (inp->inp_flags & INP_TIMEWAIT) {
> 
> ...
> if (inp->inp_flags & INP_DROPPED) {
> 
>   in_pcbdetach(inp);
>   if (__predict_true(tp == NULL)) {
>   in_pcbfree(inp);
>   } else {
> #ifdef INVARIANTS
> panic("tcp_detach: tp != NULL, That's not good because 'blah'\n");
> #else
> log(LOG_ERR, "tcp_detach: tp != NULL, That's not good because
> 'blah'\n");

Maybe some more info in the log could help to find the root cause of the
issue? I don't know exactly what info, maybe the flags or the number of
references?

> #endif
> INP_WUNLOCK(inp);
>   }
> }
>   }
> 
> ...
> 
> }
> 
> --
> Julien
> 
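
To make the "panic with INVARIANTS, otherwise log and continue" idea above
concrete, here is a minimal user-space sketch. It is not the kernel code:
the sk_ names, the SK_INVARIANTS macro and the message layout are all
hypothetical and only show how the flags and a reference count could be
carried in the message, as asked above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build the diagnostic text once so that the
 * "panic" and the "log" variants report the same details.
 * Returns what snprintf() returns.
 */
int
sk_format_detach_error(char *buf, size_t len, unsigned flags,
    unsigned refcount)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
        "tcp_detach: tp != NULL, inp_flags=0x%x refcount=%u",
        flags, refcount);
}

/*
 * Hypothetical reporter: SK_INVARIANTS stands in for the kernel's
 * INVARIANTS option, abort() for panic(), and stderr for log(LOG_ERR).
 */
void
sk_report_detach_error(unsigned flags, unsigned refcount)
{
    char msg[128];

    sk_format_detach_error(msg, sizeof(msg), flags, refcount);
#ifdef SK_INVARIANTS
    fprintf(stderr, "panic: %s\n", msg);
    abort();
#else
    /* Without invariants: report and let the caller apply the workaround. */
    fprintf(stderr, "%s\n", msg);
#endif
}
```

Building with -DSK_INVARIANTS selects the abort() variant; without it the
message is only printed and the caller continues.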




Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 02:35:11PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 10/12/16 2:13 PM, Slawa Olhovchenkov wrote:
> > On Wed, Oct 12, 2016 at 02:06:59PM +0200, Julien Charbon wrote:
> >>>>>>> sofree() call tcp_usr_detach() and in tcp_usr_detach() we have
> >>>>>>> unexpected INP_TIMEWAIT.
> >>>>>>
> >>>>>>  I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
> >>>>>> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
> >>>>>> separate/side TCP stack that is used only with TCP_OFFLOAD option.
> >>>>>>
> >>>>>>  This TOE TCP stack actually has its own set of detach()/input()
> >>>>>> functions and seems to check INP_DROPPED flag properly.  I guess @np
> >>>>>> check fixes in socket TCP stack and decides which one can also impact
> >>>>>> the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, 
> >>>>>> some
> >>>>>> are only in TOE TCP stack.
> >>>>>
> >>>>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE
> >>>>> TCP stack and impact this to
> >>>>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.
> >>>>
> >>>>  I see, I expect no problem on this side as tcp_timer_2msl() checks the
> >>>> INP_TIMEWAIT flag and do not call tcp_close() if set.
> >>>
> >>> I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl()
> >>> don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK()
> >>> and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set
> >>> INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected
> >>> INP_TIMEWAIT in tcp_usr_detach().
> >>
> >>  Sure, basically the same bug that in classic TCP stack.  If you think
> >> it can happen, send an email describing that to np@ and he will check
> >> and fix that.  He is a TOE TCP stack expert and I am not.  In all cases,
> >> if this issue is possible in TOE TCP stack context, the patch will be
> >> straightforward:  If the INP_DROPPED flag is set do not call tcp_twstart().
> >>
> >>  The current patch focuses only on the classic TCP stack.
> > 
> > May be current workaround (with logging) in tcp_usr_detach() is good
> > solutuion for preventing system lockout by similar bugs?
> 
>  Good question, the quick workaround in tcp_usr_detach() does not handle
> all the cases.  If it reduces the number of crashes you can still find
> scenarios where it can have unexpected side effect.

This is better than a guaranteed lockout.

>  Long term solution is to enforce:  If the inp has the INP_DROPPED flag
> just stop processing it and return.  If you grep the INP_DROPPED flag in
> kernel sources, you can see that this test is already done in almost all
> tcp_*() processing functions but tcp_input().
> 
>  I would say that even without this issue tcp_input() should check
> INP_DROPPED flags after INP_WLOCK anyway.  Same for the TOE TCP stack,
> you are simply not supposed to process a inp with INP_DROPPED flag.

Absolutely agreed!
My point is: more checks, and good handling of the check results, are
best for stability.

I.e. check INP_DROPPED in tcp_input AND work around INP_TIMEWAIT in
tcp_usr_detach (with logging) AND check some of the possible cases in XXX TOE.

The current TCP stack is too complex and has many corner cases. It needs
additional guards where possible (ones that do not cause a kernel panic).

Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 02:06:59PM +0200, Julien Charbon wrote:

> > sofree() call tcp_usr_detach() and in tcp_usr_detach() we have
> > unexpected INP_TIMEWAIT.
> 
>   I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
>  TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
>  separate/side TCP stack that is used only with TCP_OFFLOAD option.
> 
>   This TOE TCP stack actually has its own set of detach()/input()
>  functions and seems to check INP_DROPPED flag properly.  I guess @np
>  check fixes in socket TCP stack and decides which one can also impact
>  the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, some
>  are only in TOE TCP stack.
> >>>
> >>> I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE
> >>> TCP stack and impact this to
> >>> tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.
> >>
> >>  I see, I expect no problem on this side as tcp_timer_2msl() checks the
> >> INP_TIMEWAIT flag and do not call tcp_close() if set.
> > 
> > I am about case when at time of first INP_WUNLOCK() tcp_timer_2msl()
> > don't see INP_TIMEWAIT, call tcp_close(), tcp_close() do INP_WUNLOCK()
> > and now Chelsio TOE take INP_WLOCK, do tcp_twstart() and set
> > INP_TIMEWAIT. After this tcp_timer_2msl resume and have unexpected
> > INP_TIMEWAIT in tcp_usr_detach().
> 
>  Sure, basically the same bug that in classic TCP stack.  If you think
> it can happen, send an email describing that to np@ and he will check
> and fix that.  He is a TOE TCP stack expert and I am not.  In all cases,
> if this issue is possible in TOE TCP stack context, the patch will be
> straightforward:  If the INP_DROPPED flag is set do not call tcp_twstart().
> 
>  The current patch focuses only on the classic TCP stack.

Maybe the current workaround (with logging) in tcp_usr_detach() is a good
solution for preventing system lockout by similar bugs?



Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 11:42:38AM +0200, Julien Charbon wrote:

> On 10/12/16 11:29 AM, Slawa Olhovchenkov wrote:
> > On Wed, Oct 12, 2016 at 11:19:48AM +0200, Julien Charbon wrote:
> > 
> >>> if INP_WLOCK is like spinlock -- this is dead lock.
> >>> if INP_WLOCK is like mutex -- thread1 resheduled.
> >>
> >>  Thanks, I understand you question now.  No an interrupt cannot bypass a
> >> lock:  Here INP_WLOCK is like mutex -- thread1 resheduled.
> > 
> > Thanks, nice.
> > 
> >>>>> As I remeber race created by call tcp_twstart() at time of end
> >>>>> tcp_close(), at path sofree()-tcp_usr_detach() and unexpected
> >>>>> INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in 
> >>>>> tcp_twstart()
> >>>>
> >>>>  Exactly, thus the current fix is:  If you already have the INP_DROPPED
> >>>> flag set you are not allowed to call tcp_twstart(), actually it is a
> >>>> good candidate for a new INVARIANT.  Let me add that.
> >>>>
> >>>>> After check source code I am found invocation of tcp_twstart() in
> >>>>> sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c,
> >>>>> sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c.
> >>>>>
> >>>>> Invocation from sys/netinet/tcp_stacks/fastpath.c and
> >>>>> sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now
> >>>>> will be OK.
> >>>>>
> >>>>> Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and
> >>>>> sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed
> >>>>> INP_WLOCK. Is this OK?
> >>>>>
> >>>>> Can be thread A wants do_peer_close() directed from chelsio IRQ
> >>>>> handler, bypass tcp_input()?
> >>>>
> >>>>  If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and
> >>>> t4_cpl_io.c before calling tcp_twstart().
> >>>
> >>> Yes, and you remeber: sys/netinet/tcp_subr.c
> >>>
> >>>   1535  struct tcpcb *
> >>>   1536  tcp_close(struct tcpcb *tp)
> >>>   1537  {
> >>> ...
> >>>   1569  INP_WUNLOCK(inp);
> >>>   1570  ACCEPT_LOCK();
> >>>   1571  SOCK_LOCK(so);
> >>>   1572  so->so_state &= ~SS_PROTOREF;
> >>>   1573  sofree(so);
> >>>   1574  return (NULL);
> >>>
> >>> sofree() call tcp_usr_detach() and in tcp_usr_detach() we have
> >>> unexpected INP_TIMEWAIT.
> >>
> >>  I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
> >> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
> >> separate/side TCP stack that is used only with TCP_OFFLOAD option.
> >>
> >>  This TOE TCP stack actually has its own set of detach()/input()
> >> functions and seems to check INP_DROPPED flag properly.  I guess @np
> >> check fixes in socket TCP stack and decides which one can also impact
> >> the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, some
> >> are only in TOE TCP stack.
> > 
> > I am fear about other direction -- setting INP_TIMEWAIT in Chelsio TOE
> > TCP stack and impact this to
> > tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.
> 
>  I see, I expect no problem on this side as tcp_timer_2msl() checks the
> INP_TIMEWAIT flag and do not call tcp_close() if set.

I mean the case where, at the time of the first INP_WUNLOCK(),
tcp_timer_2msl() does not see INP_TIMEWAIT and calls tcp_close();
tcp_close() does INP_WUNLOCK(), and now Chelsio TOE takes INP_WLOCK,
does tcp_twstart() and sets INP_TIMEWAIT. After this tcp_timer_2msl()
resumes and finds an unexpected INP_TIMEWAIT in tcp_usr_detach().


Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 11:19:48AM +0200, Julien Charbon wrote:

> > if INP_WLOCK is like spinlock -- this is dead lock.
> > if INP_WLOCK is like mutex -- thread1 resheduled.
> 
>  Thanks, I understand you question now.  No an interrupt cannot bypass a
> lock:  Here INP_WLOCK is like mutex -- thread1 resheduled.

Thanks, nice.

> >>> As I remeber race created by call tcp_twstart() at time of end
> >>> tcp_close(), at path sofree()-tcp_usr_detach() and unexpected
> >>> INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in 
> >>> tcp_twstart()
> >>
> >>  Exactly, thus the current fix is:  If you already have the INP_DROPPED
> >> flag set you are not allowed to call tcp_twstart(), actually it is a
> >> good candidate for a new INVARIANT.  Let me add that.
> >>
> >>> After check source code I am found invocation of tcp_twstart() in
> >>> sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c,
> >>> sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c.
> >>>
> >>> Invocation from sys/netinet/tcp_stacks/fastpath.c and
> >>> sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now
> >>> will be OK.
> >>>
> >>> Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and
> >>> sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed
> >>> INP_WLOCK. Is this OK?
> >>>
> >>> Can be thread A wants do_peer_close() directed from chelsio IRQ
> >>> handler, bypass tcp_input()?
> >>
> >>  If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and
> >> t4_cpl_io.c before calling tcp_twstart().
> > 
> > Yes, and you remeber: sys/netinet/tcp_subr.c
> > 
> >   1535  struct tcpcb *
> >   1536  tcp_close(struct tcpcb *tp)
> >   1537  {
> > ...
> >   1569  INP_WUNLOCK(inp);
> >   1570  ACCEPT_LOCK();
> >   1571  SOCK_LOCK(so);
> >   1572  so->so_state &= ~SS_PROTOREF;
> >   1573  sofree(so);
> >   1574  return (NULL);
> > 
> > sofree() call tcp_usr_detach() and in tcp_usr_detach() we have
> > unexpected INP_TIMEWAIT.
> 
>  I see, thus just for the context:  The TCP stack in sys/dev/cxgb* is a
> TOE (TCP Offload Engine?) TCP stack for Chelsio NICs, it is a
> separate/side TCP stack that is used only with TCP_OFFLOAD option.
> 
>  This TOE TCP stack actually has its own set of detach()/input()
> functions and seems to check INP_DROPPED flag properly.  I guess @np
> check fixes in socket TCP stack and decides which one can also impact
> the Chelsio TOE TCP stack.  Some bugs are only in socket TCP stack, some
> are only in TOE TCP stack.

I am afraid of the other direction: the Chelsio TOE TCP stack setting
INP_TIMEWAIT, and the impact of this on the
tcp_timer_2msl()/tcp_close()/sofree()/tcp_usr_detach() path.

Re: 11.0 stuck on high network load

2016-10-12 Thread Slawa Olhovchenkov

On Wed, Oct 12, 2016 at 10:18:18AM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 10/11/16 2:11 PM, Slawa Olhovchenkov wrote:
> > On Tue, Oct 11, 2016 at 09:20:17AM +0200, Julien Charbon wrote:
> >>  Then threads are competing for the INP_WLOCK lock.  For the example,
> >> let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and
> >> racing for this INP_WLOCK:
> >>
> >> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964
> >>
> >>  And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and
> >> racing for this INP_WLOCK:
> >>
> >> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323
> >>
> >>  That leads to two cases:
> >>
> >>  o Thread A wins the race:
> >>
> >>   Thread A will continue tcp_input() as usal and INP_DROPPED flags is
> >> not set and inp is still in TCP hash table.
> >>   Thread B is waiting on thread A to release INP_WLOCK after finishing
> >> tcp_input() processing, and thread B will continue
> >> tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing.
> >>
> >>  o Thread B wins the race:
> >>
> >>   Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp
> >> INP_DROPPED is set and inp being removed from TCP hash table.
> >>   In parallel, thread A has found the inp in TCP hash before is was
> >> removed, and waiting on the found inp INP_WLOCK lock.
> >>   Once thread B has released the INP_WLOCK lock, thread A gets this lock
> >> and sees the INP_DROPPED flag and do "goto findpcb" but here because the
> >> inp is not more in TCP hash table and it will not be find again by
> >> in_pcblookup_mbuf().
> >>
> >>  Hopefully I am clear enough here.
> > 
> > Thanks, very clear.
> > Small qeustion: when both thread run on same CPU core, INP_WLOCK will
> > be re-schedule?
> 
>  Hmm, a thread can re-scheduled but not a lock. Thus no sure I
> understand your question here.  :)

I don't know how INP_WLOCK works in this case (everything on the same CPU):

thread1: INP_WLOCK
-interrupt--
thread2: INP_WLOCK

If INP_WLOCK is like a spinlock, this is a deadlock.
If INP_WLOCK is like a mutex, thread1 is rescheduled.

> > As I remeber race created by call tcp_twstart() at time of end
> > tcp_close(), at path sofree()-tcp_usr_detach() and unexpected
> > INP_TIMEWAIT state in the tcp_usr_detach(). INP_TIMEWAIT set in 
> > tcp_twstart()
> 
>  Exactly, thus the current fix is:  If you already have the INP_DROPPED
> flag set you are not allowed to call tcp_twstart(), actually it is a
> good candidate for a new INVARIANT.  Let me add that.
> 
> > After check source code I am found invocation of tcp_twstart() in
> > sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c,
> > sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c.
> > 
> > Invocation from sys/netinet/tcp_stacks/fastpath.c and
> > sys/netinet/tcp_input.c guarded by INP_WLOCK in tcp_input(), and now
> > will be OK.
> > 
> > Invocation from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and
> > sys/dev/cxgbe/tom/t4_cpl_io.c is not clear to me, I am see independed
> > INP_WLOCK. Is this OK?
> > 
> > Can be thread A wants do_peer_close() directed from chelsio IRQ
> > handler, bypass tcp_input()?
> 
>  If you look carefully INP_WLOCK is used in cxgb_cpl_io.c and
> t4_cpl_io.c before calling tcp_twstart().

Yes, and remember sys/netinet/tcp_subr.c:

  1535  struct tcpcb *
  1536  tcp_close(struct tcpcb *tp)
  1537  {
...
  1569  INP_WUNLOCK(inp);
  1570  ACCEPT_LOCK();
  1571  SOCK_LOCK(so);
  1572  so->so_state &= ~SS_PROTOREF;
  1573  sofree(so);
  1574  return (NULL);

sofree() calls tcp_usr_detach(), and in tcp_usr_detach() we have an
unexpected INP_TIMEWAIT.
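
The interleaving described in this message can be replayed step by step in
a small single-threaded user-space model. This is only a sketch: the sk_
names are hypothetical, the flag bits are placeholders rather than the
kernel values, and the "lock" is a boolean that merely checks the order of
lock and unlock calls. It compares the outcome with and without the
"do not call tcp_twstart() if INP_DROPPED is set" rule from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits for this sketch only, not the kernel's definitions. */
#define SK_INP_TIMEWAIT 0x1
#define SK_INP_DROPPED  0x2

struct sk_inp {
    int  flags;
    bool locked;
};

void sk_wlock(struct sk_inp *inp)   { assert(!inp->locked); inp->locked = true; }
void sk_wunlock(struct sk_inp *inp) { assert(inp->locked);  inp->locked = false; }

/* Step 1: close path marks the inp dropped, then unlocks before sofree(). */
void
sk_close_first_half(struct sk_inp *inp)
{
    sk_wlock(inp);
    inp->flags |= SK_INP_DROPPED;
    sk_wunlock(inp);        /* the window opens here */
}

/* Step 2: a second path takes the lock and wants to enter TIME_WAIT. */
void
sk_twstart_path(struct sk_inp *inp, bool check_dropped)
{
    sk_wlock(inp);
    if (check_dropped && (inp->flags & SK_INP_DROPPED)) {
        sk_wunlock(inp);    /* guarded variant: leave a dropped inp alone */
        return;
    }
    inp->flags |= SK_INP_TIMEWAIT;
    sk_wunlock(inp);
}

/* Step 3: what the detach path finds when the close path resumes. */
bool
sk_run_interleaving(bool check_dropped)
{
    struct sk_inp inp = { 0, false };

    sk_close_first_half(&inp);
    sk_twstart_path(&inp, check_dropped);
    return (inp.flags & SK_INP_DROPPED) && (inp.flags & SK_INP_TIMEWAIT);
}
```

sk_run_interleaving(false) reports the unexpected "dropped and in
TIME_WAIT" combination, while sk_run_interleaving(true) does not.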

Re: 11.0 stuck on high network load

2016-10-11 Thread Slawa Olhovchenkov

On Tue, Oct 11, 2016 at 09:20:17AM +0200, Julien Charbon wrote:

>  Then threads are competing for the INP_WLOCK lock.  For the example,
> let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and
> racing for this INP_WLOCK:
> 
> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964
> 
>  And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and
> racing for this INP_WLOCK:
> 
> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323
> 
>  That leads to two cases:
> 
>  o Thread A wins the race:
> 
>   Thread A will continue tcp_input() as usal and INP_DROPPED flags is
> not set and inp is still in TCP hash table.
>   Thread B is waiting on thread A to release INP_WLOCK after finishing
> tcp_input() processing, and thread B will continue
> tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing.
> 
>  o Thread B wins the race:
> 
>   Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp
> INP_DROPPED is set and inp being removed from TCP hash table.
>   In parallel, thread A has found the inp in TCP hash before is was
> removed, and waiting on the found inp INP_WLOCK lock.
>   Once thread B has released the INP_WLOCK lock, thread A gets this lock
> and sees the INP_DROPPED flag and do "goto findpcb" but here because the
> inp is not more in TCP hash table and it will not be find again by
> in_pcblookup_mbuf().
> 
>  Hopefully I am clear enough here.

Thanks, very clear.
Small question: when both threads run on the same CPU core, will
INP_WLOCK be rescheduled?

As I remember, the race is created by a call to tcp_twstart() at the end
of tcp_close(), on the sofree()/tcp_usr_detach() path, which gives an
unexpected INP_TIMEWAIT state in tcp_usr_detach(). INP_TIMEWAIT is set in
tcp_twstart().

After checking the source code I found invocations of tcp_twstart() in
sys/netinet/tcp_stacks/fastpath.c, sys/netinet/tcp_input.c,
sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c, sys/dev/cxgbe/tom/t4_cpl_io.c.

The invocations from sys/netinet/tcp_stacks/fastpath.c and
sys/netinet/tcp_input.c are guarded by INP_WLOCK in tcp_input(), and
will now be OK.

The invocations from sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c and
sys/dev/cxgbe/tom/t4_cpl_io.c are not clear to me; I see an independent
INP_WLOCK. Is this OK?

Can thread A call do_peer_close() directly from the Chelsio IRQ
handler, bypassing tcp_input()?

Re: 11.0 stuck on high network load

2016-10-10 Thread Slawa Olhovchenkov

On Mon, Oct 10, 2016 at 05:44:21PM +0200, Julien Charbon wrote:

> >> can check the current other usages of goto findpcb in tcp_input().  The
> >> rational here being:
> >>
> >>  - Behavior before the patch:  If the inp we found was deleted then goto
> >> findpcb.
> >>  - Behavior after the patch:  If the inp we found was deleted or dropped
> >> then goto findpcb.
> >>
> >>  I just prefer having the same behavior applied everywhere:  If
> >> tcp_input() loses the inp lock race and the inp was deleted or dropped
> >> then retry to find a new inpcb to deliver to.
> >>
> >>  But you are right dropping the packet here will also fix the issue.
> >>
> >>  Then the review process becomes quite helpful because people can argue:
> >>  Dropping here is better because "blah", or goto findpcb is better
> >> because "bluh", etc.  And at the review end you have a nice final patch.
> >>
> >> https://reviews.freebsd.org/D8211
> >
> > I am not sure, I am see to
> >
> > sys/netinet/in_pcb.h:#define INP_DROPPED 0x0400 /* protocol drop flag */
> >
> > and think this is a flag 'all packets must be droped'
> 
>  Hm, I believe this flag means "this inp has been dropped by the TCP
> stack, so don't use it anymore".  Actually this flag is better described
> in the function that sets it:
> 
> "(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
> future packet delivery or event notification when a socket remains open
> but TCP has closed."
> 
> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320
> 
> /*
>  * in_pcbdrop() removes an inpcb from hashed lists, releasing its
> address and
>  * port reservation, and preventing it from being returned by inpcb lookups.
>  *
>  * It is used by TCP to mark an inpcb as unused and avoid future packet
>  * delivery or event notification when a socket remains open but TCP has
>  * closed.  This might occur as a result of a shutdown()-initiated TCP close
>  * or a RST on the wire, and allows the port binding to be reused while
> still
>  * maintaining the invariant that so_pcb always points to a valid inpcb
> until
>  * in_pcbdetach().
>  *
>  */
> void
> in_pcbdrop(struct inpcb *inp)
> {
>   inp->inp_flags |= INP_DROPPED;
>   ...
> 
>  The classical example where "goto findpcb" is useful:  You receive a
> new connection request with a TCP SYN packet and this packet is unlucky
> and reached a inp being dropped:
> 
>  - with "goto findpcb" approach, the next lookup will most likely find
> the LISTEN inp and start the TCP hand-shake as usual
>  - with "drop the packet" approach, the TCP client will need to
> re-transmit a TCP SYN packet
> 
>  It is not because a packet was unlucky once that it deserves to be
> dropped. :)

Thanks for the explanation, very helpful.
In this situation (a TCP SYN with the same 4-tuple as an existing socket)
allocating a new PCB is best. But for this we must destroy the current
PCB. I think INP_WUNLOCK(inp) doesn't destroy it, so in_pcblookup_mbuf
would find it again (I think in_pcblookup_mbuf found this PCB on the
first pass).
I assume that in the classical example in_pcbrele_wlocked(inp) frees and
destroys the current PCB so that in_pcblookup_mbuf can allocate a new
one.
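
For reference, the quoted in_pcbdrop() comment says that dropping also
removes the inpcb from the hashed lists. Under that assumption, the retry
can be modelled in user space as below. This is only a sketch: the sk_
names and the two-slot "hash table" are hypothetical, and the flag bit is
a placeholder, not the kernel value.

```c
#include <stddef.h>

#define SK_INP_DROPPED 0x1  /* placeholder bit, not the kernel's value */
#define SK_NSLOTS      2    /* slot 0: exact 4-tuple match, slot 1: listener */

struct sk_inp {
    int flags;
};

static struct sk_inp *sk_hash[SK_NSLOTS];

void
sk_hash_insert(int slot, struct sk_inp *inp)
{
    sk_hash[slot] = inp;
}

/* Models the drop: mark the inp dropped AND remove it from the table. */
void
sk_pcbdrop(struct sk_inp *inp)
{
    inp->flags |= SK_INP_DROPPED;
    for (int i = 0; i < SK_NSLOTS; i++) {
        if (sk_hash[i] == inp)
            sk_hash[i] = NULL;
    }
}

/* Models the lookup: prefer the exact match, fall back to the listener. */
struct sk_inp *
sk_lookup(void)
{
    for (int i = 0; i < SK_NSLOTS; i++) {
        if (sk_hash[i] != NULL)
            return sk_hash[i];
    }
    return NULL;
}

/*
 * Models the "goto findpcb" retry: `found` was looked up before another
 * thread dropped it. A dropped inp is no longer in the table, so each
 * retry returns either NULL or an inp that is not dropped.
 */
struct sk_inp *
sk_deliver(struct sk_inp *found, int *retries)
{
    struct sk_inp *inp = found;

    *retries = 0;
    while (inp != NULL && (inp->flags & SK_INP_DROPPED)) {
        (*retries)++;
        inp = sk_lookup();
    }
    return inp;
}
```

In this model a SYN that first hit the dropped connection is delivered to
the listener after one retry, and the loop cannot spin on the same inp
because the drop already removed it from the table.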

Re: 11.0 stuck on high network load

2016-10-10 Thread Slawa Olhovchenkov

On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
> > On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
> >> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
> >>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
> >>>
> >>>> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
> >>>> process continues and calls INP_WUNLOCK() here:
> >>>>
> >>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
> >>>
> >>> Look also to sys/netinet/tcp_timewait.c:488
> >>>
> >>> And check other locks from r160549
> >>
> >>  You are right, and here is a fix proposal for this issue:
> >>
> >> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
> >> having been dropped
> >> https://reviews.freebsd.org/D8211
> >>
> >>  It basically enforces in_pcbdrop() logic in tcp_input():  An INP_DROPPED
> >> inpcb should never be processed further.
> >>
> >>  Slawa, as you are the only one able to reproduce this issue currently,
> >> could you test this patch?  (And remove the temporary patch I provided
> >> to you before.)
> >>
> >>  I will wait for your tests results before pushing further.
> >>
> >>  Thanks!
> >>
> >> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
> >> index c72f01f..37f27e0 100644
> >> --- a/sys/netinet/tcp_input.c
> >> +++ b/sys/netinet/tcp_input.c
> >> @@ -921,6 +921,16 @@ findpcb:
> >> goto dropwithreset;
> >> }
> >> INP_WLOCK_ASSERT(inp);
> >> +   /*
> >> +* While waiting for inp lock during the lookup, another thread
> >> +* can have droppedt  the inpcb, in which case we need to loop back
> >> +* and try to find a new inpcb to deliver to.
> >> +*/
> >> +   if (inp->inp_flags & INP_DROPPED) {
> >> +   INP_WUNLOCK(inp);
> >> +   inp = NULL;
> >> +   goto findpcb;
> > 
> > Are you sure about this goto?
> > Can this cause an infinite loop by finding the same inpcb?
> > Maybe dropping the packet is more correct?
> 
>  Good question:  Infinite loop is not possible here, as the next TCP
> hash lookup will return NULL or a fresh new and not dropped inp.  You

I am not an expert in this API and don't see why: I assume the hash
lookup doesn't remove the returned entry from the hash, and I don't see
this inp being removed anywhere. Why wouldn't the hash lookup return the
same inp?

(Assume this input path interrupts the callout code on the same CPU core.)

> can check the current other usages of goto findpcb in tcp_input().  The
> rationale here being:
> 
>  - Behavior before the patch:  If the inp we found was deleted then goto
> findpcb.
>  - Behavior after the patch:  If the inp we found was deleted or dropped
> then goto findpcb.
> 
>  I just prefer having the same behavior applied everywhere:  If
> tcp_input() loses the inp lock race and the inp was deleted or dropped
> then retry to find a new inpcb to deliver to.
> 
>  But you are right dropping the packet here will also fix the issue.
> 
>  Then the review process becomes quite helpful because people can argue:
>  Dropping here is better because "blah", or goto findpcb is better
> because "bluh", etc.  And at the review end you have a nice final patch.
> 
> https://reviews.freebsd.org/D8211

I am not sure. I am looking at

sys/netinet/in_pcb.h:#define INP_DROPPED 0x0400 /* protocol drop flag */

and think this flag means 'all packets must be dropped'.


Re: 11.0 stuck on high network load

2016-10-10 Thread Slawa Olhovchenkov

On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:

> 
>  Hi,
> 
> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
> > On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
> > 
> >> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
> >> process continues and calls INP_WUNLOCK() here:
> >>
> >> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
> > 
> > Look also to sys/netinet/tcp_timewait.c:488
> > 
> > And check other locks from r160549
> 
>  You are right, and here is a fix proposal for this issue:
> 
> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
> having been dropped
> https://reviews.freebsd.org/D8211
> 
>  It basically enforces in_pcbdrop() logic in tcp_input():  An INP_DROPPED
> inpcb should never be processed further.
> 
>  Slawa, as you are the only one able to reproduce this issue currently,
> could you test this patch?  (And remove the temporary patch I provided
> to you before.)
> 
>  I will wait for your tests results before pushing further.
> 
>  Thanks!
> 
> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
> index c72f01f..37f27e0 100644
> --- a/sys/netinet/tcp_input.c
> +++ b/sys/netinet/tcp_input.c
> @@ -921,6 +921,16 @@ findpcb:
> goto dropwithreset;
> }
> INP_WLOCK_ASSERT(inp);
> +   /*
> +* While waiting for inp lock during the lookup, another thread
> +* can have droppedt  the inpcb, in which case we need to loop back
> +* and try to find a new inpcb to deliver to.
> +*/
> +   if (inp->inp_flags & INP_DROPPED) {
> +   INP_WUNLOCK(inp);
> +   inp = NULL;
> +   goto findpcb;

Are you sure about this goto?
Can this cause an infinite loop by finding the same inpcb?
Maybe dropping the packet is more correct?

Re: 11.0 stuck on high network load

2016-10-10 Thread Slawa Olhovchenkov

On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:

> 
>  Hi,
> 
> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
> > On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
> > 
> >> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
> >> process continues and calls INP_WUNLOCK() here:
> >>
> >> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
> > 
> > Look also to sys/netinet/tcp_timewait.c:488
> > 
> > And check other locks from r160549
> 
>  You are right, and here is a fix proposal for this issue:
> 
> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
> having been dropped
> https://reviews.freebsd.org/D8211
> 
>  It basically enforces in_pcbdrop() logic in tcp_input():  An INP_DROPPED
> inpcb should never be processed further.
> 
>  Slawa, as you are the only one able to reproduce this issue currently,
> could you test this patch?  (And remove the temporary patch I provided
> to you before.)
> 
>  I will wait for your tests results before pushing further.

OK, I will try it tomorrow.
Thanks!

>  Thanks!
> 
> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
> index c72f01f..37f27e0 100644
> --- a/sys/netinet/tcp_input.c
> +++ b/sys/netinet/tcp_input.c
> @@ -921,6 +921,16 @@ findpcb:
> goto dropwithreset;
> }
> INP_WLOCK_ASSERT(inp);
> +   /*
> +* While waiting for inp lock during the lookup, another thread
> +* can have droppedt  the inpcb, in which case we need to loop back
> +* and try to find a new inpcb to deliver to.
> +*/
> +   if (inp->inp_flags & INP_DROPPED) {
> +   INP_WUNLOCK(inp);
> +   inp = NULL;
> +   goto findpcb;
> +   }
> if ((inp->inp_flowtype == M_HASHTYPE_NONE) &&
> (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) &&
> ((inp->inp_socket == NULL) ||
> @@ -981,6 +991,10 @@ relocked:
> if (in_pcbrele_wlocked(inp)) {
> inp = NULL;
> goto findpcb;
> +   } else if (inp->inp_flags & INP_DROPPED) {
> +   INP_WUNLOCK(inp);
> +   inp = NULL;
> +   goto findpcb;
> }
> } else
> ti_locked = TI_RLOCKED;
> @@ -1040,6 +1054,10 @@ relocked:
> if (in_pcbrele_wlocked(inp)) {
> inp = NULL;
> goto findpcb;
> +   } else if (inp->inp_flags & INP_DROPPED) {
> +   INP_WUNLOCK(inp);
> +   inp = NULL;
> +   goto findpcb;
> }
> goto relocked;
> } else
> 
> --
> Julien
> 




11.0-RELEASE and mbuf-related trace

2016-10-07 Thread Slawa Olhovchenkov

Does anybody have a comment on this?
While debugging a TCP-related freeze I collected a strange mbuf-related
freeze (it looks like a recursive lock on the UMA Slabs keg) and a trace:

last pid: 49575;  load averages:  2.00,  2.05,  3.75    up 1+01:12:08  22:13:42
853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
CPU 0:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 11:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 8659M Active, 8385M Inact, 107G Wired, 1325M Free
ARC: 99G Total, 88G MFU, 10G MRU, 32K Anon, 167M Header, 529M Other
Swap: 32G Total, 32G Free

# procstat -k -k 1046
  PID    TID COMM             TDNAME           KSTACK
 1046 100686 nginx            -                mi_switch+0xd2 
critical_exit+0x7e lapic_handle_timer+0xb1 Xtimerint+0x8c 
__mtx_lock_sleep+0x168 zone_fetch_slab+0x47 zone_import+0x52 
zone_alloc_item+0x36 keg_alloc_slab+0x63 keg_fetch_slab+0x16e 
zone_fetch_slab+0x6e zone_import+0x52 uma_zalloc_arg+0x36e m_getm2+0x14f 
m_uiotombuf+0x64 sosend_generic+0x356 soo_write+0x42 dofilewrite+0x87

Some of the info below is possibly decoded incorrectly.

(kgdb) thread 809
[Switching to thread 809 (Thread 100686)]
(kgdb) bt
#0  sched_switch (td=0xf8014485f500, newtd=0xf8011422b000, 
flags=) at /usr/src/sys/kern/sched_ule.c:1973
#1  0x804a8d92 in mi_switch (flags=, newtd=0x0) at 
/usr/src/sys/kern/kern_synch.c:455
#2  0x804a6bee in critical_exit () at 
/usr/src/sys/kern/kern_switch.c:218
#3  0x80771701 in lapic_handle_timer (frame=0xfe2021699340) at 
/usr/src/sys/x86/x86/local_apic.c:1184
#4  
#5  0x804de424 in lock_delay (la=) at 
/usr/src/sys/kern/subr_lock.c:127
#6  0x80484dc8 in __mtx_lock_sleep (c=, 
tid=18446735283061126400, opts=, file=, 
line=) at /usr/src/sys/kern/kern_mutex.c:514
#7  0x806a4257 in zone_fetch_slab (zone=0xf8207ffe6000, 
keg=0xf8207ffe7180, flags=1) at /usr/src/sys/vm/uma_core.c:2371
#8  0x806a4312 in zone_import (zone=, bucket=, max=, flags=) at 
/usr/src/sys/vm/uma_core.c:2501
#9  0x806a0986 in zone_alloc_item (zone=0xf8207ffe6000, udata=0x0, 
flags=1) at /usr/src/sys/vm/uma_core.c:2591
#10 0x806a2463 in keg_alloc_slab (keg=0xf8010f9ecd80, 
zone=0xf80114236000, wait=1) at /usr/src/sys/vm/uma_core.c:964
#11 0x806a48ce in keg_fetch_slab (keg=, zone=, flags=) at /usr/src/sys/vm/uma_core.c:2343
#12 0x806a427e in zone_fetch_slab (zone=, keg=, flags=) at /usr/src/sys/vm/uma_core.c:2375
#13 0x806a4312 in zone_import (zone=, bucket=, max=, flags=) at 
/usr/src/sys/vm/uma_core.c:2501
#14 0x806a147e in zone_alloc_bucket (flags=2, zone=, 
udata=) at /usr/src/sys/vm/uma_core.c:2531
#15 uma_zalloc_arg (zone=, udata=0xf8105a700300, flags=2) at 
/usr/src/sys/vm/uma_core.c:2257
#16 0x8048231f in m_getjcl (how=2, type=1, flags=, 
size=4096) at /usr/src/sys/kern/kern_mbuf.c:829
#17 m_getm2 (m=, len=, how=, 
type=, flags=) at 
/usr/src/sys/kern/kern_mbuf.c:861
#18 0x80516044 in m_uiotombuf (uio=0xf818dcfbaec0, how=60, 
len=, align=0, flags=0) at /usr/src/sys/kern/uipc_mbuf.c:1535
#19 0x8051ce56 in sosend_generic (so=, addr=0x0, 
uio=, top=, control=, 
flags=, td=) at 
/usr/src/sys/kern/uipc_socket.c:1332
#20 0x804fd872 in soo_write (fp=, 
uio=0xf818dcfbaec0, active_cred=, flags=, 
td=) at /usr/src/sys/kern/sys_socket.c:146
#21 0x804f5c97 in fo_write (fp=, uio=0xf818dcfbaec0, 
active_cred=0xc0, flags=0, td=) at /usr/src/sys/sys/file.h:311
#22 dofilewrite (td=0xf8014485f500, fd=1531, fp=0xf80ac09fa960, 
auio=0xf818dcfbaec0, offset=, flags=0) at 
/usr/src/sys/kern/sys_generic.c:590
#23 0x804f5978 in kern_writev (td=0xf8014485f500, fd=1531, 
auio=0xf818dcfbaec0) at /usr/src/sys/kern/sys_generic.c:506
#24 0x804f5be6 in sys_writev (td=0xf8014485f500, 
uap=0xfe2021699a40) at /usr/src/sys/kern/sys_generic.c:491
#25 0x806e4051 in syscallenter (td=0xf8014485f500, sa=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#26 amd64_syscall (td=0xf8014485f500, traced=0) at 
/usr/src/sys/amd64/amd64/trap.c:942

# vmstat -M /var/crash/vmcore.1 -z| grep -i mbuf
mbuf_packet:256,

Re: 11.0 stuck on high network load

2016-10-07 Thread Slawa Olhovchenkov

On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:

>  Thanks again to Slawa, for his numerous debug reports and always
> questioning my explanations.  His last question directly led to this
> finding.  He is testing a quick workaround patch to check if there is more.

Thanks very much! You were very helpful, explaining the details of the
FreeBSD TCP code and putting a lot of work into this issue. I appreciate
all your help!
Thanks again!

Re: 11.0 stuck on high network load

2016-10-06 Thread Slawa Olhovchenkov

On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:

> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
> process continues and calls INP_WUNLOCK() here:
> 
> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568

Look also at sys/netinet/tcp_timewait.c:488

And check the other locks from r160549


Re: 11.0 stuck on high network load

2016-09-28 Thread Slawa Olhovchenkov

On Wed, Sep 28, 2016 at 12:06:47PM +0200, Julien Charbon wrote:

> > Tracing command intr pid 12 tid 100026 td 0xf8011424b500
> > sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 
> > 0xfe3876f0
> > mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe387720
> > critical_exit() at 0x804a6bee = critical_exit+0x7e/frame 
> > 0xfe387740
> > ipi_bitmap_handler() at 0x80775629 = ipi_bitmap_handler+0x79/frame 
> > 0xfe387780
> > Xipi_intr_bitmap_handler() at 0x806cc15e = 
> > Xipi_intr_bitmap_handler+0x8e/frame 0xfe387780
> > --- interrupt, rip = 0x80484c1f, rsp = 0xfe387850, rbp = 
> > 0xfe387850 ---
> > __mtx_lock_flags() at 0x80484c1f = __mtx_lock_flags+0x2f/frame 
> > 0xfe387850
> > sodealloc() at 0x8051b992 = sodealloc+0x32/frame 0xfe387890
> > tcp_close() at 0x80618150 = tcp_close+0xd0/frame 0xfe3878c0
> > tcp_timer_2msl() at 0x8061dda3 = tcp_timer_2msl+0x1f3/frame 
> > 0xfe3878f0
> > softclock_call_cc() at 0x804b4ca9 = softclock_call_cc+0x179/frame 
> > 0xfe3879c0
> > softclock() at 0x804b5034 = softclock+0x44/frame 0xfe3879e0
> > intr_event_execute_handlers() at 0x8046c605 = 
> > intr_event_execute_handlers+0x95/frame 0xfe387a20
> > ithread_loop() at 0x8046cc26 = ithread_loop+0xa6/frame 
> > 0xfe387a70
> > fork_exit() at 0x8046a211 = fork_exit+0x71/frame 0xfe387ab0
> > fork_trampoline() at 0x806cb50e = fork_trampoline+0xe/frame 
> > 0xfe387ab0
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
>  Nice stack traces, all threads are blocked in sodealloc() or soalloc()
> and if you look at how mtx_lock(&so_global_mtx) and
> mtx_unlock(&so_global_mtx) are used, it is hard to think about a
> scenario that can lead to this state.
> 
>  I am still trying to reproduce your issue, without success so far.

Maybe it is hardware-related (low-speed CPU?).
Last night I collected new stack traces and a kernel dump.
Maybe I can look for something in them?

db> ps
  pid  ppid  pgrp   uid   state   wmesg     wchan    cmd
   12 0 0 0  RL  (threaded)  [intr]
100023   RunQ[swi4: clock (8)]
100107   Run CPU 8   [irq291: ix0:q2]
   11 0 0 0  RL  (threaded)  [idle]
100011   CanRun  [idle: cpu8]

cpuid= 8
dynamic pcpu = 0xfe201d69cf00
curthread= 0xf8012508d500: pid 12 "irq291: ix0:q2"
curpcb   = 0xfe2020ebcb80
fpcurthread  = none
idlethread   = 0xf8011422c500: tid 100011 "idle: cpu8"
curpmap  = 0x80d49998
tssp = 0x80d7fcd0
commontssp   = 0x80d7fcd0
rsp0 = 0xfe2020ebcb80
gs32p= 0x80d86528
ldt  = 0x80d86568
tss  = 0x80d86558


Tracing command nginx pid 1061 tid 101747 td 0xf8014b35b500
sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 
0xfe2021b70330
mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe2021b70360
turnstile_wait() at 0x804ef177 = turnstile_wait+0x2a7/frame 
0xfe2021b703a0
__rw_wlock_hard() at 0x8049c314 = __rw_wlock_hard+0x94/frame 
0xfe2021b70430
in_lltable_lookup() at 0x80594823 = in_lltable_lookup+0x83/frame 
0xfe2021b70450
arpresolve() at 0x8058d2aa = arpresolve+0x9a/frame 0xfe2021b704b0
ether_output() at 0x805755e2 = ether_output+0x2f2/frame 
0xfe2021b70550
ip_output() at 0x805a4200 = ip_output+0x1390/frame 0xfe2021b706b0
tcp_output() at 0x806149d5 = tcp_output+0x17a5/frame 0xfe2021b70850
tcp_usr_disconnect() at 0x80620094 = tcp_usr_disconnect+0x74/frame 
0xfe2021b70880
soclose() at 0x8051c238 = soclose+0x38/frame 0xfe2021b708b0
_fdrop() at 0x8045639a = _fdrop+0x1a/frame 0xfe2021b708d0
closef() at 0x80458a53 = closef+0x1e3/frame 0xfe2021b70960
closefp() at 0x804567ad = closefp+0x7d/frame 0xfe2021b709a0
amd64_syscall() at 0x806e4051 = amd64_syscall+0x2c1/frame 
0xfe2021b70ab0
Xfast_syscall() at 0x806cb2bb = Xfast_syscall+0xfb/frame 
0xfe2021b70ab0
--- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8019dbeaa, rsp = 
0x7fffe6a8, rbp = 0x7fffe6c0 ---

Tracing command nginx pid 1060 tid 101749 td 0xf80126a53a00
sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 
0xfe2021b7a240
mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe2021b7a270
turnstile_wait() at 0x804ef177 = turnstile_wait+0x2a7/frame 
0xfe2021b7a2b0
__rw_wlock_hard() at 0x8049c314 = __rw_wlock_hard+0x94/frame 
0xfe2021b7a340
in_lltable_lookup() at 0x80594823 = in_lltable_lookup+0x83/frame

Re: 11.0 stuck on high network load

2016-09-26 Thread Slawa Olhovchenkov

On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote:
> > On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote:
> >>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> >>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> >>>>>  You can also use Dtrace and lockstat (especially with the lockstat -s
> >>>>> option):
> >>>>>
> >>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> >>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE
> >>>>>
> >>>>>  But I am less familiar with Dtrace/lockstat tools.
> >>>>
> >>>> I am still using the old kernel and got a lockup again.
> >>>> I tried lockstat (I saved more output); the following may be interesting:
> >>>>
> >>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> >>>>
> >>>> ---
> >>>> Count  indv cuml rcnt     nsec Lock     Caller
> >>>> 140839  74%  74% 0.00    24659 tcpinp   tcp_tw_2msl_scan+0xc6
> >>>>
> >>>>       nsec   count   Stack
> >>>>       4096   913     tcp_twstart+0xa3
> >>>>       8192   58191   tcp_do_segment+0x201f
> >>>>      16384   29594   tcp_input+0xe1c
> >>>>      32768   23447   ip_input+0x15f
> >>>>      65536   16197
> >>>>     131072   8674
> >>>>     262144   3358
> >>>>     524288   456
> >>>>    1048576   9
> >>>> ---
> >>>> Count  indv cuml rcnt     nsec Lock     Caller
> >>>> 49180   26% 100% 0.00    15929 tcpinp   tcp_tw_2msl_scan+0xc6
> >>>>
> >>>>       nsec   count   Stack
> >>>>       4096   157     pfslowtimo+0x54
> >>>>       8192   24796   softclock_call_cc+0x179
> >>>>      16384   11223   softclock+0x44
> >>>>      32768   7426    intr_event_execute_handlers+0x95
> >>>>      65536   3918
> >>>>     131072   1363
> >>>>     262144   278
> >>>>     524288   19
> >>>> ---
> >>>
> >>>  This is interesting, it seems that you have two call paths competing
> >>> for INP locks here:
> >>>
> >>>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
> >>>
> >>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
> >>
> >> My current hypothesis:
> >>
> >> nginx does a write() (or maybe close()?) on a socket, the kernel locks
> >> the first inp in V_twq_2msl, then the callout for pfslowtimo() happens
> >> on the same CPU core and tcp_tw_2msl_scan blocks forever on the same inp.
> >>
> >> In this case your modification can't help; before the next try we need
> >> something like yield().
> > 
> > Or maybe lock leaks.
> > Or both.
> 
>  You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is
> infinitely blocked on INP_WLOCK() by "something" (that could be related
> to write()).
> 
>  As I reached my limit of debugging without WITNESS, could you share

Re: nginx and FreeBSD11

2016-09-26 Thread Slawa Olhovchenkov

On Mon, Sep 26, 2016 at 06:20:42PM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 12:33:55PM +0300, Slawa Olhovchenkov wrote:
> > OK, try this patch.
> 
> Was the patch tested ?

No more AIO-related issues/nginx core dumps.
I can't get a long uptime because of other issues (TCP locks and mbuf-related).

Re: 11.0 stuck on high network load

2016-09-26 Thread Slawa Olhovchenkov

On Mon, Sep 26, 2016 at 01:57:03PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote:
> > On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote:
> >> On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote:
> >>>
> >>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> >>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> >>>>>  You can also use Dtrace and lockstat (especially with the lockstat -s
> >>>>> option):
> >>>>>
> >>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> >>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat=FreeBSD+11.0-RELEASE
> >>>>>
> >>>>>  But I am less familiar with Dtrace/lockstat tools.
> >>>>
> >>>> I am still using the old kernel and got a lockup again.
> >>>> I tried lockstat (I saved more output); the following may be interesting:
> >>>>
> >>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> >>>>
> >>>> ---
> >>>> Count  indv cuml rcnt     nsec Lock     Caller
> >>>> 140839  74%  74% 0.00    24659 tcpinp   tcp_tw_2msl_scan+0xc6
> >>>>
> >>>>       nsec   count   Stack
> >>>>       4096   913     tcp_twstart+0xa3
> >>>>       8192   58191   tcp_do_segment+0x201f
> >>>>      16384   29594   tcp_input+0xe1c
> >>>>      32768   23447   ip_input+0x15f
> >>>>      65536   16197
> >>>>     131072   8674
> >>>>     262144   3358
> >>>>     524288   456
> >>>>    1048576   9
> >>>> ---
> >>>> Count  indv cuml rcnt     nsec Lock     Caller
> >>>> 49180   26% 100% 0.00    15929 tcpinp   tcp_tw_2msl_scan+0xc6
> >>>>
> >>>>       nsec   count   Stack
> >>>>       4096   157     pfslowtimo+0x54
> >>>>       8192   24796   softclock_call_cc+0x179
> >>>>      16384   11223   softclock+0x44
> >>>>      32768   7426    intr_event_execute_handlers+0x95
> >>>>      65536   3918
> >>>>     131072   1363
> >>>>     262144   278
> >>>>     524288   19
> >>>> ---
> >>>
> >>>  This is interesting, it seems that you have two call paths competing
> >>> for INP locks here:
> >>>
> >>>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
> >>>
> >>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
> >>
> >> My current hypothesis:
> >>
> >> nginx does a write() (or maybe close()?) on a socket, the kernel locks
> >> the first inp in V_twq_2msl, then the callout for pfslowtimo() happens
> >> on the same CPU core and tcp_tw_2msl_scan blocks forever on the same inp.
> >>
> >> In this case your modification can't help; before the next try we need
> >> something like yield().
> > 
> > Or maybe lock leaks.
> > Or both.
> 
>  Actually one extra debug thing you can do is launching lockstat with
> below extra options:
> 
>  -H For Hold lock stats
>  -P To get the overall time
>  -s 20  To get the stack trace
> 
>  To see who is holding the INP lock for so long.  Thanks to Hiren for
> pointing the -H option.

At the time of this graph I also collected output from `lockstat -PH -s 5
sleep 1` and didn't see anything interesting. I think a lock taken before
lockstat started is not detected and not shown.

I can still share the collected output if you need it; it is hundreds of lines.
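
As a quick sanity check on the quoted lockstat summary, the 74% / 26%
split and the per-caller totals can be recomputed from the raw counts.
This is only a sketch of the arithmetic in C; the function names are made
up, and rounding to the nearest whole percent is an assumption about how
the "indv" column was produced.

```c
#include <assert.h>

/* Share of count in total, rounded to the nearest whole percent. */
int share_percent(unsigned long count, unsigned long total)
{
    assert(total > 0);
    return (int)((count * 100 + total / 2) / total);
}

/* Sum n histogram bucket counts. */
unsigned long sum_counts(const unsigned long *counts, int n)
{
    unsigned long sum = 0;
    int i;

    for (i = 0; i < n; i++)
        sum += counts[i];
    return sum;
}
```

Fed with the numbers from the quoted output, the tcp_twstart path gives
share_percent(140839, 190019) and the pfslowtimo path gives
share_percent(49180, 190019); the two counts add up to the 190019 events
reported in the header line.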

Re: 11.0 stuck on high network load

2016-09-26 Thread Slawa Olhovchenkov

On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote:

> >>>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
> >>
> >> My current hypothesis:
> >>
> >> nginx does a write() (or maybe close()?) on a socket, the kernel locks
> >> the first inp in V_twq_2msl, then the callout for pfslowtimo() happens
> >> on the same CPU core and tcp_tw_2msl_scan blocks forever on the same inp.
> >>
> >> In this case your modification can't help; before the next try we need
> >> something like yield().
> > 
> > Or maybe lock leaks.
> > Or both.
> 
>  You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is
> infinitely blocked on INP_WLOCK() by "something" (that could be related
> to write()).
> 
>  As I reached my limit of debugging without WITNESS, could you share
> your /etc/sysctl.conf, /boot/loader.conf files?  And any specific
> configuration you have (like having a Nginx workers affinity, Nginx
> special options, etc.).  Like that I can try to reproduce it on releng/11.0.

I use a dual-socket server, E5-2620.
Dual Intel 10G NIC, with affinity to CPUs 6..11.
Nginx affinity is to CPUs 0..5.
I.e. CPU 0 has only an nginx worker bound to it, and NIC IRQ handler
activity is only on CPUs 6..11.
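
The split described above can be written down as two CPU bit masks, where
CPU n is bit n. The helper below is a hypothetical illustration in C; it
does not show how the affinity was actually configured on this machine,
which the message does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask with CPUs first..last (inclusive) set; CPU n is bit n. */
uint64_t cpu_range_mask(unsigned first, unsigned last)
{
    uint64_t mask = 0;
    unsigned cpu;

    assert(first <= last && last < 64);
    for (cpu = first; cpu <= last; cpu++)
        mask |= (uint64_t)1 << cpu;
    return mask;
}
```

cpu_range_mask(0, 5) covers the nginx workers and cpu_range_mask(6, 11)
covers the NIC interrupt handlers; the two masks share no bit, so in this
layout no NIC IRQ handler is expected to run on an nginx CPU.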

/boot/loader.conf:

kern.geom.label.gptid.enable="0"
zfs_load="YES"
# generated by conf.pl #
hw.memtest.tests=0
machdep.hyperthreading_allowed=0
kern.geom.label.disk_ident.enable=0
if_igb_load=yes
if_ix_load=yes
hw.ix.num_queues=3
hw.ix.rxd=4096
hw.ix.txd=4096
hw.ix.rx_process_limit=-1
hw.ix.tx_process_limit=-1
if_lagg_load=YES
net.link.lagg.default_use_flowid=0
accf_http_load=yes
aio_load=yes
cc_htcp_load=yes
kern.ipc.nmbclusters=1048576
net.inet.tcp.reass.maxsegments=32768
net.inet.tcp.hostcache.cachelimit=0
net.inet.tcp.hostcache.hashsize=32768
net.inet.tcp.syncache.hashsize=32768
#net.inet.tcp.tcbhashsize=262144
net.inet.tcp.tcbhashsize=65536
net.inet.tcp.maxtcptw=16384
kern.pin_default_swi=1
kern.pin_pcpu_swi=1
kern.hwpmc.nbuffers=131072
hw.cxgbe.qsize_rxq=16384
hw.cxgbe.qsize_txq=16384
hw.cxgbe.nrxq10g=3
kernel="kernel.VSTREAM"
kernels="kernel"
hw.mps.max_chains=3072
###
hw.vga.textmode=1
uhci_load=yes
ohci_load=yes
ehci_load=yes
xhci_load=yes
ukbd_load=yes
umass_load=yes
###
boot_multicons="YES"
boot_serial="YES"
comconsole_speed="115200"
comconsole_port=760
#console="comconsole,vidconsole"
console="vidconsole,comconsole"
hint.uart.0.flags="0x00"
hint.uart.1.flags="0x10"

/etc/sysctl.conf:

kern.random.sys.harvest.ethernet=0
kern.threads.max_threads_per_proc=2
net.inet.ip.maxfragpackets=32768
net.inet.ip.fastforwarding=1
kern.ipc.somaxconn=4096
kern.ipc.nmbjumbop=2097152
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.sendspace=2097152
#net.inet.tcp.maxtcptw=444800
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.msl=1000
net.inet.tcp.cc.algorithm=htcp
net.inet.tcp.per_cpu_timers=1
#net.inet.tcp.syncookies=0
net.inet6.ip6.auto_linklocal=0
kern.maxfiles=30
kern.maxfilesperproc=8
#hw.intr_storm_threshold=9000
vfs.zfs.prefetch_disable=1
vfs.zfs.vdev.max_pending=1000
vfs.zfs.l2arc_noprefetch=0
vfs.zfs.l2arc_norw=0
vfs.zfs.l2arc_write_boost=134217728
vfs.zfs.l2arc_write_max=33554432
vfs.aio.max_aio_procs=512
vfs.aio.max_aio_queue_per_proc=8192
vfs.aio.max_aio_per_proc=8192
vfs.aio.max_aio_queue=65536
net.inet.tcp.finwait2_timeout=5000

kern.corefile=/tmp/%N.%P.core
kern.sugid_coredump=1

Now (after this nginx lockup) I am using your patch with a modification:
it returns NULL at the write lock, and now I see only the mbuf-related issue.
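
One possible reading of that modification, sketched as a toy in C: the
scan tries to take the lock on the first entry and returns NULL instead of
waiting when the lock is already held. This is not the actual patch; the
toy_* names are invented, and a C11 atomic_flag stands in for the kernel
lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_tw {
    atomic_flag busy;        /* set while some thread owns the entry */
    struct toy_tw *next;
};

/* Try to own the entry: returns 1 on success, 0 if it is already owned. */
int toy_try_lock(struct toy_tw *tw)
{
    assert(tw != NULL);
    return !atomic_flag_test_and_set(&tw->busy);
}

/* Release an entry previously taken with toy_try_lock(). */
void toy_unlock(struct toy_tw *tw)
{
    assert(tw != NULL);
    atomic_flag_clear(&tw->busy);
}

/*
 * Return the head entry locked, or NULL when the list is empty or the
 * head is busy.  The caller never blocks and can retry on a later pass.
 */
struct toy_tw *toy_scan_first(struct toy_tw *head)
{
    if (head == NULL)
        return NULL;
    if (!toy_try_lock(head))
        return NULL;
    return head;
}
```

The trade-off in this sketch is that a busy first entry postpones the
whole scan until the next pass, in exchange for never spinning on a lock
that the interrupted thread on the same CPU might be holding.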


Re: 11.0 stuck on high network load

2016-09-26 Thread Slawa Olhovchenkov

On Mon, Sep 26, 2016 at 10:51:07AM +0200, Julien Charbon wrote:

> >  1049 kqread-  I  145:58.35 nginx: worker process (nginx)
> >  1050 kqread-  I  136:33.36 nginx: worker process (nginx)
> >  1051 kqread-  I  140:59.73 nginx: worker process (nginx)
> >  1052 kqread-  I  137:18.12 nginx: worker process (nginx)
> > 
> > pid 1046 is nginx running on CPU0 (affinity mask set).
> > 
> > # procstat -k -k 1046
> >   PIDTID COMM TDNAME   KSTACK
> >  1046 100686 nginx-mi_switch+0xd2 
> > critical_exit+0x7e lapic_handle_timer+0xb1 Xtimerint+0x8c 
> > __mtx_lock_sleep+0x168 zone_fetch_slab+0x47 zone_import+0x52 
> > zone_alloc_item+0x36 keg_alloc_slab+0x63 keg_fetch_slab+0x16e 
> > zone_fetch_slab+0x6e zone_import+0x52 uma_zalloc_arg+0x36e m_getm2+0x14f 
> > m_uiotombuf+0x64 sosend_generic+0x356 soo_write+0x42 dofilewrite+0x87
> > 
> > Tracing command nginx pid 1046 tid 100686 td 0xf8014485f500
> > sched_switch() at 0x804c956d = sched_switch+0x6ad/frame 
> > 0xfe20216992a0 /usr/src/sys/kern/sched_ule.c:1973
> > mi_switch() at 0x804a8d92 = mi_switch+0xd2/frame 0xfe20216992d0 
> > /usr/src/sys/kern/kern_synch.c:465
> > critical_exit() at 0x804a6bee = critical_exit+0x7e/frame 
> > 0xfe20216992f0 /usr/src/sys/kern/kern_switch.c:219
> > lapic_handle_timer() at 0x80771701 = lapic_handle_timer+0xb1/frame 
> > 0xfe2021699330 /usr/src/sys/x86/x86/local_apic.c:1185
> > Xtimerint() at 0x806cbbcc = Xtimerint+0x8c/frame 0xfe2021699330 
> > /usr/src/sys/amd64/amd64/apic_vector.S:135
> > --- interrupt, rip = 0x804de424, rsp = 0xfe2021699400, rbp = 
> > 0xfe2021699420 ---
> > lock_delay() at 0x804de424 = lock_delay+0x54/frame 
> > 0xfe2021699420 /usr/src/sys/kern/subr_lock.c:127
> > __mtx_lock_sleep() at 0x80484dc8 = __mtx_lock_sleep+0x168/frame 
> > 0xfe20216994a0 /usr/src/sys/kern/kern_mutex.c:512
> > zone_fetch_slab() at 0x806a4257 = zone_fetch_slab+0x47/frame 
> > 0xfe20216994e0 /usr/src/sys/vm/uma_core.c:2378
> > zone_import() at 0x806a4312 = zone_import+0x52/frame 
> > 0xfe2021699530 /usr/src/sys/vm/uma_core.c:2501
> > zone_alloc_item() at 0x806a0986 = zone_alloc_item+0x36/frame 
> > 0xfe2021699570 /usr/src/sys/vm/uma_core.c:2591
> > keg_alloc_slab() at 0x806a2463 = keg_alloc_slab+0x63/frame 
> > 0xfe20216995d0 /usr/src/sys/vm/uma_core.c:965
> > keg_fetch_slab() at 0x806a48ce = keg_fetch_slab+0x16e/frame 
> > 0xfe2021699620 /usr/src/sys/vm/uma_core.c:2349
> > zone_fetch_slab() at 0x806a427e = zone_fetch_slab+0x6e/frame 
> > 0xfe2021699660 /usr/src/sys/vm/uma_core.c:2375
> > zone_import() at 0x806a4312 = zone_import+0x52/frame 
> > 0xfe20216996b0 /usr/src/sys/vm/uma_core.c:2501 
> > uma_zalloc_arg() at 0x806a147e = uma_zalloc_arg+0x36e/frame 
> > 0xfe2021699720 /usr/src/sys/vm/uma_core.c:2531
> > m_getm2() at 0x8048231f = m_getm2+0x14f/frame 0xfe2021699790 
> > /usr/src/sys/kern/kern_mbuf.c:830
> > m_uiotombuf() at 0x80516044 = m_uiotombuf+0x64/frame 
> > 0xfe20216997e0 /usr/src/sys/kern/uipc_mbuf.c:1535
> > sosend_generic() at 0x8051ce56 = sosend_generic+0x356/frame 
> > 0xfe20216998a0
> > soo_write() at 0x804fd872 = soo_write+0x42/frame 0xfe20216998d0
> > dofilewrite() at 0x804f5c97 = dofilewrite+0x87/frame 
> > 0xfe2021699920
> > kern_writev() at 0x804f5978 = kern_writev+0x68/frame 
> > 0xfe2021699970
> > sys_writev() at 0x804f5be6 = sys_writev+0x36/frame 
> > 0xfe20216999a0
> > amd64_syscall() at 0x806e4051 = amd64_syscall+0x2c1/frame 
> > 0xfe2021699ab0
> > Xfast_syscall() at 0x806cb2bb = Xfast_syscall+0xfb/frame 
> > 0xfe2021699ab0
> > --- syscall (121, FreeBSD ELF64, sys_writev), rip = 0x8019cc6ba, rsp = 
> > 0x7fffd688, rbp = 0x7fffd6c0 ---
> 
>   This call stack is quite interesting:

>  1: A process is calling writev()
>  2: Kernel calls sosend_generic() that starts allocating memory
>  3: This allocation is then interrupted by the timer interrupt handler
> [that could actually trigger tcp_tw_2msl_scan(reuse=0)]
>  4: The timer interrupt handler seems to wait on sched_switch()

No, this is more interesting: a double call (recursion) into zone_import()!

>  And fun fact:  When sosend_generic() calls m_uiotombuf() it does not
> hold INP_WLOCK yet...

Yes, it is not INP_WLOCK related; this looks like a different error.


___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: 11.0 stuck on high network load

2016-09-25 Thread Slawa Olhovchenkov

On Fri, Sep 23, 2016 at 10:16:56PM +0300, Slawa Olhovchenkov wrote:

> On Thu, Sep 22, 2016 at 01:20:45PM +0300, Slawa Olhovchenkov wrote:
> 
> > On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:
> > 
> > > >>  These paths can indeed compete for the same INP lock, as both
> > > >> tcp_tw_2msl_scan() calls always start with the first inp found in
> > > >> twq_2msl list.  But in both cases, this first inp should be quickly 
> > > >> used
> > > >> and its lock released anyway, thus that could explain your situation it
> > > >> that the TCP stack is doing that all the time, for example:
> > > >>
> > > >>  - Let say that you are running out completely and constantly of tcptw,
> > > >> and then all connections transitioning to TIME_WAIT state are competing
> > > >> with the TIME_WAIT timeout scan that tries to free all the expired
> > > >> tcptw.  If the stack is doing that all the time, it can appear like
> > > >> "live" locked.
> > > >>
> > > >>  This is just an hypothesis and as usual might be a red herring.
> > > >> Anyway, could you run:
> > > >>
> > > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> > > > 
> > > > ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP
> > > > 
> > > > socket: 864, 4192664,   18604,   25348,49276158,   0,   0
> > > > tcp_inpcb:  464, 4192664,   34226,   18702,49250593,   0,   0
> > > > tcpcb: 1040, 4192665,   18424,   18953,49250593,   0,   0
> > > > tcptw:   88,  16425,   15802, 623,14526919,   8,   0
> > > > tcpreass:40,  32800,  15,2285,  632381,   0,   0
> > > > 
> > > > In normal case tcptw is about 16425/600/900
> > > > 
> > > > And after `sysctl -a | grep tcp` system stuck on serial console and I 
> > > > am reset it.
> > > > 
> > > >>  Ideally, once when everything is ok, and once when you have the issue
> > > >> to see the differences (if any).
> > > >>
> > > >>  If it appears your are quite low in tcptw, and if you have enough
> > > >> memory, could you try increase the tcptw limit using sysctl
> > > > 
> > > > I think this is not eliminate stuck, just may do it less frequency
> > > 
> > >  You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> > > contention hypothesis is the right one.  As I see you have plenty of
> > > memory on your server, thus could you try with:
> > > 
> > > net.inet.tcp.maxtcptw=4192665
> > > 
> > >  And see what happen. Just to validate this hypothesis.
> > 
> > This is bad way for validate, with maxtcptw=16384 happened is random
> > and can be waited for month. After maxtcptw=4192665 I am don't know
> > how long need to wait for verification this hypothesis.
> > 
> > More frequency (may be 3-5 times per day) happening less traffic drops
> > (not to zero for minutes). May be this caused also by contention in
> > tcp_tw_2msl_scan, but fast resolved (stochastic process). By eating
> > CPU power nginx can't service connection and clients closed
> > connections and need more TIME_WAIT and can trigered
> > tcp_tw_2msl_scan(reuse=1). After this we can got live lock.
> > 
> > May be after I learning to catch and dignostic this validation is more
> > accurately.
> 
> Some more bits:
> 
> socket: 864, 4192664,   30806, 790,28524160,   0,   0
> ipq: 56,  32802,   0,1278,1022,   0,   0
> udp_inpcb:  464, 4192664,  44, 364,   14066,   0,   0
> udpcb:   32, 4192750,  44,3081,   14066,   0,   0
> tcp_inpcb:  464, 4192664,   38558, 378,28476709,   0,   0
> tcpcb: 1040, 4192665,   30690, 738,28476709,   0,   0
> tcptw:   88,  32805,7868, 772, 8412249,   0,   0
> 
> last pid: 49575;  load averages:  2.00,  2.05,  3.75    up 1+01:12:08  22:13:42
> 853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
> CPU 0:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
> CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> CPU 3:   0.0

Re: 11.0 stuck on high network load

2016-09-25 Thread Slawa Olhovchenkov

On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote:

> On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote:
> 
> > 
> >  Hi Slawa,
> > 
> > On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> > > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> > >>  You can also use Dtrace and lockstat (especially with the lockstat -s
> > >> option):
> > >>
> > >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> > > >> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> > >>
> > >>  But I am less familiar with Dtrace/lockstat tools.
> > > 
> > > I am still use old kernel and got lockdown again.
> > > Try using lockstat (I am save more output), interesting may be next:
> > > 
> > > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> > > > 
> > > > ---
> > > > Count  indv cuml rcnt  nsec Lock   Caller
> > > > 140839  74%  74% 0.00 24659 tcpinp tcp_tw_2msl_scan+0xc6
> > > > 
> > > >    nsec -- Time Distribution -- count Stack
> > > >    4096 |   913   tcp_twstart+0xa3
> > > >    8192 |   58191 tcp_do_segment+0x201f
> > > >   16384 |@@ 29594 tcp_input+0xe1c
> > > >   32768 |   23447 ip_input+0x15f
> > > >   65536 |@@@16197
> > > >  131072 |@  8674
> > > >  262144 |   3358
> > > >  524288 |   456
> > > > 1048576 |   9
> > > > ---
> > > > Count  indv cuml rcnt  nsec Lock   Caller
> > > > 49180   26% 100% 0.00 15929 tcpinp tcp_tw_2msl_scan+0xc6
> > > > 
> > > >    nsec -- Time Distribution -- count Stack
> > > >    4096 |   157   pfslowtimo+0x54
> > > >    8192 |@@@24796 softclock_call_cc+0x179
> > > >   16384 |@@ 11223 softclock+0x44
> > > >   32768 |   7426  intr_event_execute_handlers+0x95
> > > >   65536 |@@ 3918
> > > >  131072 |   1363
> > > >  262144 |   278
> > > >  524288 |   19
> > > > ---
> > 
> >  This is interesting, it seems that you have two call paths competing
> > for INP locks here:
> > 
> >  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
> > 
> >  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)
> 
> My current hypothesis:
> 
> nginx do write() (or may be close()?) to socket, kernel lock
> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same
> CPU core and tcp_tw_2msl_scan infinity locked on same inp.
> 
> In this case you modification can't help, before next try we need some
> like yeld().

Or maybe a lock leak.
Or both.

Re: 11.0 stuck on high network load

2016-09-23 Thread Slawa Olhovchenkov

On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> >>  You can also use Dtrace and lockstat (especially with the lockstat -s
> >> option):
> >>
> >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> >> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> >>
> >>  But I am less familiar with Dtrace/lockstat tools.
> > 
> > I am still use old kernel and got lockdown again.
> > Try using lockstat (I am save more output), interesting may be next:
> > 
> > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> > > 
> > > ---
> > > Count  indv cuml rcnt  nsec Lock   Caller
> > > 140839  74%  74% 0.00 24659 tcpinp tcp_tw_2msl_scan+0xc6
> > > 
> > >    nsec -- Time Distribution -- count Stack
> > >    4096 |   913   tcp_twstart+0xa3
> > >    8192 |   58191 tcp_do_segment+0x201f
> > >   16384 |@@ 29594 tcp_input+0xe1c
> > >   32768 |   23447 ip_input+0x15f
> > >   65536 |@@@16197
> > >  131072 |@  8674
> > >  262144 |   3358
> > >  524288 |   456
> > > 1048576 |   9
> > > ---
> > > Count  indv cuml rcnt  nsec Lock   Caller
> > > 49180   26% 100% 0.00 15929 tcpinp tcp_tw_2msl_scan+0xc6
> > > 
> > >    nsec -- Time Distribution -- count Stack
> > >    4096 |   157   pfslowtimo+0x54
> > >    8192 |@@@24796 softclock_call_cc+0x179
> > >   16384 |@@ 11223 softclock+0x44
> > >   32768 |   7426  intr_event_execute_handlers+0x95
> > >   65536 |@@ 3918
> > >  131072 |   1363
> > >  262144 |   278
> > >  524288 |   19
> > > ---
> 
>  This is interesting, it seems that you have two call paths competing
> for INP locks here:
> 
>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
> 
>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)

My current hypothesis:

nginx does a write() (or maybe a close()?) on a socket and the kernel
locks the first inp in V_twq_2msl; then the callout for pfslowtimo() fires
on the same CPU core and tcp_tw_2msl_scan() spins forever on that same inp.

In this case your modification can't help; before the next attempt we need
something like yield().
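
The "something like yield()" idea can be illustrated in userland C. This
is only a sketch under stated assumptions: lock_with_yield() is a
hypothetical helper, a pthread mutex stands in for the inp lock, and
sched_yield() stands in for whatever mechanism the kernel would use to let
the lock holder run. It is not the tcp_tw_2msl_scan() code and not a
proposed patch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/*
 * Hypothetical helper: try to take `lock` at most `max_tries` times,
 * yielding the CPU between attempts instead of spinning without bound.
 * Returns true with the lock held, or false if the lock stayed busy, in
 * which case the caller can give up and retry on a later pass.
 */
bool
lock_with_yield(pthread_mutex_t *lock, int max_tries)
{
	assert(max_tries > 0);
	for (int i = 0; i < max_tries; i++) {
		if (pthread_mutex_trylock(lock) == 0)
			return (true);
		/* Let another runnable thread (ideally the holder) make progress. */
		sched_yield();
	}
	return (false);
}
```

The point of the bounded loop is that a scanner which fails to get the
lock returns control instead of waiting forever for a holder that cannot
run on the same CPU.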

Re: 11.0 stuck on high network load

2016-09-23 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 01:20:45PM +0300, Slawa Olhovchenkov wrote:

> On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:
> 
> > >>  These paths can indeed compete for the same INP lock, as both
> > >> tcp_tw_2msl_scan() calls always start with the first inp found in
> > >> twq_2msl list.  But in both cases, this first inp should be quickly used
> > >> and its lock released anyway, thus that could explain your situation it
> > >> that the TCP stack is doing that all the time, for example:
> > >>
> > >>  - Let say that you are running out completely and constantly of tcptw,
> > >> and then all connections transitioning to TIME_WAIT state are competing
> > >> with the TIME_WAIT timeout scan that tries to free all the expired
> > >> tcptw.  If the stack is doing that all the time, it can appear like
> > >> "live" locked.
> > >>
> > >>  This is just an hypothesis and as usual might be a red herring.
> > >> Anyway, could you run:
> > >>
> > >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> > > 
> > > ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP
> > > 
> > > socket: 864, 4192664,   18604,   25348,49276158,   0,   0
> > > tcp_inpcb:  464, 4192664,   34226,   18702,49250593,   0,   0
> > > tcpcb: 1040, 4192665,   18424,   18953,49250593,   0,   0
> > > tcptw:   88,  16425,   15802, 623,14526919,   8,   0
> > > tcpreass:40,  32800,  15,2285,  632381,   0,   0
> > > 
> > > In normal case tcptw is about 16425/600/900
> > > 
> > > And after `sysctl -a | grep tcp` system stuck on serial console and I am 
> > > reset it.
> > > 
> > >>  Ideally, once when everything is ok, and once when you have the issue
> > >> to see the differences (if any).
> > >>
> > >>  If it appears your are quite low in tcptw, and if you have enough
> > >> memory, could you try increase the tcptw limit using sysctl
> > > 
> > > I think this is not eliminate stuck, just may do it less frequency
> > 
> >  You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> > contention hypothesis is the right one.  As I see you have plenty of
> > memory on your server, thus could you try with:
> > 
> > net.inet.tcp.maxtcptw=4192665
> > 
> >  And see what happen. Just to validate this hypothesis.
> 
> This is bad way for validate, with maxtcptw=16384 happened is random
> and can be waited for month. After maxtcptw=4192665 I am don't know
> how long need to wait for verification this hypothesis.
> 
> More frequency (may be 3-5 times per day) happening less traffic drops
> (not to zero for minutes). May be this caused also by contention in
> tcp_tw_2msl_scan, but fast resolved (stochastic process). By eating
> CPU power nginx can't service connection and clients closed
> connections and need more TIME_WAIT and can trigered
> tcp_tw_2msl_scan(reuse=1). After this we can got live lock.
> 
> May be after I learning to catch and dignostic this validation is more
> accurately.

Some more bits:

socket: 864, 4192664,   30806, 790,28524160,   0,   0
ipq: 56,  32802,   0,1278,1022,   0,   0
udp_inpcb:  464, 4192664,  44, 364,   14066,   0,   0
udpcb:   32, 4192750,  44,3081,   14066,   0,   0
tcp_inpcb:  464, 4192664,   38558, 378,28476709,   0,   0
tcpcb: 1040, 4192665,   30690, 738,28476709,   0,   0
tcptw:   88,  32805,7868, 772, 8412249,   0,   0

last pid: 49575;  load averages:  2.00,  2.05,  3.75    up 1+01:12:08  22:13:42
853 processes: 15 running, 769 sleeping, 35 waiting, 34 lock
CPU 0:   0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 1:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 2:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 3:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 4:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 5:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 6:   0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 7:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 8:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 9:   0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 10:  0.0% user,  0.0% nice,  0.4% system,  0.0% interr

Re: zvol clone diffs

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 04:56:53PM +0500, Eugene M. Zheganin wrote:

> Hi.
> 
> I should mention from the start that this is a question about an
> engineering task, not a question about FreeBSD issue.
> 
> I have a set of zvol clones that I redistribute over iSCSI. Several
> Windows VMs use these clones as disks via their embedded iSCSI
> initiators (each clone represents a disk with an NTFS partition, is
> imported as a "foreign" disk and functions just fine). From my opinion,
> they should not have any need to do additional writes on these clones
> (each VM should only read data, from my point of view). But zfs shows
> they do, and sometimes they write a lot of data, so clearly facts and
> expactations differ a lot - obviously I didn't take something into
> accounting.

Maybe something like atime on NTFS?

http://serverfault.com/questions/33932/how-do-you-disable-the-last-accessed-attribute-on-ntfs-windows

Re: 11.0 stuck on high network load

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 12:04:40PM +0200, Julien Charbon wrote:

> >>  These paths can indeed compete for the same INP lock, as both
> >> tcp_tw_2msl_scan() calls always start with the first inp found in
> >> twq_2msl list.  But in both cases, this first inp should be quickly used
> >> and its lock released anyway, thus that could explain your situation it
> >> that the TCP stack is doing that all the time, for example:
> >>
> >>  - Let say that you are running out completely and constantly of tcptw,
> >> and then all connections transitioning to TIME_WAIT state are competing
> >> with the TIME_WAIT timeout scan that tries to free all the expired
> >> tcptw.  If the stack is doing that all the time, it can appear like
> >> "live" locked.
> >>
> >>  This is just an hypothesis and as usual might be a red herring.
> >> Anyway, could you run:
> >>
> >> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'
> > 
> > ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP
> > 
> > socket: 864, 4192664,   18604,   25348,49276158,   0,   0
> > tcp_inpcb:  464, 4192664,   34226,   18702,49250593,   0,   0
> > tcpcb: 1040, 4192665,   18424,   18953,49250593,   0,   0
> > tcptw:   88,  16425,   15802, 623,14526919,   8,   0
> > tcpreass:40,  32800,  15,2285,  632381,   0,   0
> > 
> > In normal case tcptw is about 16425/600/900
> > 
> > And after `sysctl -a | grep tcp` system stuck on serial console and I am 
> > reset it.
> > 
> >>  Ideally, once when everything is ok, and once when you have the issue
> >> to see the differences (if any).
> >>
> >>  If it appears your are quite low in tcptw, and if you have enough
> >> memory, could you try increase the tcptw limit using sysctl
> > 
> > I think this is not eliminate stuck, just may do it less frequency
> 
>  You are right, it would just be a big hint that the tcp_tw_2msl_scan()
> contention hypothesis is the right one.  As I see you have plenty of
> memory on your server, thus could you try with:
> 
> net.inet.tcp.maxtcptw=4192665
> 
>  And see what happen. Just to validate this hypothesis.

This is a bad way to validate: with maxtcptw=16384 the hang happens at
random and can take a month to appear. With maxtcptw=4192665 I don't know
how long I would need to wait to verify this hypothesis.

Smaller traffic drops (not to zero for minutes) happen more frequently
(maybe 3-5 times per day). Maybe these are also caused by contention in
tcp_tw_2msl_scan, but resolve quickly (a stochastic process). With the CPU
eaten up, nginx can't service connections, clients close their
connections, more TIME_WAIT entries are needed, and this can trigger
tcp_tw_2msl_scan(reuse=1). After that we can get a live lock.

Maybe once I learn how to catch and diagnose this, the validation will be
more accurate.

Re: 11.0 stuck on high network load

2016-09-22 Thread Slawa Olhovchenkov

On Wed, Sep 21, 2016 at 11:25:18PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote:
> > On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:
> >>  You can also use Dtrace and lockstat (especially with the lockstat -s
> >> option):
> >>
> >> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> >> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> >>
> >>  But I am less familiar with Dtrace/lockstat tools.
> > 
> > I am still use old kernel and got lockdown again.
> > Try using lockstat (I am save more output), interesting may be next:
> > 
> > > R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)
> > > 
> > > ---
> > > Count  indv cuml rcnt  nsec Lock   Caller
> > > 140839  74%  74% 0.00 24659 tcpinp tcp_tw_2msl_scan+0xc6
> > > 
> > >    nsec -- Time Distribution -- count Stack
> > >    4096 |   913   tcp_twstart+0xa3
> > >    8192 |   58191 tcp_do_segment+0x201f
> > >   16384 |@@ 29594 tcp_input+0xe1c
> > >   32768 |   23447 ip_input+0x15f
> > >   65536 |@@@16197
> > >  131072 |@  8674
> > >  262144 |   3358
> > >  524288 |   456
> > > 1048576 |   9
> > > ---
> > > Count  indv cuml rcnt  nsec Lock   Caller
> > > 49180   26% 100% 0.00 15929 tcpinp tcp_tw_2msl_scan+0xc6
> > > 
> > >    nsec -- Time Distribution -- count Stack
> > >    4096 |   157   pfslowtimo+0x54
> > >    8192 |@@@24796 softclock_call_cc+0x179
> > >   16384 |@@ 11223 softclock+0x44
> > >   32768 |   7426  intr_event_execute_handlers+0x95
> > >   65536 |@@ 3918
> > >  131072 |   1363
> > >  262144 |   278
> > >  524288 |   19
> > > ---
> 
>  This is interesting, it seems that you have two call paths competing
> for INP locks here:
> 
>  - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and
> 
>  - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1)

I think the same.

>  These paths can indeed compete for the same INP lock, as both
> tcp_tw_2msl_scan() calls always start with the first inp found in
> twq_2msl list.  But in both cases, this first inp should be quickly used
> and its lock released anyway, thus that could explain your situation it
> that the TCP stack is doing that all the time, for example:
> 
>  - Let say that you are running out completely and constantly of tcptw,
> and then all connections transitioning to TIME_WAIT state are competing
> with the TIME_WAIT timeout scan that tries to free all the expired
> tcptw.  If the stack is doing that all the time, it can appear like
> "live" locked.
> 
>  This is just an hypothesis and as usual might be a red herring.
> Anyway, could you run:
> 
> $ vmstat -z | head -2; vmstat -z | grep -E 'tcp|sock'

ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP

socket: 864, 4192664,   18604,   25348,49276158,   0,   0
tcp_inpcb:  464, 4192664,   34226,   18702,49250593,   0,   0
tcpcb: 1040, 4192665,   18424,   18953,49250593,   0,   0
tcptw:   88,  16425,   15802, 623,14526919,   8,   0
tcpreass:40,  32800,  15,2285,  632381,   0,   0

In the normal case tcptw is about 16425/600/900.
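
To compare snapshots like the one above, the zone lines can be parsed and
the share of the limit in use computed. A minimal sketch in C, assuming
the comma-separated `vmstat -z` layout shown here (SIZE, LIMIT, USED,
FREE, REQ, FAIL, SLEEP); struct zone_stat, parse_zone_line() and
zone_used_pct() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of `vmstat -z` output (hypothetical helper type). */
struct zone_stat {
	char name[32];
	unsigned long size, limit, used, nfree, req, fail, nsleep;
};

/*
 * Parse "name: size, limit, used, free, req, fail, sleep".
 * Returns 0 on success and -1 if the line does not match.
 */
int
parse_zone_line(const char *line, struct zone_stat *z)
{
	int n = sscanf(line, " %31[^:]: %lu, %lu, %lu, %lu, %lu, %lu, %lu",
	    z->name, &z->size, &z->limit, &z->used, &z->nfree, &z->req,
	    &z->fail, &z->nsleep);

	return (n == 8 ? 0 : -1);
}

/* Percentage of the zone limit in use; 0 when no limit is set. */
double
zone_used_pct(const struct zone_stat *z)
{
	if (z->limit == 0)
		return (0.0);
	return (100.0 * (double)z->used / (double)z->limit);
}
```

Applied to the tcptw line above (LIMIT 16425, USED 15802), this gives
roughly 96% of the limit in use, alongside a FAIL count of 8.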

And after running `sysctl -a | grep tcp` the system got stuck on the serial
console and I had to reset it.

>  Ideally, once when everything is ok, and once when you have the issue
> to see the differences (if any).
> 
>  If it appears your are quite low in tcptw, and if you have enough
>

Re: 11.0 stuck on high network load

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 11:28:38AM +0200, Julien Charbon wrote:

> >>> What purpose to not skip locked tcptw in this loop?
> >>
> >>  If I understand your question correctly:  According to your pmcstat
> >> result, tcp_tw_2msl_scan() currently struggles with a write lock
> >> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
> >> INP_WLOCK.  No sign of contention on TW_RLOCK(V_tw_lock) currently.
> > 
> > As I see in code, tcp_tw_2msl_scan got first node from V_twq_2msl and
> > need got RW lock on inp w/o alternates. Can tcp_tw_2msl_scan skip current 
> > node
> > and go to next node in V_twq_2msl list if current node locked by some
> > reasson?
> 
>  Interesting question indeed:  It is not optimal that all simultaneous
> calls to tcp_tw_2msl_scan() compete for the same oldest tcptw.  The next
> tcptws in the list are certainly old enough also.
> 
>  Let me see if I can make a simple change that makes kernel threads
> calling tcp_tw_2msl_scan() at same time to work on a different old
> enough tcptws.  So far, I found only solutions quite complex to implement.

A simple solution is for each thread to advance by ncpu elements at a time
and to skip a number of elements equal to the current CPU number at the
start, if I understood you correctly.
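
The striding idea above can be sketched in userland C. This is only an
illustration under assumptions: struct tw_node is a hypothetical stand-in
for a TIME_WAIT entry (not the kernel's struct tcptw), the helper names
are made up, and all locking and expiry checks are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the TIME_WAIT list. */
struct tw_node {
	int id;
	struct tw_node *next;
};

/*
 * First node a given CPU would examine: skip `cpu` elements from the
 * head so that simultaneous scanners start on different entries.
 * Returns NULL when the list is shorter than the offset.
 */
struct tw_node *
tw_first_for_cpu(struct tw_node *head, unsigned cpu)
{
	struct tw_node *n = head;

	while (n != NULL && cpu-- > 0)
		n = n->next;
	return (n);
}

/*
 * Advance by `ncpu` elements, so each CPU walks its own stride:
 * cpu, cpu + ncpu, cpu + 2 * ncpu, ...
 */
struct tw_node *
tw_next_for_cpu(struct tw_node *n, unsigned ncpu)
{
	assert(ncpu > 0);
	while (n != NULL && ncpu-- > 0)
		n = n->next;
	return (n);
}
```

With 4 CPUs and an 8-entry list, CPU 1 would visit entries 1 and 5 and
then run off the end, never touching the entry at the head that CPU 0
starts on.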


Re: nginx and FreeBSD11

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 11:53:20AM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 11:34:24AM +0300, Slawa Olhovchenkov wrote:
> > On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote:
> > 
> > > On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote:
> > > > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote:
> > > > > Below is, I believe, the committable fix, of course supposing that
> > > > > the patch above worked. If you want to retest it on stable/11, ignore
> > > > > efirt.c chunks.
> > > > 
> > > > and remove patch w/ spinlock?
> > > Yes.
> > 
> > What you prefer now -- I am test spinlock patch or this patch?
> > For success in any case need wait 2-3 days.
> 
> If you already run previous (spinlock) version for 1 day, then finish
> with it. I am confident that spinlock version results are indicative for
> the refined patch as well.
> 
> If you did not applied the spinlock variant at all, there is no reason to
> spend efforts on it, use the patch I sent today.

No, I did not apply the spinlock variant at all.
OK, I will try this patch.
Do you still need the first 100 lines from a verbose boot?

Re: nginx and FreeBSD11

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 11:27:40AM +0300, Konstantin Belousov wrote:

> On Thu, Sep 22, 2016 at 11:25:27AM +0300, Slawa Olhovchenkov wrote:
> > On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote:
> > > Below is, I believe, the committable fix, of course supposing that
> > > the patch above worked. If you want to retest it on stable/11, ignore
> > > efirt.c chunks.
> > 
> > and remove patch w/ spinlock?
> Yes.

Which do you prefer now: should I test the spinlock patch or this patch?
Either way, a successful result needs 2-3 days of waiting.

Re: nginx and FreeBSD11

2016-09-22 Thread Slawa Olhovchenkov

On Thu, Sep 22, 2016 at 10:59:33AM +0300, Konstantin Belousov wrote:

> On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote:
> > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c
> > > > index a23468e..f754652 100644
> > > > --- a/sys/vm/vm_map.c
> > > > +++ b/sys/vm/vm_map.c
> > > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm)
> > > > if (oldvm == newvm)
> > > > return;
> > > >  
> > > > +   spinlock_enter();
> > > > /*
> > > >  * Point to the new address space and refer to it.
> > > >  */
> > > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm)
> > > >  
> > > > /* Activate the new mapping. */
> > > > pmap_activate(curthread);
> > > > +   spinlock_exit();
> > > >  
> > > > /* Remove the daemon's reference to the old address space. */
> > > > KASSERT(oldvm->vm_refcnt > 1,
> Did you tested the patch ?

I have installed it now.
A successful test needs 2-3 days.
If the test fails, the result may come quickly.

> Below is, I believe, the committable fix, of course supposing that
> the patch above worked. If you want to retest it on stable/11, ignore
> efirt.c chunks.

and remove the patch w/ spinlock?

> diff --git a/sys/amd64/amd64/efirt.c b/sys/amd64/amd64/efirt.c
> index f1d67f7..c883af8 100644
> --- a/sys/amd64/amd64/efirt.c
> +++ b/sys/amd64/amd64/efirt.c
> @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -301,6 +302,17 @@ efi_enter(void)
>   PMAP_UNLOCK(curpmap);
>   return (error);
>   }
> +
> + /*
> +  * IPI TLB shootdown handler invltlb_pcid_handler() reloads
> +  * %cr3 from the curpmap->pm_cr3, which would disable runtime
> +  * segments mappings.  Block the handler's action by setting
> +  * curpmap to impossible value.  See also comment in
> +  * pmap.c:pmap_activate_sw().
> +  */
> + if (pmap_pcid_enabled && !invpcid_works)
> + PCPU_SET(curpmap, NULL);
> +
>   load_cr3(VM_PAGE_TO_PHYS(efi_pml4_page) | (pmap_pcid_enabled ?
>   curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0));
>   /*
> @@ -317,7 +329,9 @@ efi_leave(void)
>  {
>   pmap_t curpmap;
>  
> - curpmap = PCPU_GET(curpmap);
> + curpmap = &curproc->p_vmspace->vm_pmap;
> + if (pmap_pcid_enabled && !invpcid_works)
> + PCPU_SET(curpmap, curpmap);
>   load_cr3(curpmap->pm_cr3 | (pmap_pcid_enabled ?
>   curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0));
>   if (!pmap_pcid_enabled)
> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
> index 63042e4..59e1b67 100644
> --- a/sys/amd64/amd64/pmap.c
> +++ b/sys/amd64/amd64/pmap.c
> @@ -6842,6 +6842,7 @@ pmap_activate_sw(struct thread *td)
>  {
>   pmap_t oldpmap, pmap;
>   uint64_t cached, cr3;
> + register_t rflags;
>   u_int cpuid;
>  
>   oldpmap = PCPU_GET(curpmap);
> @@ -6865,16 +6866,43 @@ pmap_activate_sw(struct thread *td)
>   pmap == kernel_pmap,
>   ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x",
>   td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid));
> +
> + /*
> +  * If the INVPCID instruction is not available,
> +  * invltlb_pcid_handler() is used for handle
> +  * invalidate_all IPI, which checks for curpmap ==
> +  * smp_tlb_pmap.  Below operations sequence has a
> +  * window where %CR3 is loaded with the new pmap's
> +  * PML4 address, but curpmap value is not yet updated.
> +  * This causes invltlb IPI handler, called between the
> +  * updates, to execute as NOP, which leaves stale TLB
> +  * entries.
> +  *
> +  * Note that the most typical use of
> +  * pmap_activate_sw(), from the context switch, is
> +  * immune to this race, because interrupts are
> +  * disabled (while the thread lock is owned), and IPI
> +  * happends after curpmap is updated.  Protect other
> +  * callers in a similar way, by disabling interrupts
> +  * around the %cr3 register reload and curpmap
> +  * assignment.
> +  */
> + if (!invpcid_works)
> + rflags = intr_disable();
> +
>   if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) {
>   load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid |
>   cached);
>   if (cached)
>   PCPU_INC(pm_save_cnt);
>   }
> + PCPU_SET(curpmap, pmap);
> + if (!invpcid_works)
> + intr_restore(rflags);
>   } else if (cr3 != pmap->pm_cr3) {
>   load_cr3(pmap->pm_cr3);
> + PCPU_SET(curpmap, pmap);
>   }
> -

Re: 11.0 stuck on high network load

2016-09-21 Thread Slawa Olhovchenkov

On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:

> 
>  You can also use Dtrace and lockstat (especially with the lockstat -s
> option):
> 
> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks
> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE
> 
>  But I am less familiar with Dtrace/lockstat tools.

I am still using the old kernel and got a lockup again.
I tried lockstat (I saved more output); the interesting part may be this:

R/W writer spin on writer: 190019 events in 1.070 seconds (177571 events/sec)

---
Count  indv cuml rcnt  nsec Lock   Caller
140839  74%  74% 0.00 24659 tcpinp tcp_tw_2msl_scan+0xc6

   nsec -- Time Distribution -- count Stack
   4096 |   913   tcp_twstart+0xa3
   8192 |   58191 tcp_do_segment+0x201f
  16384 |@@ 29594 tcp_input+0xe1c
  32768 |   23447 ip_input+0x15f
  65536 |@@@16197
 131072 |@  8674
 262144 |   3358
 524288 |   456
1048576 |   9
---
Count  indv cuml rcnt  nsec Lock   Caller
49180   26% 100% 0.00 15929 tcpinp tcp_tw_2msl_scan+0xc6

   nsec -- Time Distribution -- count Stack
   4096 |   157   pfslowtimo+0x54
   8192 |@@@24796 softclock_call_cc+0x179
  16384 |@@ 11223 softclock+0x44
  32768 |   7426  intr_event_execute_handlers+0x95
  65536 |@@ 3918
 131072 |   1363
 262144 |   278
 524288 |   19
---


> >>  #1. Try above kernel options at least once, and see what you can get.
> > 
> > OK, I am try this after some time.
> > 
> >>  #2. If #1 is a total failure try below patch:  It won't solve anything,
> >> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> >> the INP write lock.  If it makes the debugging more feasible, continue
> >> to #3.
> > 
> > OK, thanks.
> > What purpose to not skip locked tcptw in this loop?
> 
>  If I understand your question correctly:  According to your pmcstat
> result, tcp_tw_2msl_scan() currently struggles with a write lock
> (__rw_wlock_hard) and the only write lock used tcp_tw_2msl_scan() is
> INP_WLOCK.  No sign of contention on TW_RLOCK(V_tw_lock) currently.
> 
> 51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
>  100.0%  [2413083]   __rw_wlock_hard
>   100.0%  [2413083]tcp_tw_2msl_scan
> 
> --
> Julien
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: 11.0 stuck on high network load

2016-09-21 Thread Slawa Olhovchenkov

On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/20/16 10:26 PM, Slawa Olhovchenkov wrote:
> > On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote:
> >> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote:
> >>> On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote:
> >>>>
> >>>>> @ CPU_CLK_UNHALTED_CORE [4653445 samples]
> >>>>>
> >>>>> 51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
> >>>>>  100.0%  [2413083]   __rw_wlock_hard
> >>>>>   100.0%  [2413083]tcp_tw_2msl_scan
> >>>>>99.99%  [2412958] pfslowtimo
> >>>>> 100.0%  [2412958]  softclock_call_cc
> >>>>>  100.0%  [2412958]   softclock
> >>>>>   100.0%  [2412958]intr_event_execute_handlers
> >>>>>100.0%  [2412958] ithread_loop
> >>>>> 100.0%  [2412958]  fork_exit
> >>>>>00.01%  [125] tcp_twstart
> >>>>> 100.0%  [125]  tcp_do_segment
> >>>>>  100.0%  [125]   tcp_input
> >>>>>   100.0%  [125]ip_input
> >>>>>100.0%  [125] swi_net
> >>>>> 100.0%  [125]  intr_event_execute_handlers
> >>>>>  100.0%  [125]   ithread_loop
> >>>>>   100.0%  [125]fork_exit
> >>>>
> >>>>  The only write lock tcp_tw_2msl_scan() tries to get is a
> >>>> INP_WLOCK(inp).  Thus here, tcp_tw_2msl_scan() seems to be stuck
> >>>> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls
> >>>> tcp_tw_2msl_scan() at high rate but this will be quite unexpected).
> >>>>
> >>>>  Thus my hypothesis is that something is holding the INP_WLOCK and not
> >>>> releasing it, and tcp_tw_2msl_scan() is spinning on it.
> >>>>
> >>>>  If you can, could you compile the kernel with below options:
> >>>>
> >>>> options DDB               # Support DDB.
> >>>> options DEADLKRES         # Enable the deadlock resolver
> >>>> options INVARIANTS        # Enable calls of extra sanity checking
> >>>> options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
> >>>> options WITNESS           # Enable checks to detect deadlocks and cycles
> >>>> options WITNESS_SKIPSPIN  # Don't run witness on spinlocks for speed
> >>>
> >>> Currently this host run with 100% CPU load (on all cores), i.e.
> >>> enabling WITNESS will be significant drop performance.
> >>> Can I use only some subset of options?
> >>>
> >>> Also, I can some troubles to DDB enter in this case.
> >>> May be kgdb will be success (not tryed yet)?
> >>
> >>  If these kernel options will certainly slow down your kernel, they also
> >> might found the root cause of your issue before reaching the point where
> >> you have 100% cpu load on all cores (thanks to INVARIANTS).  I would
> >> suggest:
> > 
> > Hmmm, may be I am not clarified.
> > This host run at peak hours with 100% CPU load as normal operation,
> > this is for servering 2x10G, this is CPU load not result of lock
> > issuse, this is not us case. And this is because I am fear to enable
> > WITNESS -- I am fear drop performance.
> > 
> > This lock issuse happen irregulary and may be caused by other issuse
> > (nginx crashed). In this case about 1/3 cores have 100% cpu load,
> > perhaps by this lock -- I am can trace only from one core and need
> > more then hour for this (may be on other cores different trace, I
> > can't guaranted anything).
> 
>  I see, especially if you are running in production WITNESS might indeed
> be not practical for you.  In this case, I would suggest before doing
> WITNESS and still get more information to:
> 
>  #0: Do a lock profiling:
> 
> https://www.freebsd.org/cgi/man.cgi?query=LOCK_PROFILING
> 
> options LOCK_PROFILING
> 
>  Example of usage:
> 
> # Run
> $ sudo sysctl debug.lock.prof.enable=1
> $ sleep 10
> $ sudo sysctl debug.lock.prof.enable=0
> 
> # Get results
> $ sysctl debug.lock.

Re: nginx and FreeBSD11

2016-09-20 Thread Slawa Olhovchenkov

On Tue, Sep 20, 2016 at 04:00:10PM -0600, Warner Losh wrote:

> >> > > Is this sandy bridge ?
> >> >
> >> > Sandy Bridge EP
> >> >
> >> > > Show me first 100 lines of the verbose dmesg,
> >> >
> >> > After day or two, after end of this test run -- I am need to enable 
> >> > verbose.
> >> >
> >> > > I want to see cpu features lines.  In particular, does you CPU support
> >> > > the INVPCID feature.
> >> >
> >> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
> >> >   Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
> >> >   
> >> > Features=0xbfebfbff
> >> >   
> >> > Features2=0x1fbee3ff
> >> >   AMD Features=0x2c100800
> >> >   AMD Features2=0x1
> >> >   XSAVE Features=0x1
> >> >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
> >> >   TSC: P-state invariant, performance statistics
> >> >
> >> > I am don't see this feature before E5v3:
> >> >
> >> > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
> >> >   Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
> >> >   
> >> > Features=0xbfebfbff
> >> >   
> >> > Features2=0x7fbee3ff
> >> >   AMD Features=0x2c100800
> >> >   AMD Features2=0x1
> >> >   Structured Extended Features=0x281
> >> >   XSAVE Features=0x1
> >> >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
> >> >   TSC: P-state invariant, performance statistics
> >> >
> >> > (don't run 11.0 on this CPU)
> >> Ok.
> >>
> >> >
> >> > CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU)
> >> >   Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
> >> >   
> >> > Features=0xbfebfbff
> >> >   
> >> > Features2=0x7ffefbff
> >> >   AMD Features=0x2c100800
> >> >   AMD Features2=0x21
> >> >   Structured Extended 
> >> > Features=0x37ab
> >> >   XSAVE Features=0x1
> >> >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
> >> >   TSC: P-state invariant, performance statistics
> >> >
> >> > (11.0 run w/o this issuse)
> >> Do you mean that similarly configured nginx+aio do not demonstrate the 
> >> corruption on this machine ?
> >
> > Yes.
> > But different storage configuration and different pattern load.
> >
> > Also 11.0 run w/o this issuse on
> >
> > CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU)
> >   Origin="GenuineIntel"  Id=0x406f1  Family=0x6  Model=0x4f  Stepping=1
> >   
> > Features=0xbfebfbff
> >   
> > Features2=0x7ffefbff
> >   AMD Features=0x2c100800
> >   AMD Features2=0x121
> >   Structured Extended 
> > Features=0x21cbfbb
> >   XSAVE Features=0x1
> >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
> >   TSC: P-state invariant, performance statistics
> >
> > PS: all systems is dual-cpu.
> 
> Does this mean 2 cores or two sockets? We've seen a similar hang with
> the following CPU:

Two sockets. Not sure how important this is, just for the record.
Your system is also without the INVPCID feature (as in kib's question).
Maybe your case will also be resolved by vm.pmap.pcid_enabled=0?
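
One way to check this on a given machine, as a sketch: FreeBSD lists INVPCID among the "Structured Extended Features" in the boot messages when the CPU has it. The function name has_invpcid is made up here, and /var/run/dmesg.boot is assumed to hold the saved boot messages.

```sh
# has_invpcid FILE: succeed if the saved boot messages in FILE list INVPCID.
# Hypothetical helper for illustration only.
has_invpcid() {
    grep -q 'INVPCID' "$1"
}

# Assumption: the boot-time dmesg is saved in /var/run/dmesg.boot.
if has_invpcid /var/run/dmesg.boot; then
    echo "INVPCID present"
else
    echo "no INVPCID"
fi
```

A CPU that shows PCID in Features2 but no INVPCID here is the combination discussed in this thread.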

> CPU: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (2700.06-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
>  
> Features=0xbfebfbff
>  
> Features2=0x7fbee3ff
>   AMD Features=0x2c100800
>   AMD Features2=0x1
>   Structured Extended

Re: nginx and FreeBSD11

2016-09-20 Thread Slawa Olhovchenkov

On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote:

> On Tue, Sep 20, 2016 at 11:38:54PM +0300, Slawa Olhovchenkov wrote:
> > On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote:
> > 
> > > On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote:
> > > > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote:
> > > > 
> > > > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote:
> > > > > 
> > > > > > > > If this panics, then vmspace_switch_aio() is not working for
> > > > > > > > some reason.
> > > > > > > 
> > > > > > > I am try using next DTrace script:
> > > > > > > 
> > > > > > > #pragma D option dynvarsize=64m
> > > > > > > 
> > > > > > > int req[struct vmspace  *, void *];
> > > > > > > self int trace;
> > > > > > > 
> > > > > > > syscall:freebsd:aio_read:entry
> > > > > > > {
> > > > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct 
> > > > > > > aiocb));
> > > > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = 
> > > > > > > curthread->td_proc->p_pid; 
> > > > > > > }
> > > > > > > 
> > > > > > > fbt:kernel:aio_process_rw:entry
> > > > > > > {
> > > > > > > self->job = args[0];
> > > > > > > self->trace = 1;
> > > > > > > }
> > > > > > > 
> > > > > > > fbt:kernel:aio_process_rw:return
> > > > > > > /self->trace/
> > > > > > > {
> > > > > > > req[self->job->userproc->p_vmspace, 
> > > > > > > self->job->uaiocb.aio_buf] = 0;
> > > > > > > self->job = 0;
> > > > > > > self->trace = 0;
> > > > > > > }
> > > > > > > 
> > > > > > > fbt:kernel:vn_io_fault:entry
> > > > > > > /self->trace && !req[curthread->td_proc->p_vmspace, 
> > > > > > > args[1]->uio_iov[0].iov_base]/
> > > > > > > {
> > > > > > > this->buf = args[1]->uio_iov[0].iov_base;
> > > > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, 
> > > > > > > curthread->td_proc->p_vmspace, this->buf, 
> > > > > > > req[curthread->td_proc->p_vmspace, this->buf]);
> > > > > > > }
> > > > > > > ===
> > > > > > > 
> > > > > > > And don't got any messages near nginx core dump.
> > > > > > > What I can check next?
> > > > > > > May be check context/address space switch for kernel process?
> > > > > > 
> > > > > > Which CPU are you using?
> > > > > 
> > > > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class 
> > > > > CPU)
> > > Is this sandy bridge ?
> > 
> > Sandy Bridge EP
> > 
> > > Show me first 100 lines of the verbose dmesg,
> > 
> > After day or two, after end of this test run -- I am need to enable verbose.
> > 
> > > I want to see cpu features lines.  In particular, does you CPU support
> > > the INVPCID feature.
> > 
> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
> >   Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
> >   
> > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> >   
> > Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
> >   AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
> >   AMD Features2=0x1
> >   XSAVE Features=0x1
> >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
> >   TSC: P-state invariant, performance statistics
> > 
> > I am don't see this feature before E5v3:
> > 
> > CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
> >

Re: nginx and FreeBSD11

2016-09-20 Thread Slawa Olhovchenkov

On Tue, Sep 20, 2016 at 11:19:25PM +0300, Konstantin Belousov wrote:

> On Tue, Sep 20, 2016 at 10:20:53PM +0300, Slawa Olhovchenkov wrote:
> > On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote:
> > 
> > > On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote:
> > > 
> > > > > > If this panics, then vmspace_switch_aio() is not working for
> > > > > > some reason.
> > > > > 
> > > > > I am try using next DTrace script:
> > > > > 
> > > > > #pragma D option dynvarsize=64m
> > > > > 
> > > > > int req[struct vmspace  *, void *];
> > > > > self int trace;
> > > > > 
> > > > > syscall:freebsd:aio_read:entry
> > > > > {
> > > > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct 
> > > > > aiocb));
> > > > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = 
> > > > > curthread->td_proc->p_pid; 
> > > > > }
> > > > > 
> > > > > fbt:kernel:aio_process_rw:entry
> > > > > {
> > > > > self->job = args[0];
> > > > > self->trace = 1;
> > > > > }
> > > > > 
> > > > > fbt:kernel:aio_process_rw:return
> > > > > /self->trace/
> > > > > {
> > > > > req[self->job->userproc->p_vmspace, 
> > > > > self->job->uaiocb.aio_buf] = 0;
> > > > > self->job = 0;
> > > > > self->trace = 0;
> > > > > }
> > > > > 
> > > > > fbt:kernel:vn_io_fault:entry
> > > > > /self->trace && !req[curthread->td_proc->p_vmspace, 
> > > > > args[1]->uio_iov[0].iov_base]/
> > > > > {
> > > > > this->buf = args[1]->uio_iov[0].iov_base;
> > > > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, 
> > > > > curthread->td_proc->p_vmspace, this->buf, 
> > > > > req[curthread->td_proc->p_vmspace, this->buf]);
> > > > > }
> > > > > ===
> > > > > 
> > > > > And don't got any messages near nginx core dump.
> > > > > What I can check next?
> > > > > May be check context/address space switch for kernel process?
> > > > 
> > > > Which CPU are you using?
> > > 
> > > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
> Is this sandy bridge ?

Sandy Bridge EP

> Show me first 100 lines of the verbose dmesg,

In a day or two, after this test run ends; I need to enable verbose first.

> I want to see cpu features lines.  In particular, does you CPU support
> the INVPCID feature.

CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206d7  Family=0x6  Model=0x2d  Stepping=7
  
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  
Features2=0x1fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics

I don't see this feature before E5 v3:

CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz (2600.06-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306e4  Family=0x6  Model=0x3e  Stepping=4
  
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  
Features2=0x7fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics

(I don't run 11.0 on this CPU)

CPU: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (2600.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x306f2  Family=0x6  Model=0x3f  Stepping=2
  
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>

Re: 11.0 stuck on high network load

2016-09-20 Thread Slawa Olhovchenkov

On Tue, Sep 20, 2016 at 10:00:25PM +0200, Julien Charbon wrote:

> 
>  Hi Slawa,
> 
> On 9/19/16 10:43 PM, Slawa Olhovchenkov wrote:
> > On Mon, Sep 19, 2016 at 10:32:13PM +0200, Julien Charbon wrote:
> >>
> >>> @ CPU_CLK_UNHALTED_CORE [4653445 samples]
> >>>
> >>> 51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
> >>>  100.0%  [2413083]   __rw_wlock_hard
> >>>   100.0%  [2413083]tcp_tw_2msl_scan
> >>>99.99%  [2412958] pfslowtimo
> >>> 100.0%  [2412958]  softclock_call_cc
> >>>  100.0%  [2412958]   softclock
> >>>   100.0%  [2412958]intr_event_execute_handlers
> >>>100.0%  [2412958] ithread_loop
> >>> 100.0%  [2412958]  fork_exit
> >>>00.01%  [125] tcp_twstart
> >>> 100.0%  [125]  tcp_do_segment
> >>>  100.0%  [125]   tcp_input
> >>>   100.0%  [125]ip_input
> >>>100.0%  [125] swi_net
> >>> 100.0%  [125]  intr_event_execute_handlers
> >>>  100.0%  [125]   ithread_loop
> >>>   100.0%  [125]fork_exit
> >>
> >>  The only write lock tcp_tw_2msl_scan() tries to get is a
> >> INP_WLOCK(inp).  Thus here, tcp_tw_2msl_scan() seems to be stuck
> >> spinning on INP_WLOCK (or pfslowtimo() is going crazy and calls
> >> tcp_tw_2msl_scan() at high rate but this will be quite unexpected).
> >>
> >>  Thus my hypothesis is that something is holding the INP_WLOCK and not
> >> releasing it, and tcp_tw_2msl_scan() is spinning on it.
> >>
> >>  If you can, could you compile the kernel with below options:
> >>
> >> options DDB               # Support DDB.
> >> options DEADLKRES         # Enable the deadlock resolver
> >> options INVARIANTS        # Enable calls of extra sanity checking
> >> options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
> >> options WITNESS           # Enable checks to detect deadlocks and cycles
> >> options WITNESS_SKIPSPIN  # Don't run witness on spinlocks for speed
> > 
> > Currently this host run with 100% CPU load (on all cores), i.e.
> > enabling WITNESS will be significant drop performance.
> > Can I use only some subset of options?
> > 
> > Also, I can some troubles to DDB enter in this case.
> > May be kgdb will be success (not tryed yet)?
> 
>  If these kernel options will certainly slow down your kernel, they also
> might found the root cause of your issue before reaching the point where
> you have 100% cpu load on all cores (thanks to INVARIANTS).  I would
> suggest:

Hmm, maybe I was not clear.
This host runs at 100% CPU load during peak hours as normal operation
(it is serving 2x10G). That CPU load is not a result of the lock
issue; that is not our case. This is why I am afraid to enable
WITNESS: I am afraid of the performance drop.

This lock issue happens irregularly and may be caused by another issue
(nginx crashed). In this case about 1/3 of the cores have 100% CPU load,
perhaps because of this lock. I can trace only one core, and I need
more than an hour for that (other cores may show a different trace; I
can't guarantee anything).

>  #1. Try above kernel options at least once, and see what you can get.

OK, I will try this after some time.

>  #2. If #1 is a total failure try below patch:  It won't solve anything,
> it just makes tcp_tw_2msl_scan() less greedy when there is contention on
> the INP write lock.  If it makes the debugging more feasible, continue
> to #3.

OK, thanks.
What is the purpose of not skipping a locked tcptw in this loop?

> diff --git a/sys/netinet/tcp_timewait.c b/sys/netinet/tcp_timewait.c
> index a8b78f9..4206ea3 100644
> --- a/sys/netinet/tcp_timewait.c
> +++ b/sys/netinet/tcp_timewait.c
> @@ -701,34 +701,42 @@ tcp_tw_2msl_scan(int reuse)
> in_pcbref(inp);
> TW_RUNLOCK(V_tw_lock);
> 
> +retry:
> if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
> 
> -   INP_WLOCK(inp);
> -   tw = intotw(inp);
> -   if (in_pcbrele_wlocked(inp)) {
> -   KASSERT(tw == NULL, ("%s: held last inp "
> -   "reference but tw not NULL", __func__));
> -   INP_INFO_RUNLOCK(&V_tcbinfo);
> -

Re: nginx and FreeBSD11

2016-09-20 Thread Slawa Olhovchenkov

On Tue, Sep 20, 2016 at 09:52:44AM +0300, Slawa Olhovchenkov wrote:

> On Mon, Sep 19, 2016 at 06:05:46PM -0700, John Baldwin wrote:
> 
> > > > If this panics, then vmspace_switch_aio() is not working for
> > > > some reason.
> > > 
> > > I am try using next DTrace script:
> > > 
> > > #pragma D option dynvarsize=64m
> > > 
> > > int req[struct vmspace  *, void *];
> > > self int trace;
> > > 
> > > syscall:freebsd:aio_read:entry
> > > {
> > > this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
> > > req[curthread->td_proc->p_vmspace, this->aio.aio_buf] = 
> > > curthread->td_proc->p_pid; 
> > > }
> > > 
> > > fbt:kernel:aio_process_rw:entry
> > > {
> > > self->job = args[0];
> > > self->trace = 1;
> > > }
> > > 
> > > fbt:kernel:aio_process_rw:return
> > > /self->trace/
> > > {
> > > req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 
> > > 0;
> > > self->job = 0;
> > > self->trace = 0;
> > > }
> > > 
> > > fbt:kernel:vn_io_fault:entry
> > > /self->trace && !req[curthread->td_proc->p_vmspace, 
> > > args[1]->uio_iov[0].iov_base]/
> > > {
> > > this->buf = args[1]->uio_iov[0].iov_base;
> > > printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp, 
> > > curthread->td_proc->p_vmspace, this->buf, 
> > > req[curthread->td_proc->p_vmspace, this->buf]);
> > > }
> > > ===
> > > 
> > > And don't got any messages near nginx core dump.
> > > What I can check next?
> > > May be check context/address space switch for kernel process?
> > 
> > Which CPU are you using?
> 
> CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz (2000.04-MHz K8-class CPU)
> 
> > Perhaps try disabling PCID support (I think vm.pmap.pcid_enabled=0 from
> > loader prompt or loader.conf)?  (Wondering if pmap_activate() is somehow 
> > not switching)

I need some more time to test (a day or two), but for now this looks like
a workaround/solution: 12h of runtime, including the peak hour, without an
nginx crash (vm.pmap.pcid_enabled=0 in loader.conf).
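
For reference, the workaround amounts to a single loader tunable. A minimal sketch, assuming an amd64 kernel that has this tunable (the comment lines are added here for illustration):

```
# /boot/loader.conf
# Disable PCID use in the pmap; takes effect at the next boot.
vm.pmap.pcid_enabled=0
```

After the reboot, `sysctl vm.pmap.pcid_enabled` should report 0 if the setting was picked up.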