[RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. Any comments ? --HPS === sys/net/if.h == --- sys/net/if.h (revision 270138) +++ sys/net/if.h (local) @@ -239,6 +239,7 @@ #define IFCAP_RXCSUM_IPV6 0x20 /* can offload checksum on IPv6 RX */ #define IFCAP_TXCSUM_IPV6 0x40 /* can offload checksum on IPv6 TX */ #define IFCAP_HWSTATS 0x80 /* manages counters internally */ +#define IFCAP_HWTXRINGS 0x100 /* hardware supports TX rings */ #define IFCAP_HWCSUM_IPV6 (IFCAP_RXCSUM_IPV6 | IFCAP_TXCSUM_IPV6) === sys/netinet/in.c == --- sys/netinet/in.c (revision 270138) +++ sys/netinet/in.c (local) @@ -42,6 +42,7 @@ #include sys/malloc.h #include sys/priv.h #include sys/socket.h +#include sys/socketvar.h #include sys/jail.h #include sys/kernel.h #include sys/proc.h @@ -201,9 +202,23 @@ struct in_ifaddr *ia; int error; - if (ifp == NULL) - return (EADDRNOTAVAIL); + if (ifp == NULL) { + struct inpcb *inp; + switch (cmd) { + case SIOCSTXRINGID: + inp = sotoinpcb(so); + if (inp == NULL) +return (EINVAL); + INP_WLOCK(inp); + inp-inp_txringid = *(unsigned *)data; + INP_WUNLOCK(inp); + return (0); + default: + return (EADDRNOTAVAIL); + } + } + /* * Filter out 4 ioctls we implement directly. Forward the rest * to specific functions and ifp-if_ioctl(). === sys/netinet/in_pcb.h == --- sys/netinet/in_pcb.h (revision 270138) +++ sys/netinet/in_pcb.h (local) @@ -46,6 +46,7 @@ #ifdef _KERNEL #include sys/lock.h #include sys/rwlock.h +#include sys/mbuf.h #include net/vnet.h #include vm/uma.h #endif @@ -177,7 +178,8 @@ u_char inp_ip_ttl; /* (i) time to live proto */ u_char inp_ip_p; /* (c) protocol proto */ u_char inp_ip_minttl; /* (i) minimum TTL or drop */ - uint32_t inp_flowid; /* (x) flow id / queue id */ + m_flowid_t inp_flowid; /* (x) flow ID */ + m_txringid_t inp_txringid; /* (x) transmit ring ID */ u_int inp_refcount; /* (i) refcount */ void *inp_pspare[5]; /* (x) route caching / general use */ uint32_t inp_flowtype; /* (x) M_HASHTYPE value */ === sys/netinet/in_var.h == --- sys/netinet/in_var.h (revision 270138) +++ sys/netinet/in_var.h (local) @@ -33,6 +33,7 @@ #ifndef _NETINET_IN_VAR_H_ #define _NETINET_IN_VAR_H_ +#include sys/mbuf.h #include sys/queue.h #include sys/fnv_hash.h #include sys/tree.h @@ -81,6 +82,18 @@ struct sockaddr_in ifra_mask; int ifra_vhid; }; + +struct in_ratectlreq { + char ifreq_name[IFNAMSIZ]; + m_txringid_t tx_ring_id; + uint32_t min_bytes_per_interval; + uint32_t max_bytes_per_interval; + uint32_t micros_per_interval; + uint32_t mode; +#define IN_RATECTLREQ_MODE_FIXED 0 /* min rate = max rate */ +#define IN_RATECTLREQ_MODE_AUTOMATIC 1 /* bounded by min/max */ +}; + /* * Given a pointer to an in_ifaddr (ifaddr), * return a pointer to the addr as a sockaddr_in. === sys/netinet/ip_output.c == --- sys/netinet/ip_output.c (revision 270138) +++ sys/netinet/ip_output.c (local) @@ -145,6 +145,7 @@ if (inp != NULL) { INP_LOCK_ASSERT(inp); M_SETFIB(m, inp-inp_inc.inc_fibnum); + m-m_pkthdr.txringid = inp-inp_txringid; if (inp-inp_flags (INP_HW_FLOWID|INP_SW_FLOWID)) { m-m_pkthdr.flowid = inp-inp_flowid; M_HASHTYPE_SET(m, inp-inp_flowtype); === sys/netinet6/in6.c == --- sys/netinet6/in6.c (revision 270138) +++ sys/netinet6/in6.c (local) @@ -235,6 +235,23 @@ int error; u_long ocmd = cmd; + if (ifp == NULL) { + struct inpcb *inp; + + switch (cmd) { + case SIOCSTXRINGID: + inp = sotoinpcb(so); + if (inp == NULL) +
Re: android bsd connectivity tools etc ?
Shane Ambler free...@shaneware.biz wrote: da0 at umass-sim1 bus 1 scbus5 target 0 lun 0 da0: Android Adapter 0100 Removable Direct Access SCSI-2 device cd1 at umass-sim1 bus 1 scbus5 target 0 lun 1 cd1: Android Adapter 0100 Removable CD-ROM SCSI-2 device da1 at umass-sim1 bus 1 scbus5 target 0 lun 2 da1: Android Adapter 0100 Removable Direct Access SCSI-2 device On the phone I get a message to turn on usb mass storage after which I can mount_msdosfs /dev/da0s1 and get access to the sdcard in the phone. It looks like mass storage was hidden in 4.0 and maybe removed after 4.2. Try searching the android app store for usb mass storage. With Android 4.0.4: | umass3: Mass Storage on usbus1 | umass3: SCSI over Bulk-Only; quirks = 0x4000 | umass3:6:3:-1: Attached to scbus6 | da2 at umass-sim3 bus 3 scbus6 target 0 lun 0 | da2: USB 2.0 USB Flash Driver 0100 Removable Direct Access SCSI-2 device | da2: 40.000MB/s transfers | da3 at umass-sim3 bus 3 scbus6 target 0 lun 1 | da3: USB 2.0 USB Flash Driver 0100 Removable Direct Access SCSI-2 device | da3: 40.000MB/s transfers which correspond to the 'internal sdcard' and external sdcard. cheers, Jamie ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: android bsd connectivity tools etc ?
Julian H. Stacey j...@berklix.com wrote: Hi, Any tips for Android / FreeBSD BSD tools for connectivity etc ? I mainly use nfs / ssh (dropbear) / scp for connectivity over IPv6 to my local FreeBSD server. It works quite well - I even have automated cron rsync deduped backups! NFS is used for mounting my media onto /sdcard/Videos /sdcard/Music /sdcard/Pictures Not all androids come with nfs in the kernel though, Cheers, Jamie ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky h...@selasky.org wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. the patch seems to include only part of the generic code (ie no ioctls for manipulating the rates, no backend code). Do i miss something ? I have a few comments/concerns: + looks like flowid and txringid are overlapped in scope, both will be used (in the backend) to select a specific tx queue. I don't have a solution but would like to know how do you plan to address this -- does one have priority over the other, etc. + related to the above, a (possibly unavoidable) side effect of this type of changes is that mbufs explode with custom fields, so if we could perhaps make one between flowid and txringid, that would be useful. + is there a way to avoid the replicated code for SIOCSTXRINGID (the ioctl handler, i suppose). Maybe make one function and call it from both ipv4 and ipv6, assuming there aren't other places like this. + i am not particularly happy about the explosion of ioctls for setting and getting rates. Next we'll want to add scheduling, and intervals, and queue sizes and so on. For these commands outside the critical path it would be preferable a single command with an extensible structure. Bikeshed material i am sure. cheers luigi ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
Hi Luigi, On 08/20/14 11:32, Luigi Rizzo wrote: On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky h...@selasky.org wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. the patch seems to include only part of the generic code (ie no ioctls for manipulating the rates, no backend code). Do i miss something ? The IOCTLs for managing the rates are: SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL And they go to the if_ioctl callback. I have a few comments/concerns: + looks like flowid and txringid are overlapped in scope, both will be used (in the backend) to select a specific tx queue. I don't have a solution but would like to know how do you plan to address this -- does one have priority over the other, etc. Not 100% . In some cases the flowID is used differently than the txringid, though it might be possible to join the two. Would need to investigate current users of the flow ID. + related to the above, a (possibly unavoidable) side effect of this type of changes is that mbufs explode with custom fields, so if we could perhaps make one between flowid and txringid, that would be useful. Right, but ratecontrol is an in-general useful feature, especially for high throughput networks, or do you think otherwise? + is there a way to avoid the replicated code for SIOCSTXRINGID (the ioctl handler, i suppose). Maybe make one function and call it from both ipv4 and ipv6, assuming there aren't other places like this. Yes, could do that. + i am not particularly happy about the explosion of ioctls for setting and getting rates. Next we'll want to add scheduling, and intervals, and queue sizes and so on. For these commands outside the critical path it would be preferable a single command with an extensible structure. Bikeshed material i am sure. There is only one IOCTL in the critical path and that is the IOCTL to change or update the TX ring ID. The other IOCTLs are in the non-critical path towards the if_ioctl() callback. If we can merge the flowID and the txringid into one field, would it be acceptable to add an IOCTL to read/write this value for all sockets? --HPS ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky h...@selasky.org wrote: Hi Luigi, On 08/20/14 11:32, Luigi Rizzo wrote: On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky h...@selasky.org wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. ... the patch seems to include only part of the generic code (ie no ioctls for manipulating the rates, no backend code). Do i miss something ? The IOCTLs for managing the rates are: SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL And they go to the if_ioctl callback. i really think these new 'advanced' features should go through some ethtool-like API, not more ioctls. We have a strong need to design and implement such an API also to have a uniform mechanism to manipulate rss, queues and other NIC features. ... I have a few comments/concerns: + looks like flowid and txringid are overlapped in scope, both will be used (in the backend) to select a specific tx queue. I don't have a solution but would like to know how do you plan to address this -- does one have priority over the other, etc. Not 100% . In some cases the flowID is used differently than the txringid, though it might be possible to join the two. Would need to investigate current users of the flow ID. in some 10G drivers i have seen, at the driver level the flowid is used on the tx path to assign packets to a given tx queue, generally to improve cpu affinity. Of course some applications may want a true flow classifier so they do not have to re-do the classification multiple times. But then, we have a ton of different classifiers with the same need -- e.g. ipfw dynamic rules, dummynet pipe/queue id, divert ports... Pipes are stored in mtags, which are very expensive so i do see a point in embedding them in the mbufs, it's just that going this path there is no end to the list. + related to the above, a (possibly unavoidable) side effect of this type of changes is that mbufs explode with custom fields, so if we could perhaps make one between flowid and txringid, that would be useful. Right, but ratecontrol is an in-general useful feature, especially for high throughput networks, or do you think otherwise? of course i think the feature is useful, but see the previous point. We should find a way to manage it (and others) that does not pollute or require continuous changes to the struct mbuf. + i am not particularly happy about the explosion of ioctls for setting and getting rates. Next we'll want to add scheduling, and intervals, and queue sizes and so on. For these commands outside the critical path it would be preferable a single command with an extensible structure. Bikeshed material i am sure. There is only one IOCTL in the critical path and that is the IOCTL to change or update the TX ring ID. The other IOCTLs are in the non-critical path towards the if_ioctl() callback. If we can merge the flowID and the txringid into one field, would it be acceptable to add an IOCTL to read/write this value for all sockets? see above. i'd prefer an ethtool-like solution. cheers luigi ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On Wed, Aug 20, 2014 at 7:41 AM, Luigi Rizzo ri...@iet.unipi.it wrote: On Wed, Aug 20, 2014 at 3:29 PM, Hans Petter Selasky h...@selasky.org wrote: Hi Luigi, On 08/20/14 11:32, Luigi Rizzo wrote: On Wed, Aug 20, 2014 at 9:34 AM, Hans Petter Selasky h...@selasky.org wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. ... the patch seems to include only part of the generic code (ie no ioctls for manipulating the rates, no backend code). Do i miss something ? The IOCTLs for managing the rates are: SIOCARATECTL, SIOCSRATECTL, SIOCGRATECTL and SIOCDRATECTL And they go to the if_ioctl callback. i really think these new 'advanced' features should go through some ethtool-like API, not more ioctls. We have a strong need to design and implement such an API also to have a uniform mechanism to manipulate rss, queues and other NIC features. There is no ethtool equivalent yet, but exposing them through a sysctl is definitely the place to start before putting it straight in to ifconfig. The ifnet API is already a bit of a mess. ... I have a few comments/concerns: + looks like flowid and txringid are overlapped in scope, both will be used (in the backend) to select a specific tx queue. I don't have a solution but would like to know how do you plan to address this -- does one have priority over the other, etc. Not 100% . In some cases the flowID is used differently than the txringid, though it might be possible to join the two. Would need to investigate current users of the flow ID. in some 10G drivers i have seen, at the driver level the flowid is used on the tx path to assign packets to a given tx queue, generally to improve cpu affinity. Of course some applications may want a true flow classifier so they do not have to re-do the classification multiple times. But then, we have a ton of different classifiers with the same need -- e.g. ipfw dynamic rules, dummynet pipe/queue id, divert ports... Pipes are stored in mtags, which are very expensive so i do see a point in embedding them in the mbufs, it's just that going this path there is no end to the list. The purpose of the flowid was to enforce packet ordering on transmit while being large enough to store a RSS hash, potentially allowing input consumers to use it to semi-uniquely label (srcip, srcport, dstip, dstport) tuples. It seems to that the txringid would be almost entirely redundant. Why not just let users set the flowid? If we can merge the flowID and the txringid into one field, would it be acceptable to add an IOCTL to read/write this value for all sockets? That sounds reasonable - although I have not thought through all the implications. -K ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
[CFT] SSP Package Repository available
On 9/21/2013 5:49 AM, Bryan Drewery wrote: Ports now support enabling Stack Protector [1] support on FreeBSD 10 i386 and amd64, and older releases on amd64 only currently. Support may be added for earlier i386 releases once all ports properly respect LDFLAGS. To enable, just add WITH_SSP=yes to your make.conf and rebuild all ports. The default SSP_CLFAGS is -fstack-protector, but -fstack-protector-all may optionally be set instead. Please help test this on your system. We would like to eventually enable this by default, but need to identify any major ports that have run-time issues due to it. [1] https://en.wikipedia.org/wiki/Buffer_overflow_protection We have not had any feedback on this yet and want to get it enabled by default for ports and packages. We now have a repository that you can use rather than the default to help test. We need your help to identify any issues before switching the default. This repository is available for: head 10.0 9.1,9.2,9.3 It is not available for 8.4. If someone is willing to test on 8.4 I will build a repository for it. Place this in /usr/local/etc/pkgs/repos/FreeBSD_ssp.conf: FreeBSD: { enabled: no } FreeBSD_ssp: { url: pkg+http://pkg.FreeBSD.org/${ABI}/ssp;, mirror_type: srv, signature_type: fingerprints, fingerprints: /usr/share/keys/pkg, enabled: yes } Once that is done you should force reinstall packages from this repository: pkg update pkg upgrade -f Thanks for your help! Bryan Drewery On behalf of portmgr. signature.asc Description: OpenPGP digital signature
RFC: Remove pty(4)
One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. -- Davide There are no solved problems; there are only problems that are more or less solved -- Henri Poincare ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
On 8/20/14 11:00 AM, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. I don't think that we want to break userland apps pre-7.x. Do you mean just jails are broken? Or is all pre-7.x compat? I believe either is dicey. What is the reason for getting rid of cloning? What is the difficulty in maintaining the old interface? -Alfred ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
On 08/20/14 20:00, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. Hi, There is support for creating character device aliases, regarding paired devices. BTW: Can an application using libcuse fix this issue in userspace? --HPS ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
On 8/20/14, 11:00 AM, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. I agree with Alfred. breaking old jails is a no no. many people still use them. Me included. I sometimes run jails back to 1.1, not everything works, but enough does to build 1.1 binaries for certain devices. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
On Aug 20, 2014, at 11:05, Alfred Perlstein bri...@mu.org wrote: On 8/20/14 11:00 AM, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. I don't think that we want to break userland apps pre-7.x. Do you mean just jails are broken? Or is all pre-7.x compat? I believe either is dicey. What is the reason for getting rid of cloning? What is the difficulty in maintaining the old interface? Doing this would also break login shells, xterms, etc, right? Some companies I worked for built their appliance products on newer OSes, and they were based off of 6 and 7. This seems like something that deserves being tossed into the compat layer if it's something that can be converted over to the new interface. Thanks! -Garrett ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
On 8/20/2014 1:13 PM, Julian Elischer wrote: On 8/20/14, 11:00 AM, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. I agree with Alfred. breaking old jails is a no no. many people still use them. Me included. I sometimes run jails back to 1.1, not everything works, but enough does to build 1.1 binaries for certain devices. +1 to keeping compat. -- Regards, Bryan Drewery signature.asc Description: OpenPGP digital signature
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On 08/20/14 00:34, Hans Petter Selasky wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. Any comments ? Here are some thoughts. The first two bullets cover relatively minor issues, the rest are more important. - All of the mbuf pkthdr fields today have the same meaning no matter what the context. It is not clear what txringid's global meaning is. Is it even possible for driver foo to interpret it the same way as driver bar? What if the number of rings are different, or if the ring at the particular index for foo is setup differently than the ring at that same index for bar? You are attempting to influence the driver's txq selection and traditionally the mbuf's flowid has been used for this purpose. Have you considered allowing the user to set the flowid directly? And mark it as such via a new rsstype so the kernel will leave it alone. - uint32_t - m_flowid_t is plain gratuitous. Now we need to include mbuf.h in more places just to get this definition. What's the advantage of this? style(9) isn't too fond of typedefs either. Also, drivers *do* need to know the width of the flowid. At least lagg(4) looks at the high bits of the flowid (see flowid_shift in lagg). How high it can go depends on the width of the flowid. - Interfaces can come and go, routes can change, and so the relationship between an inpcb and txringid is not stable at all. What happens when the outbound route for an inpcb changes? - The in_ratectlreq structure that you propose is inadequate in its current form. For example, cxgbe's hardware can do rate limiting on a per-ring as well as per-connection basis, and it allows for pps, bandwidth, or min-max limits. I think this is the critical piece that we NIC maintainers must agree on before any code hits the core kernel: how to express a rate-limit policy in a standard way and allow for hardware assistance opportunistically. ipfw(4)'s dummynet is probably interested in this part too, so it's great that Luigi is paying attention to this thread. - The RATECTL ioctls deal with in_ratectlreq so we need to standardize the ratectlreq structure before these ioctls can be considered generic ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a private ioctl to frob cxgbe's per-queue rate-limiters. I did not want to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c ioctls. Now we have multiple drivers with i2c and melifaro@ is doing the right thing by promoting these private ioctls to a standard ifnet ioctl. Have you considered a private mlxtool as a stop gap measure? To summarize my take on all of this: we need a standard ratectlreq structure, a standard way to associate an inpcb with one, and a standard way to pass on this info to if_transmit. After all this is in place we could even have a dummynet-ish software layer that implements rate limiters when the underlying hardware offers no assistance. Regards, Navdeep ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On 08/20/14 20:44, Navdeep Parhar wrote: On 08/20/14 00:34, Hans Petter Selasky wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. Any comments ? Here are some thoughts. The first two bullets cover relatively minor issues, the rest are more important. - All of the mbuf pkthdr fields today have the same meaning no matter what the context. It is not clear what txringid's global meaning is. Is it even possible for driver foo to interpret it the same way as driver bar? What if the number of rings are different, or if the ring at the particular index for foo is setup differently than the ring at that same index for bar? You are attempting to influence the driver's txq selection and traditionally the mbuf's flowid has been used for this purpose. Have you considered allowing the user to set the flowid directly? And mark it as such via a new rsstype so the kernel will leave it alone. Hi, At work so to speak, we have tried to make a simple approach that will not break existing code, without trying to optimise the possibilities and reduce memory footprint. - uint32_t - m_flowid_t is plain gratuitous. Now we need to include mbuf.h in more places just to get this definition. What's the advantage of this? style(9) isn't too fond of typedefs either. Also, drivers *do* need to know the width of the flowid. At least lagg(4) looks at the high bits of the flowid (see flowid_shift in lagg). How high it can go depends on the width of the flowid. The flowid should be typedef'ed. Else how can you know its type passing flowid along function arguments and so on? - Interfaces can come and go, routes can change, and so the relationship between an inpcb and txringid is not stable at all. What happens when the outbound route for an inpcb changes? This is managed separately by a daemon or such. The problem about using the inpcb approach which you are suggesting, is that you limit the rate control feature to traffic which is bound by sockets. Can your way of doing rate control be useful to non-socket based firewall applications, for example? You also assume a 1:1 mapping between inpcb and the flowID, right. What about M:N mappings, where multiple streams should share the same flowID, because it makes more sense? - The in_ratectlreq structure that you propose is inadequate in its current form. For example, cxgbe's hardware can do rate limiting on a per-ring as well as per-connection basis, and it allows for pps, bandwidth, or min-max limits. I think this is the critical piece that we NIC maintainers must agree on before any code hits the core kernel: how to express a rate-limit policy in a standard way and allow for hardware assistance opportunistically. ipfw(4)'s dummynet is probably interested in this part too, so it's great that Luigi is paying attention to this thread. My in_ratectlreq is a work in progress. - The RATECTL ioctls deal with in_ratectlreq so we need to standardize the ratectlreq structure before these ioctls can be considered generic ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a private ioctl to frob cxgbe's per-queue rate-limiters. I did not want to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c ioctls. Now we have multiple drivers with i2c and melifaro@ is doing the right thing by promoting these private ioctls to a standard ifnet ioctl. Have you considered a private mlxtool as a stop gap measure? It might end that we need to create our own tool for this, having vendor specific IOCTLs, if we cannot agree how to do this in a general way. To summarize my take on all of this: we need a standard ratectlreq structure, Agree. a standard way to associate an inpcb with one, Maybe. and a standard way to pass on this info to if_transmit. Agree. After all
Re: RFC: Remove pty(4)
On Wed, Aug 20, 2014 at 11:00:14AM -0700, Davide Italiano wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. FreeBSD EOL, but still working. # uname -a FreeBSD XXX 5.4-STABLE FreeBSD 5.4-STABLE #3: Fri Mar 29 13:52:44 MSK 2013 slw@:/usr/obj/usr/src/sys/RADIUS i386 And may be exist software for this version, don't exist for modern OS. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. Waht about porting software, relaying on /dev/ptyXX and /dev/ttyXX? ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
On 08/20/14 12:25, Hans Petter Selasky wrote: On 08/20/14 20:44, Navdeep Parhar wrote: On 08/20/14 00:34, Hans Petter Selasky wrote: Hi, A month has passed since the last e-mail on this topic, and in the meanwhile some new patches have been created and tested: Basically the approach has been changed a little bit: - The creation of hardware transmit rings has been made independent of the TCP stack. This allows firewall applications to forward traffic into hardware transmit rings aswell, and not only native TCP applications. This should be one more reason to get the feature into the kernel. - A hardware transmit ring basically can have two modes: FIXED-RATE or AUTOMATIC-RATE. In the fixed rate mode all traffic is sent at a fixed bytes per second rate. In the automatic mode you can configure a time after which the TX queue must be empty. The hardware driver uses this to configure the actual rate. In automatic mode you can also set an upper and lower transmit rate limit. - The MBUF has got a new field in the packet header: txringid - IOCTLs for TCP v4 and v6 sockets has been updated to allow setting of the txringid field in the mbuf. The current patch [see attachment] should be much simpler and less intrusive than the previous one. Any comments ? Here are some thoughts. The first two bullets cover relatively minor issues, the rest are more important. - All of the mbuf pkthdr fields today have the same meaning no matter what the context. It is not clear what txringid's global meaning is. Is it even possible for driver foo to interpret it the same way as driver bar? What if the number of rings are different, or if the ring at the particular index for foo is setup differently than the ring at that same index for bar? You are attempting to influence the driver's txq selection and traditionally the mbuf's flowid has been used for this purpose. Have you considered allowing the user to set the flowid directly? And mark it as such via a new rsstype so the kernel will leave it alone. Hi, At work so to speak, we have tried to make a simple approach that will not break existing code, without trying to optimise the possibilities and reduce memory footprint. - uint32_t - m_flowid_t is plain gratuitous. Now we need to include mbuf.h in more places just to get this definition. What's the advantage of this? style(9) isn't too fond of typedefs either. Also, drivers *do* need to know the width of the flowid. At least lagg(4) looks at the high bits of the flowid (see flowid_shift in lagg). How high it can go depends on the width of the flowid. The flowid should be typedef'ed. Else how can you know its type passing flowid along function arguments and so on? It's just a simple 32 bit unsigned int and all drivers know exactly what it is. I don't think we need type checking for trivial stuff like this. We trust code to do the right thing and that's the correct tradeoff here, in my opinion. Or else we'd end up with errno_t, fd_t, etc. and programming in C would not be fun anymore. Here's a hyperbolic example: errno_t socket(domain_t domain, socktype_t type, protocol_t protocol); (oops, it returns an int -1 or 0 so errno_t is not strictly correct, but you get my point). - Interfaces can come and go, routes can change, and so the relationship between an inpcb and txringid is not stable at all. What happens when the outbound route for an inpcb changes? This is managed separately by a daemon or such. The problem about using the inpcb approach which you are suggesting, is that you limit the rate control feature to traffic which is bound by sockets. Can your way of doing rate control be useful to non-socket based firewall applications, for example? You also assume a 1:1 mapping between inpcb and the flowID, right. What about M:N mappings, where multiple streams should share the same flowID, because it makes more sense? You're right that an inpcb based scheme won't work for non-socket based firewall. inpcb represents an endpoint, almost always with an associated socket, and it mostly has a 1:1 relation with an n-tuple (SO_LISTEN and UDP sockets with no default destination are notable exceptions). If you're talking of non-socket based firewalls, then where is the inpcb coming from? Firewalls typically keep their own state for the n-tuples that they are interested in. It almost seems like you need a n-tuple - rate_limit mapping scheme instead of inpcb - rate_limit. Regards, Navdeep - The in_ratectlreq structure that you propose is inadequate in its current form. For example, cxgbe's hardware can do rate limiting on a per-ring as well as per-connection basis, and it allows for pps, bandwidth, or min-max limits. I think this is the critical piece that we NIC maintainers must agree on before any code hits the core kernel: how to express a rate-limit policy in a standard way and allow for hardware assistance
Re: [RFC] Add support for hardware transmit rate limiting queues [WAS: Add support for changing the flow ID of TCP connections]
- uint32_t - m_flowid_t is plain gratuitous. Now we need to include mbuf.h in more places just to get this definition. What's the advantage of this? style(9) isn't too fond of typedefs either. Also, drivers *do* need to know the width of the flowid. At least lagg(4) looks at the high bits of the flowid (see flowid_shift in lagg). How high it can go depends on the width of the flowid. The flowid should be typedef'ed. Else how can you know its type passing flowid along function arguments and so on? I agree with Navdeep. It's usage should be obvious from context. This just pollutes the namespace. - Interfaces can come and go, routes can change, and so the relationship between an inpcb and txringid is not stable at all. What happens when the outbound route for an inpcb changes? This is managed separately by a daemon or such. No it's not. Currently, unless you're using flowtables, the route and llentry are looked up for every single outbound packet. Most users are lightly enough loaded that they don't see the potential 8x reduction in pps from the added overhead. The problem about using the inpcb approach which you are suggesting, is that you limit the rate control feature to traffic which is bound by sockets. Can your way of doing rate control be useful to non-socket based firewall applications, for example? You also assume a 1:1 mapping between inpcb and the flowID, right. What about M:N mappings, where multiple streams should share the same flowID, because it makes more sense? That doesn't make any sense to me. FlowIDs are not a limited resource like 8-bit ASIDs where clever resource management was required. An M:N mapping would permit arbitrary interleaving of multiple streams which simply doesn't seem useful unless there is some critical case where it is a huge performance win. In general it is adding complexity for a gratuitous generalization. - The in_ratectlreq structure that you propose is inadequate in its current form. For example, cxgbe's hardware can do rate limiting on a per-ring as well as per-connection basis, and it allows for pps, bandwidth, or min-max limits. I think this is the critical piece that we NIC maintainers must agree on before any code hits the core kernel: how to express a rate-limit policy in a standard way and allow for hardware assistance opportunistically. ipfw(4)'s dummynet is probably interested in this part too, so it's great that Luigi is paying attention to this thread. My in_ratectlreq is a work in progress. Which means that it probably makes sense to not impinge upon the core system until it is more refined. - The RATECTL ioctls deal with in_ratectlreq so we need to standardize the ratectlreq structure before these ioctls can be considered generic ifnet ioctls. This is the reason cxgbetool (and not ifconfig) has a private ioctl to frob cxgbe's per-queue rate-limiters. I did not want to add ifnet ioctls that in reality were cxgbe only. Ditto for i2c ioctls. Now we have multiple drivers with i2c and melifaro@ is doing the right thing by promoting these private ioctls to a standard ifnet ioctl. Have you considered a private mlxtool as a stop gap measure? It might end that we need to create our own tool for this, having vendor specific IOCTLs, if we cannot agree how to do this in a general way. To summarize my take on all of this: we need a standard ratectlreq structure, Agree. a standard way to associate an inpcb with one, Maybe. Associating it with an inpcb doesn't exclude adding a mechanism for supporting it in firewalls. -K ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
gbde destroy doesn't match man page?
Hi, Playing with GBDE for my FreeBSD disk book, on: # uname -a FreeBSD storm 11.0-CURRENT FreeBSD 11.0-CURRENT #6 r269010: Wed Jul 23 11:13:17 EDT 2014 mwlucas@storm:/usr/obj/usr/src/sys/GENERIC amd64 According to the man page, I should be able to destroy all copies of the key with gbde destroy device -n -1. It's in the examples. When I try it I get: # gbde destroy da0p1 -n -1 gbde: illegal option -- n usage: gbde attach destination [-k keyfile] [-l lockfile] [-p pass-phrase] gbde detach destination gbde init destination [-i] [-f filename] [-K new-keyfile] [-L new-lockfile] [-P new-pass-phrase] gbde setkey destination [-n key] [-k keyfile] [-l lockfile] [-p pass-phrase] [-K new-keyfile] [-L new-lockfile] [-P new-pass-phrase] gbde nuke destination [-n key] [-k keyfile] [-l lockfile] [-p pass-phrase] gbde destroy destination [-k keyfile] [-l lockfile] [-p pass-phrase] Anyone know if this is a software bug or a doc bug? Thanks, ==ml -- Michael W. Lucas - mwlu...@michaelwlucas.com, Twitter @mwlauthor http://www.MichaelWLucas.com/, http://blather.MichaelWLucas.com/ ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: RFC: Remove pty(4)
Sam Fourman Jr. On Aug 20, 2014 1:00 PM, Davide Italiano dav...@freebsd.org wrote: One of my personal goals for 11 is to get rid of cloning mechanism entirely, and pty(4) is one of the few in-kernel drivers still relying on such mechanism. It's not possible, at least to my understanding, converting pty(4) to cdevpriv(9) as happened with other drivers. This is mainly because we always need a pair of devices (/dev/ptyXX and /dev/ttyXX) and userspace loops over ptyXX and after it successfully opens it tries to open the other one with the same suffix. So, having a single device is not really enough. My option, instead, is that of removing pty(4), which is nothing more than a compatibility driver, and move pmtx(4) code somewhere else. The main drawback of the removal of this is that it makes impossible to run FreeBSD = 7 jails and SSH into them. I personally don't consider this a huge issue, in light of the fact that FreeBSD-7 has been EOL for a long time, but I would like to hear other people comments. The code review for the proposed change can be found here: https://reviews.freebsd.org/D659 If I won't get any objection I'll commit this in one week time, i.e. August 27th. -- Davide I am all for the advancement of FreeBSD, but I for one maintain appliance products based on 7.x, most of the time I vote for out with the old in with the new... But are we certain all options for keeping compat have been explored? Just my 2c Sam Fourman Jr. There are no solved problems; there are only problems that are more or less solved -- Henri Poincare ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org