Re: [RFC] Allow m_dup() to use JUMBO clusters
Hi,

Would it be better if my patch used the PAGE_SIZE clusters instead of the
16K ones? Then it should not be affected by memory fragmentation. Thanks
for shedding some light on this area.

--HPS

Hi,

Updated patch attached.

--HPS

=== sys/kern/uipc_mbuf.c
==
--- sys/kern/uipc_mbuf.c	(revision 268358)
+++ sys/kern/uipc_mbuf.c	(local)
@@ -917,7 +917,15 @@
 		struct mbuf *n;
 
 		/* Get the next new mbuf */
-		if (remain >= MINCLSIZE) {
+		if (remain >= MJUMPAGESIZE) {
+			/*
+			 * By allocating a bigger mbuf, we get fewer
+			 * scatter gather entries for the hardware to
+			 * process:
+			 */
+			n = m_getjcl(how, m->m_type, 0, MJUMPAGESIZE);
+			nsize = MJUMPAGESIZE;
+		} else if (remain >= MINCLSIZE) {
 			n = m_getcl(how, m->m_type, 0);
 			nsize = MCLBYTES;
 		} else {
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
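The effect of the size selection in the patch can be sketched in userspace C. This is only an illustration of the arithmetic, not kernel code: the SK_* constants are assumed placeholder values (the real ones come from sys/param.h and sys/mbuf.h and vary by architecture), sk_count_buffers() is a hypothetical helper, and the packet-header mbuf that m_dup() allocates first is ignored.

```c
#include <stddef.h>

/* Assumed placeholder sizes; the real values are architecture dependent. */
#define SK_MLEN         224   /* data bytes in a plain mbuf (assumption) */
#define SK_MINCLSIZE    209   /* smallest length worth a cluster (assumption) */
#define SK_MCLBYTES     2048  /* standard cluster */
#define SK_MJUMPAGESIZE 4096  /* page-sized jumbo cluster */

/*
 * Hypothetical helper: count how many buffers a copy loop shaped like the
 * one in the patch would allocate for `len` bytes. With use_jumbo == 0 it
 * mimics the old two-way choice, with use_jumbo != 0 the patched three-way
 * choice.
 */
static size_t
sk_count_buffers(size_t len, int use_jumbo)
{
	size_t count = 0;

	while (len > 0) {
		size_t nsize;

		if (use_jumbo && len >= SK_MJUMPAGESIZE)
			nsize = SK_MJUMPAGESIZE;
		else if (len >= SK_MINCLSIZE)
			nsize = SK_MCLBYTES;
		else
			nsize = SK_MLEN;

		/* Each buffer absorbs at most nsize bytes of the remainder. */
		len -= (len < nsize) ? len : nsize;
		count++;
	}
	return (count);
}
```

Under these assumptions, calling sk_count_buffers(65536, 0) and sk_count_buffers(65536, 1) shows the entry count for a 64 KB chain halving when page-sized clusters are allowed.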
ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15
Hi All,

Just FYI, since nothing relevant was found on Google.

I was upgrading my CURRENT system to r268233 from a one-or-two-week-old
system. The system was created years ago and had rather old zfsboot code.
So, after upgrading and rebooting, I got this error right after BIOS POST:
-----
ZFS: unsupported compression algorithm 15
-----
... and an instant reboot.

OK, I booted from a USB stick and ran the following command (the system in
question is at /dev/ada1, and the system on the USB stick is also a rather
new CURRENT):
-----
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
-----

Now my system is fine again:
-----
% uname -a
FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233: Fri Jul 4 06:41:28 SAMT 2014 b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X amd64
-----

-- 
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve
Re: ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15
On 07/08/2014 10:47, Boris Samorodov wrote:
> Hi All,
>
> Just FIY since nothing relevant was found at google.
>
> I was upgrading my CURRENT system to rev r268233 from a one-or-two
> weeks old system. The system was created years ago and had rather old
> zfsboot code. So, after upgrading and rebooting I got the error right
> after BIOS POST...:
> -----
> ZFS: unsupported compression algorithm 15
> -----
> ... and instant reboot.
>
> OK, I've booted from a USB stick, run the command (the system in
> question is at /dev/ada1 and the system at USB stick is rather new
> CURRENT also):
> -----
> # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
> -----
>
> Now my system is fine again:
> -----
> % uname -a
> FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233: Fri Jul 4 06:41:28 SAMT 2014 b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X amd64
> -----

Did you do a 'zpool upgrade' when you updated your system? The new
features shouldn't be enabled without you having done that.

When you DO 'zpool upgrade' it specifically warns you to update the boot
code for this reason.

-- 
Allan Jude
Re: vidcontrol(1) complains about Bad magic, in base/head, amd64, sc console, r268165
On Sun, 6 Jul 2014 16:33+0300, Aleksandr Rybalko wrote:
> Hi,
>
> so if i get it right, you get expected results, right?
> If so, please check key combinations which is different, to get
> correct results. And if all is ok, send me new maps please.
> If it is not correct, let us know what is wrong.
>
> Thanks a lot!

Sorry for the delay.

I followed my heart and looked carefully at the Norwegian layout of my
keyboard. The layout is fairly up to date, although it has the wavy
Windows symbol and not the trapezoid-shaped Windows symbol you'll find
nowadays. ;-)

I decided it would be nice to have access to the euro symbol, but not in
its current location, as the letter é has historically been reached by
holding AltGr and hitting the lowercase e key. Instead, I figured it made
some sense to have the euro symbol available on AltGr+Shift+4. That's
where you'll find the 8364 value, line 11.

The remainder is virtually unchanged, except for a correction of
whitespace on line 109. A tab was changed into three spaces.

Feel free to use a better, more suitable name for the attached keymap.

I know there are some Norwegian-speaking individuals on this list, and I
welcome their input.

-- 
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,       | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+

# $FreeBSD: head/share/syscons/keymaps/norwegian.iso.kbd 117271 2003-07-06 03:09:40Z ache $
#                                                         alt
# scan                       cntrl          alt    alt   cntrl lock
# code  base   shift  cntrl  shift  alt    shift  cntrl  shift state
# ------------------------------------------------------------------
  000   nop    nop    nop    nop    nop    nop    nop    nop     O
  001   esc    esc    esc    esc    esc    esc    debug  esc     O
  002   '1'    '!'    nop    nop    '1'    '!'    nop    nop     O
  003   '2'    '"'    nul    nul    '@'    '@'    nul    nul     O
  004   '3'    '#'    nop    nop    163    '#'    nop    nop     O
  005   '4'    164    nop    nop    '$'    8364   nop    nop     O
  006   '5'    '%'    nop    nop    '5'    '%'    nop    nop     O
  007   '6'    '&'    nop    nop    '6'    '&'    nop    nop     O
  008   '7'    '/'    nop    nop    '{'    '/'    nop    nop     O
  009   '8'    '('    esc    esc    '['    '('    esc    esc     O
  010   '9'    ')'    gs     gs     ']'    ')'    gs     gs      O
  011   '0'    '='    nop    nop    '}'    '='    nop    nop     O
  012   '+'    '?'    nop    nop    '+'    '?'    nop    nop     O
  013   '\'    '`'    fs     fs     '''    nop    nop    nop     O
  014   bs     bs     del    del    bs     bs     del    del     O
  015   ht     btab   nop    nop    ht     btab   nop    nop     O
  016   'q'    'Q'    dc1    dc1    'q'    'Q'    dc1    dc1     C
  017   'w'    'W'    etb    etb    'w'    'W'    etb    etb     C
  018   'e'    'E'    enq    enq    233    201    enq    enq     C
  019   'r'    'R'    dc2    dc2    174    174    dc2    dc2     C
  020   't'    'T'    dc4    dc4    254    222    dc4    dc4     C
  021   'y'    'Y'    em     em     255    165    em     em      C
  022   'u'    'U'    nak    nak    252    220    nak    nak     C
  023   'i'    'I'    ht     ht     239    207    ht     ht      C
  024   'o'    'O'    si     si     242    210    si     si      C
  025   'p'    'P'    dle    dle    182    182    dle    dle     C
  026   229    197    nop    nop    '}'    ']'    nop    nop     C
  027   168    '^'    rs     rs     '~'    '^'    rs     rs      O
  028   cr     cr     nl     nl     cr     cr     nl     nl      O
  029   lctrl  lctrl  lctrl  lctrl  lctrl  lctrl  lctrl  lctrl   O
  030   'a'    'A'    soh    soh    225    193    soh    soh     C
  031   's'    'S'    dc3    dc3    223    223    dc3    dc3     C
  032   'd'    'D'    eot    eot    240    208    eot    eot     C
  033   'f'    'F'    ack    ack    170    170    ack    ack     C
  034   'g'    'G'    bel    bel    'g'    'G'    bel    bel     C
  035   'h'    'H'    bs     bs     'h'    'H'    bs     bs      C
  036   'j'    'J'    nl     nl     'j'    'J'    nl     nl      C
  037   'k'    'K'    vt     vt     'k'    'K'    vt     vt      C
  038   'l'    'L'    ff     ff     'l'    'L'    ff     ff      C
  039   248    216    nop    nop    '|'    '\'    nop    nop     C
  040   230    198    nop    nop    '{'    '['    nop    nop     C
  041   '|'    167    nop    nop    166    182    nop    nop     O
  042   lshift lshift lshift lshift lshift lshift lshift lshift  O
[RFC] Add support for changing the flow ID of TCP connections
Hi,

I'm working on a new feature which will allow TCP connections to be
timing-controlled by the ethernet hardware driver, in this case the mlxen
driver. The main missing piece in the kernel is to allow the mbuf's
flowid value to be overwritten in struct inpcb once the connection is
established, and to have a callback once the TCP connection is gone so
that the assigned flowid can be freed by the ethernet hardware driver.

The flowid will be used to assign the outgoing data traffic of a specific
TCP connection to a hardware-controlled queue, which is configured in
advance with certain parameters about the timing of the transmitted
packets.

To be able to set the flowid, I'm using existing functions in the kernel
TCP code to look up the inpcb structure based on the 4-tuple, via the
ifp->if_ioctl() callback of the network adapter. I'm also registering a
function method table so that I get a callback when the TCP connection is
gone.

At this point of development I would like to get some feedback from the
FreeBSD network guys about my attached patch proposal.

The motivation for this work is to have more reliable TCP transmission,
typically for fixed-rate media content going some distance. To illustrate
this I will give you an example from the world of VoIP, which uses UDP.
When doing long-distance VoIP calls through various unknown networks and
routers, it makes a very big difference whether you are sending data 20ms
apart or 40ms apart, even at the exact same rate. In the one case you
might experience a bunch of packet drops, and in the other case
everything is fine. Why? Because both the number of packets you send per
second and their timing are important. The goal is to apply some timing
rules for TCP, to increase the chance of successful transmission and to
reduce the amount of data loss. For high-throughput applications we want
to do this by means of hardware.

While at it I would like to typedef the flowid used by mbufs, struct
inpcb and many more places.
Where would the right place be to put such a definition? In sys/mbuf.h?

Comments are appreciated!

--HPS

=== sys/netinet/in_pcb.c
==
--- sys/netinet/in_pcb.c	(revision 268358)
+++ sys/netinet/in_pcb.c	(local)
@@ -1173,6 +1173,100 @@
 }
 
 /*
+ * in_pcb_handle_ratectlreq - this function sets the hardware flow ID
+ * for a given IPv4 connection based on the input arguments.
+ *
+ * Return values:
+ *  0: Success
+ *  Non-zero: Failure
+ */
+int
+in_pcb_handle_ratectlreq(struct ifnet *ifp, struct in_ratectlreq *req,
+    const struct in_flowid_methods *mtod, void *arg)
+{
+	struct inpcb *inp;
+	int error;
+
+	if (ifp == NULL || req == NULL || mtod == NULL ||
+	    mtod->inf_alloc == NULL || mtod->inf_rateset == NULL ||
+	    mtod->inf_free == NULL)
+		return (EINVAL);
+
+	inp = in_pcblookup(&V_tcbinfo,
+	    req->ifreq_dst.sin_addr, req->ifreq_dst.sin_port,
+	    req->ifreq_src.sin_addr, req->ifreq_src.sin_port,
+	    INPLOOKUP_WLOCKPCB, ifp);
+	if (inp == NULL)
+		return (ENOENT);
+
+	INP_WLOCK_ASSERT(inp);
+
+	if (inp->inp_flowid_mtod == NULL) {
+		error = mtod->inf_alloc(arg, inp->inp_flowid);
+		if (error != 0)
+			goto done;
+		inp->inp_flowid_mtod = mtod;
+		inp->inp_flowid_arg = arg;
+		/* ensure that the flow ID is not overwritten */
+		inp->inp_flags |= INP_HW_FLOWID;
+		inp->inp_flags &= ~INP_SW_FLOWID;
+		inp->inp_flowtype = M_HASHTYPE_NONE;
+	}
+	error = inp->inp_flowid_mtod->inf_rateset(inp->inp_flowid_arg,
+	    inp->inp_flowid, req->ifreq_baudrate);
+done:
+	INP_WUNLOCK(inp);
+	return (error);
+}
+
+/*
+ * in6_pcb_handle_ratectlreq - this function sets the hardware flow ID
+ * for a given IPv6 connection based on the input arguments.
+ *
+ * Return values:
+ *  0: Success
+ *  Non-zero: Failure
+ */
+int
+in6_pcb_handle_ratectlreq(struct ifnet *ifp, struct in6_ratectlreq *req,
+    const struct in_flowid_methods *mtod, void *arg)
+{
+	struct inpcb *inp;
+	int error;
+
+	if (ifp == NULL || req == NULL || mtod == NULL ||
+	    mtod->inf_alloc == NULL || mtod->inf_rateset == NULL ||
+	    mtod->inf_free == NULL)
+		return (EINVAL);
+
+	inp = in6_pcblookup(&V_tcbinfo,
+	    &req->ifreq_dst.sin6_addr, req->ifreq_dst.sin6_port,
+	    &req->ifreq_src.sin6_addr, req->ifreq_src.sin6_port,
+	    INPLOOKUP_WLOCKPCB, ifp);
+	if (inp == NULL)
+		return (ENOENT);
+
+	INP_WLOCK_ASSERT(inp);
+
+	if (inp->inp_flowid_mtod == NULL) {
+		error = mtod->inf_alloc(arg, inp->inp_flowid);
+		if (error != 0)
+			goto done;
+		inp->inp_flowid_mtod = mtod;
+		inp->inp_flowid_arg = arg;
+		/* ensure that the flow ID is not overwritten */
+		inp->inp_flags |= INP_HW_FLOWID;
+		inp->inp_flags &= ~INP_SW_FLOWID;
+		inp->inp_flowtype = M_HASHTYPE_NONE;
+	}
+	error = inp->inp_flowid_mtod->inf_rateset(inp->inp_flowid_arg,
+	    inp->inp_flowid, req->ifreq_baudrate);
+done:
+	INP_WUNLOCK(inp);
+	return (error);
+}
+
+/*
  * Unconditionally schedule an inpcb to be freed by decrementing its
  * reference count, which should occur only after the inpcb
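
The lifecycle that the proposal describes (allocate a flow ID on the first request, update the rate on later requests, release the ID when the connection goes away) can be sketched in plain userspace C. Everything below is hypothetical and only illustrates the callback-table pattern: the sk_* names are made up, there is no locking, and the alloc callback is assumed to return the flow ID through a pointer, which the patch text above does not confirm.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical method table, loosely modelled on the three callbacks. */
struct sk_flowid_methods {
	int  (*alloc)(void *arg, uint32_t *flowid);
	int  (*rateset)(void *arg, uint32_t flowid, uint64_t rate);
	void (*free)(void *arg, uint32_t flowid);
};

/* Stand-in for the per-connection state kept in the inpcb. */
struct sk_conn {
	uint32_t flowid;
	const struct sk_flowid_methods *mtod;  /* NULL until first request */
	void *arg;
};

/* First call allocates a flow ID; every call (re)sets the rate. */
static int
sk_handle_ratectl(struct sk_conn *c, const struct sk_flowid_methods *mtod,
    void *arg, uint64_t rate)
{
	int error;

	if (c == NULL || mtod == NULL || mtod->alloc == NULL ||
	    mtod->rateset == NULL || mtod->free == NULL)
		return (EINVAL);

	if (c->mtod == NULL) {
		error = mtod->alloc(arg, &c->flowid);
		if (error != 0)
			return (error);
		c->mtod = mtod;
		c->arg = arg;
	}
	return (c->mtod->rateset(c->arg, c->flowid, rate));
}

/* Called when the connection is gone: hand the flow ID back. */
static void
sk_conn_close(struct sk_conn *c)
{
	if (c->mtod != NULL) {
		c->mtod->free(c->arg, c->flowid);
		c->mtod = NULL;
	}
}

/* Toy "driver" that just counts flow IDs in use. */
struct sk_toy_driver {
	uint32_t next_id;
	unsigned in_use;
	uint64_t last_rate;
};

static int
sk_toy_alloc(void *arg, uint32_t *flowid)
{
	struct sk_toy_driver *d = arg;

	*flowid = d->next_id++;
	d->in_use++;
	return (0);
}

static int
sk_toy_rateset(void *arg, uint32_t flowid, uint64_t rate)
{
	struct sk_toy_driver *d = arg;

	(void)flowid;
	d->last_rate = rate;
	return (0);
}

static void
sk_toy_free(void *arg, uint32_t flowid)
{
	struct sk_toy_driver *d = arg;

	(void)flowid;
	d->in_use--;
}

static const struct sk_flowid_methods sk_toy_methods = {
	sk_toy_alloc, sk_toy_rateset, sk_toy_free
};
```

Storing the table pointer and argument per connection is what lets the close path call back into the right driver without knowing which one it is.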
Re: ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15
08.07.2014 19:25, Allan Jude wrote:
> On 07/08/2014 10:47, Boris Samorodov wrote:
>> Hi All,
>>
>> Just FIY since nothing relevant was found at google.
>>
>> I was upgrading my CURRENT system to rev r268233 from a one-or-two
>> weeks old system. The system was created years ago and had rather old
>> zfsboot code. So, after upgrading and rebooting I got the error right
>> after BIOS POST...:
>> -----
>> ZFS: unsupported compression algorithm 15
>> -----
>> ... and instant reboot.
>>
>> OK, I've booted from a USB stick, run the command (the system in
>> question is at /dev/ada1 and the system at USB stick is rather new
>> CURRENT also):
>> -----
>> # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
>> -----
>>
>> Now my system is fine again:
>> -----
>> % uname -a
>> FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233: Fri Jul 4 06:41:28 SAMT 2014 b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X amd64
>> -----
>
> Did you do a 'zpool upgrade' when you updated your system? The new
> features shouldn't be enabled without you having done that.

Nope. My commands (from remote console):
-----
# make -C /usr/src installkernel
# make -C /usr/src installworld
# mergemaster
# make -C /usr/src delete-old
# make -C /usr/src delete-old-libs
# shutdown -r now && exit
BOOM!
-----

> When you DO 'zpool upgrade' it specifically warns you to update the boot
> code for this reason.

-- 
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve
Re: [RFC] Add support for changing the flow ID of TCP connections
On 07/08/14 10:46, Hans Petter Selasky wrote:
> Hi,
>
> I'm working on a new feature which will allow TCP connections to be
> timing controlled by the ethernet hardware driver, actually the mlxen
> driver. The main missing piece in the kernel is to allow the mbuf's
> flowid value to be overwritten in struct inpcb once the connection is
> established and to have a callback once the TCP connection is gone so
> that the assigned flowid can be freed by the ethernet hardware driver.
>
> The flowid will be used to assign the outgoing data traffic of a
> specific TCP connections to a hardware controlled queue, which in
> advance contain certain parameters about the timing for the transmitted
> packets.
>
> To be able to set the flowid I'm using existing functions in the kernel
> TCP code to lookup the inpcb structure based on the 4-tuple, via the
> ifp->if_ioctl() callback of the network adapter. I'm also registering a
> function method table so that I get a callback when the TCP connection
> is gone.
>
> A this point of development I would like to get some feedback from
> FreeBSD network guys about my attached patch proposal.
>
> The motivation for this work is to have a more reliable TCP
> transmissions typically for fixed-rate media content going some
> distance. To illustrate this I will give you an example from the world
> of VoIP, which is using UDP. When doing long-distance VoIP calls through
> various unknown networks and routers it makes a very big difference if
> you are sending data 20ms apart or 40ms apart, even at the exact same
> rate. In the one case you might experience a bunch of packet drops, and
> in the other case, everything is fine. Why? Because the number of
> packets you send per second, and the timing is important. The goal is to
> apply some timing rules for TCP, to increase the factor of successful
> transmission, and to reduce the amount of data loss. For high throughput
> applications we want to do this by means of hardware.
>
> While at it I would like to typedef the flowid used by mbufs, struct
> inpcb and many more places. Where would the right place be to put such
> a definition? In sys/mbuf.h?
>
> Comments are appreciated!

I think we need to design this to be as generic as possible. I have quite
a bit of code that does this stuff but I haven't pushed it upstream or
even offered it for review (yet). cxgbe(4) hardware does throttling and
traffic pacing too, but it's not limited to TCP, and it can do it per
queue or per flow -- you can limit a tx queue or an individual flow to a
packet-per-second limit or a bandwidth ceiling. This works for both plain
NIC traffic (TCP, UDP, whatever) and stateful TCP offload. For TCP (NIC
or TOE) the chip can even rewrite the TCP timestamp to account for the
extra time that the chip/driver held the packet because it was asked to
slow down a flow.

The per-queue stuff is handled via a driver-specific tool (cxgbetool).

For per-flow throttling my implementation adds a new sockopt
(SO_TX_THROTTLE) that lets an application specify a throttle rate for a
socket. The kernel allocates a flow identifier for each such socket, and
tcp_output (or udp_output, ..) will attach an mbuf tag containing this
identifier and the throttling parameters to each mbuf that it pushes out.
Drivers for hardware that can throttle traffic look for this tag; the
rest ignore it.

- cxgbe(4) registers itself as a flow throttling provider with the kernel
  when it attaches to the chip. It tells the kernel how many flows it can
  handle and the range of rates it can handle.
- setsockopt(SO_TX_THROTTLE, rate) makes the kernel allocate a unique
  identifier for the socket. This is *not* related to the RSS flowid at
  all. If a listening socket has SO_TX_THROTTLE, all its children will
  inherit the rate limiting parameters but will each get their own unique
  identifier. The setsockopt fails if there aren't any flow throttling
  providers registered.
- tcp_output (and other proto_output) routines look for SO_TX_THROTTLE
  and attach extra metadata, in the form of a tag, to the outgoing
  frames.
- cxgbe(4) reads this metadata and acts on it.

Regards,
Navdeep
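
The bookkeeping in the list above (a provider announces its capacity and rate range, each throttled socket gets its own identifier, children of a listening socket inherit the rate but not the identifier) can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not the unpublished implementation: all thr_* names are hypothetical, only one provider is supported, and the errno values chosen for the failure cases are guesses.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical provider record: what a driver would announce. */
struct thr_provider {
	unsigned max_flows;          /* how many flows the hardware handles */
	unsigned used;               /* identifiers currently handed out */
	uint64_t min_rate, max_rate; /* supported rate range */
};

/* Hypothetical per-socket state. throttle_id == 0 means "not throttled". */
struct thr_sock {
	uint64_t rate;
	uint32_t throttle_id;
};

static struct thr_provider *thr_registered;  /* single provider (assumption) */
static uint32_t thr_next_id = 1;

static void
thr_register_provider(struct thr_provider *p)
{
	thr_registered = p;
}

/* Models the setsockopt path: validate, allocate an id once, store rate. */
static int
thr_set_tx_throttle(struct thr_sock *so, uint64_t rate)
{
	struct thr_provider *p = thr_registered;

	if (p == NULL)
		return (EOPNOTSUPP);     /* no provider registered (assumed errno) */
	if (rate < p->min_rate || rate > p->max_rate)
		return (EINVAL);         /* outside the announced range */
	if (so->throttle_id == 0) {
		if (p->used >= p->max_flows)
			return (ENOBUFS);    /* hardware flow table is full */
		p->used++;
		so->throttle_id = thr_next_id++;
	}
	so->rate = rate;
	return (0);
}

/* Models accept(): the child inherits the rate but gets its own id. */
static int
thr_accept_child(const struct thr_sock *listener, struct thr_sock *child)
{
	child->rate = 0;
	child->throttle_id = 0;
	if (listener->throttle_id == 0)
		return (0);              /* listener not throttled: nothing to do */
	return (thr_set_tx_throttle(child, listener->rate));
}
```

Keeping the identifier separate from the rate is what allows many sockets to share one set of parameters while still being distinguishable by the driver.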
Re: [RFC] Add support for changing the flow ID of TCP connections
Hi!

The flowid value has way, way too many possible meanings but it's always
been a mostly-static value. I'm worried about overriding it with multiple
meanings that cause features to not work at all together.

So I'd rather leave the flowid/flowtype as it currently is so it doesn't
upset packet reordering and can be used by things like RSS for scaling,
and instead introduce a new connection ID to be used for your purpose.

That way the existing use of flowid for packet ordering and
flowid/flowtype for doing network scaling and netisr selection can work
together with your connection id requirements.

Having stack support for hardware/firmware packet scheduling is cool. It
seems to somewhat overlap with other parts of the TCP offload though and
I'm concerned about bloating out inpcb by 3 pointers for each connection
where lots of connections on the same NIC will point to the same function
set or NULL.

I'd hit up what others in this space are doing. There's pacing support in
the chelsio NIC for example and I'm not sure what Navdeep's plans are for
that in upstream FreeBSD.

Other than that, cool!

-a

On 8 July 2014 10:46, Hans Petter Selasky h...@selasky.org wrote:
> Hi,
>
> I'm working on a new feature which will allow TCP connections to be
> timing controlled by the ethernet hardware driver, actually the mlxen
> driver. The main missing piece in the kernel is to allow the mbuf's
> flowid value to be overwritten in struct inpcb once the connection is
> established and to have a callback once the TCP connection is gone so
> that the assigned flowid can be freed by the ethernet hardware driver.
>
> The flowid will be used to assign the outgoing data traffic of a
> specific TCP connections to a hardware controlled queue, which in
> advance contain certain parameters about the timing for the transmitted
> packets.
>
> To be able to set the flowid I'm using existing functions in the kernel
> TCP code to lookup the inpcb structure based on the 4-tuple, via the
> ifp->if_ioctl() callback of the network adapter. I'm also registering a
> function method table so that I get a callback when the TCP connection
> is gone.
>
> A this point of development I would like to get some feedback from
> FreeBSD network guys about my attached patch proposal.
>
> The motivation for this work is to have a more reliable TCP
> transmissions typically for fixed-rate media content going some
> distance. To illustrate this I will give you an example from the world
> of VoIP, which is using UDP. When doing long-distance VoIP calls through
> various unknown networks and routers it makes a very big difference if
> you are sending data 20ms apart or 40ms apart, even at the exact same
> rate. In the one case you might experience a bunch of packet drops, and
> in the other case, everything is fine. Why? Because the number of
> packets you send per second, and the timing is important. The goal is to
> apply some timing rules for TCP, to increase the factor of successful
> transmission, and to reduce the amount of data loss. For high throughput
> applications we want to do this by means of hardware.
>
> While at it I would like to typedef the flowid used by mbufs, struct
> inpcb and many more places. Where would the right place be to put such
> a definition? In sys/mbuf.h?
>
> Comments are appreciated!
>
> --HPS
Re: [RFC] Allow m_dup() to use JUMBO clusters
Hans Petter Selasky wrote:
> Hi,
>
> Would it be better if my patch used the PAGE_SIZE clusters instead of
> the 16K ones? Then it should not be affected by memory fragmentation.
> Thanks for shedding some light on this area.

Well, I ran into threads stuck on btalloc when I used PAGE_SIZE clusters
mixed with MCLBYTES clusters and, from what I could figure, it was a
kernel address space fragmentation issue. I would guess that PAGE_SIZE
clusters aren't as bad as 16K clusters w.r.t. fragmentation, but I
believe that they could still be an issue. (My testing was on a 256Mbyte
i386, so I can't say if amd64 systems will have a problem, just that
small 32bit arches will.)

rick

> --HPS
>
> Hi,
>
> Updated patch attached.
>
> --HPS
>
> ___
> freebsd-...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: [RFC] Allow m_dup() to use JUMBO clusters
John-Mark Gurney wrote:
> Hans Petter Selasky wrote this message on Mon, Jul 07, 2014 at 10:12 +0200:
> > I'm asking for some input on the attached m_dup() patch, so that
> > existing functionality or dependencies are not broken. The background
> > for the change is to allow m_dup() to defrag long mbuf chains that
> > don't fit into a specific hardware's scatter gather entries, typically
> > when doing TSO.
> >
> > In my case the HW limit is 16 entries of length 4K for doing a 64KByte
> > TSO packet. Currently m_dup() is at best producing 32 entries of 2K
> > each for a 64Kbytes TSO packet.
> >
> > By allowing m_dup() to get JUMBO clusters when allocating mbufs, we
> > avoid creating a new function, specific to the hardware, to defrag
> > some rarely occurring very long mbuf chains into a mbuf chain below 16
> > entries.
>
> Please no... Until we get a better allocator, we should not use jumbo
> (page sized) mbufs, otherwise we will quickly fail to allocate mbufs
> after a machine has been up for a long while, causing other failures...
>
> Unless of course if the code fails to allocate the largest cluster it
> falls through to trying to allocate the next smaller size, that might be
> better...

Unfortunately, for the "can't allocate boundary tags" case, the
allocation request with M_NOWAIT loops instead of failing.

I tried:
	m = m_getjcl(M_NOWAIT, ..., MJUMPAGESIZE);
	if (m == NULL)
		m = m_getjcl(M_WAITOK, ..., MCLBYTES);
when I was experimenting with MJUMPAGESIZE clusters for NFS, and what
happened was that the thread looped in the first m_getjcl() instead of
returning NULL.

It is about 12 layers of function calls deep and most of them fail/return
NULL, but somewhere one of them decides to try again. I didn't locate
where that happens and don't know if it would be safe to change it so
that m_getjcl() returns NULL for this case.

rick

> --
> John-Mark Gurney				Voice: +1 415 225 5579
>
> All that I will do, has been done, All that I have, has not.
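
The "fall through to the next smaller size" idea discussed above can be sketched as a small userspace helper. It is a generic illustration, not kernel code: fb_alloc_largest() and fb_toy_alloc() are hypothetical names, malloc() stands in for the cluster allocator, and the whole approach only works if each attempt really returns NULL on failure, which per the report above is not what m_getjcl() with M_NOWAIT did in the boundary-tag case.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocation attempt that must return NULL on failure and never block. */
typedef void *(*fb_alloc_fn)(size_t size);

/*
 * Hypothetical helper: try each size in order (largest first) and return
 * the first buffer obtained. *got receives the size that succeeded, or 0
 * if every attempt failed.
 */
static void *
fb_alloc_largest(fb_alloc_fn try_alloc, const size_t *sizes, size_t nsizes,
    size_t *got)
{
	size_t i;

	for (i = 0; i < nsizes; i++) {
		void *p = try_alloc(sizes[i]);

		if (p != NULL) {
			*got = sizes[i];
			return (p);
		}
	}
	*got = 0;
	return (NULL);
}

/* Toy allocator that pretends anything above 2048 bytes is unavailable. */
static void *
fb_toy_alloc(size_t size)
{
	return ((size > 2048) ? NULL : malloc(size));
}
```

A caller would pass a list such as { 4096, 2048 } and size its copy loop from the value returned in *got rather than from the size it first asked for.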