Re: [RFC] Allow m_dup() to use JUMBO clusters

2014-07-08 Thread Hans Petter Selasky


Hi,

Would it be better if my patch used the PAGE_SIZE clusters instead of
the 16K ones? Then it should not be affected by memory defragmentation.
Thanks for shedding some light on this area.

--HPS



Hi,

Updated patch attached.

--HPS
=== sys/kern/uipc_mbuf.c
==================================================================
--- sys/kern/uipc_mbuf.c	(revision 268358)
+++ sys/kern/uipc_mbuf.c	(local)
@@ -917,7 +917,15 @@
 		struct mbuf *n;
 
 		/* Get the next new mbuf */
-		if (remain >= MINCLSIZE) {
+		if (remain >= MJUMPAGESIZE) {
+			/*
+			 * By allocating a bigger mbuf, we get fewer
+			 * scatter gather entries for the hardware to
+			 * process:
+			 */
+			n = m_getjcl(how, m->m_type, 0, MJUMPAGESIZE);
+			nsize = MJUMPAGESIZE;
+		} else if (remain >= MINCLSIZE) {
 			n = m_getcl(how, m->m_type, 0);
 			nsize = MCLBYTES;
 		} else {
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
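
The effect of the extra branch on the number of scatter/gather entries can be checked with a small userland model of the size-selection logic. This is only a sketch: the `SK_*` constants are illustrative stand-ins for the kernel's MLEN, MINCLSIZE, MCLBYTES and MJUMPAGESIZE (assuming 4K pages), `sk_count_buffers()` is a hypothetical helper, and the real m_dup() also treats the first (packet header) mbuf slightly differently.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; the real ones are platform dependent. */
#define SK_MLEN         168U    /* assumed payload of a plain mbuf */
#define SK_MINCLSIZE    169U    /* assumed smallest length worth a cluster */
#define SK_MCLBYTES     2048U   /* standard cluster */
#define SK_MJUMPAGESIZE 4096U   /* page-sized jumbo cluster, 4K pages assumed */

/*
 * Count how many buffers a copy loop would allocate for `len` bytes,
 * picking the buffer size on each iteration the way the patched
 * branch does. With use_jumbo == 0 the page-sized branch is skipped,
 * which models the unpatched code.
 */
static unsigned
sk_count_buffers(size_t len, int use_jumbo)
{
	size_t remain = len;
	unsigned count = 0;

	while (remain > 0) {
		size_t nsize;

		if (use_jumbo && remain >= SK_MJUMPAGESIZE)
			nsize = SK_MJUMPAGESIZE;
		else if (remain >= SK_MINCLSIZE)
			nsize = SK_MCLBYTES;
		else
			nsize = SK_MLEN;
		assert(nsize > 0);
		/* Consume at most nsize bytes of the remaining payload. */
		remain -= (remain < nsize) ? remain : nsize;
		count++;
	}
	return (count);
}
```

Under these assumptions a 64 KByte payload needs 32 buffers with 2K clusters only and 16 buffers once page-sized clusters are allowed.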

ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15

2014-07-08 Thread Boris Samorodov
Hi All,

Just FYI, since nothing relevant was found on Google.

I was upgrading my CURRENT system to rev r268233 from a one-or-two
weeks old system. The system was created years ago and had rather
old zfsboot code. So, after upgrading and rebooting I got the error
right after BIOS POST...:
-
ZFS: unsupported compression algorithm 15
-

... and instant reboot.

OK, I've booted from a USB stick, run the command (the system in
question is at /dev/ada1 and the system at USB stick is rather new
CURRENT also):
-
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
-

Now my system is fine again:
-
% uname -a

FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233:
Fri Jul  4 06:41:28 SAMT 2014
b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X  amd64
-

-- 
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve


Re: ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15

2014-07-08 Thread Allan Jude
On 07/08/2014 10:47, Boris Samorodov wrote:
> Hi All,
>
> Just FYI, since nothing relevant was found on Google.
>
> I was upgrading my CURRENT system to rev r268233 from a one-or-two
> weeks old system. The system was created years ago and had rather
> old zfsboot code. So, after upgrading and rebooting I got the error
> right after BIOS POST...:
> -
> ZFS: unsupported compression algorithm 15
> -
>
> ... and instant reboot.
>
> OK, I've booted from a USB stick, run the command (the system in
> question is at /dev/ada1 and the system at USB stick is rather new
> CURRENT also):
> -
> # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
> -
>
> Now my system is fine again:
> -
> % uname -a
>
> FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233:
> Fri Jul  4 06:41:28 SAMT 2014
> b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X  amd64
> -
>

Did you do a 'zpool upgrade' when you updated your system? The new
features shouldn't be enabled without you having done that.

When you DO 'zpool upgrade' it specifically warns you to update the boot
code for this reason.

-- 
Allan Jude


Re: vidcontrol(1) complains about Bad magic, in base/head, amd64, sc console, r268165

2014-07-08 Thread Trond Endrestøl
On Sun, 6 Jul 2014 16:33+0300, Aleksandr Rybalko wrote:

> Hi,
>
> so if I get it right, you get the expected results, right?
>
> If so, please check the key combinations which are different, to get
> correct results.
> And if all is OK, send me the new maps please.
> If it is not correct, let us know what is wrong.
>
> Thanks a lot!

Sorry for the delay.

I followed my heart and looked carefully at the Norwegian layout of my
keyboard. The layout is fairly up to date, although it has the wavy
Windows symbol and not the trapezoid-shaped Windows symbol you'll find
nowadays. ;-)

I decided it would be nice to have the euro symbol somewhere other than
its current location, as the letter é has historically been reached by
holding AltGr and hitting the lowercase e key.

Instead, I figured it made some sense to have the euro symbol 
available on AltGr+Shift+4. That's where you'll find the 8364 value, 
line 11.

The remainder is virtually unchanged, except for a correction of 
whitespace on line 109. A tab was changed into three spaces.

Feel free to use a better, suitable name for the attached keymap.

I know there are some Norwegian speaking individuals on this list, and 
I welcome their input.

-- 
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,       | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+

# $FreeBSD: head/share/syscons/keymaps/norwegian.iso.kbd 117271 2003-07-06 03:09:40Z ache $
#                                                         alt
# scan                       cntrl          alt    alt   cntrl lock
# code  base   shift  cntrl  shift  alt    shift  cntrl  shift state
# ------------------------------------------------------------------
  000   nop    nop    nop    nop    nop    nop    nop    nop     O
  001   esc    esc    esc    esc    esc    esc    debug  esc     O
  002   '1'    '!'    nop    nop    '1'    '!'    nop    nop     O
  003   '2'    '"'    nul    nul    '@'    '@'    nul    nul     O
  004   '3'    '#'    nop    nop    163    '#'    nop    nop     O
  005   '4'    164    nop    nop    '$'    8364   nop    nop     O
  006   '5'    '%'    nop    nop    '5'    '%'    nop    nop     O
  007   '6'    '&'    nop    nop    '6'    '&'    nop    nop     O
  008   '7'    '/'    nop    nop    '{'    '/'    nop    nop     O
  009   '8'    '('    esc    esc    '['    '('    esc    esc     O
  010   '9'    ')'    gs     gs     ']'    ')'    gs     gs      O
  011   '0'    '='    nop    nop    '}'    '='    nop    nop     O
  012   '+'    '?'    nop    nop    '+'    '?'    nop    nop     O
  013   '\'    '`'    fs     fs     '''    nop    nop    nop     O
  014   bs     bs     del    del    bs     bs     del    del     O
  015   ht     btab   nop    nop    ht     btab   nop    nop     O
  016   'q'    'Q'    dc1    dc1    'q'    'Q'    dc1    dc1     C
  017   'w'    'W'    etb    etb    'w'    'W'    etb    etb     C
  018   'e'    'E'    enq    enq    233    201    enq    enq     C
  019   'r'    'R'    dc2    dc2    174    174    dc2    dc2     C
  020   't'    'T'    dc4    dc4    254    222    dc4    dc4     C
  021   'y'    'Y'    em     em     255    165    em     em      C
  022   'u'    'U'    nak    nak    252    220    nak    nak     C
  023   'i'    'I'    ht     ht     239    207    ht     ht      C
  024   'o'    'O'    si     si     242    210    si     si      C
  025   'p'    'P'    dle    dle    182    182    dle    dle     C
  026   229    197    nop    nop    '}'    ']'    nop    nop     C
  027   168    '^'    rs     rs     '~'    '^'    rs     rs      O
  028   cr     cr     nl     nl     cr     cr     nl     nl      O
  029   lctrl  lctrl  lctrl  lctrl  lctrl  lctrl  lctrl  lctrl   O
  030   'a'    'A'    soh    soh    225    193    soh    soh     C
  031   's'    'S'    dc3    dc3    223    223    dc3    dc3     C
  032   'd'    'D'    eot    eot    240    208    eot    eot     C
  033   'f'    'F'    ack    ack    170    170    ack    ack     C
  034   'g'    'G'    bel    bel    'g'    'G'    bel    bel     C
  035   'h'    'H'    bs     bs     'h'    'H'    bs     bs      C
  036   'j'    'J'    nl     nl     'j'    'J'    nl     nl      C
  037   'k'    'K'    vt     vt     'k'    'K'    vt     vt      C
  038   'l'    'L'    ff     ff     'l'    'L'    ff     ff      C
  039   248    216    nop    nop    '|'    '\'    nop    nop     C
  040   230    198    nop    nop    '{'    '['    nop    nop     C
  041   '|'    167    nop    nop    166    182    nop    nop     O
  042   lshift lshift lshift lshift lshift lshift lshift lshift  O
  

[RFC] Add support for changing the flow ID of TCP connections

2014-07-08 Thread Hans Petter Selasky

Hi,

I'm working on a new feature which will allow TCP connections to be 
timing controlled by the ethernet hardware driver, actually the mlxen 
driver. The main missing piece in the kernel is to allow the mbuf's 
flowid value to be overwritten in struct inpcb once the connection is 
established and to have a callback once the TCP connection is gone so 
that the assigned flowid can be freed by the ethernet hardware driver.


The flowid will be used to assign the outgoing data traffic of a
specific TCP connection to a hardware controlled queue, which in
advance contains certain parameters about the timing for the transmitted
packets.


To be able to set the flowid I'm using existing functions in the kernel
TCP code to look up the inpcb structure based on the 4-tuple, via the
ifp->if_ioctl() callback of the network adapter. I'm also registering
a function method table so that I get a callback when the TCP connection 
is gone.
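
The callback arrangement described above can be modelled in userland roughly as follows. This is a sketch, not the patch itself: the member names echo the `inf_alloc`, `inf_rateset` and `inf_free` callbacks of the patch, but the signatures, the `sk_conn` stand-in for struct inpcb and the toy driver are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical method table a driver would register; signatures assumed. */
struct sk_flowid_methods {
	int	(*alloc)(void *arg, uint32_t *flowid);
	int	(*rateset)(void *arg, uint32_t flowid, uint64_t rate);
	void	(*release)(void *arg, uint32_t flowid);
};

/* Stand-in for the per-connection state (struct inpcb in the kernel). */
struct sk_conn {
	uint32_t flowid;
	const struct sk_flowid_methods *mtod;
	void *arg;
};

/* Allocate a hardware flow ID on first use, then program the rate. */
static int
sk_conn_set_rate(struct sk_conn *c, const struct sk_flowid_methods *mtod,
    void *arg, uint64_t rate)
{
	int error;

	if (c == NULL || mtod == NULL || mtod->alloc == NULL ||
	    mtod->rateset == NULL || mtod->release == NULL)
		return (EINVAL);
	if (c->mtod == NULL) {
		error = mtod->alloc(arg, &c->flowid);
		if (error != 0)
			return (error);
		c->mtod = mtod;
		c->arg = arg;
	}
	return (c->mtod->rateset(c->arg, c->flowid, rate));
}

/* Called when the connection goes away: hand the flow ID back. */
static void
sk_conn_teardown(struct sk_conn *c)
{
	if (c != NULL && c->mtod != NULL) {
		c->mtod->release(c->arg, c->flowid);
		c->mtod = NULL;
		c->arg = NULL;
	}
}

/* Toy "driver" that only keeps counters, for demonstration. */
struct sk_toy_driver {
	uint32_t next_id;
	unsigned in_use;
	uint64_t last_rate;
};

static int
sk_toy_alloc(void *arg, uint32_t *flowid)
{
	struct sk_toy_driver *d = arg;

	*flowid = ++d->next_id;
	d->in_use++;
	return (0);
}

static int
sk_toy_rateset(void *arg, uint32_t flowid, uint64_t rate)
{
	struct sk_toy_driver *d = arg;

	(void)flowid;
	d->last_rate = rate;
	return (0);
}

static void
sk_toy_release(void *arg, uint32_t flowid)
{
	struct sk_toy_driver *d = arg;

	(void)flowid;
	assert(d->in_use > 0);
	d->in_use--;
}

static const struct sk_flowid_methods sk_toy_methods = {
	sk_toy_alloc, sk_toy_rateset, sk_toy_release
};
```

A second call to `sk_conn_set_rate()` on the same connection reuses the already allocated flow ID and only changes the rate, and `sk_conn_teardown()` is the point where the driver gets its ID back.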


At this point of development I would like to get some feedback from
FreeBSD network guys about my attached patch proposal.


The motivation for this work is to have more reliable TCP
transmissions, typically for fixed-rate media content going some
distance. To illustrate this I will give you an example from the world 
of VoIP, which is using UDP. When doing long-distance VoIP calls through 
various unknown networks and routers it makes a very big difference if 
you are sending data 20ms apart or 40ms apart, even at the exact same 
rate. In the one case you might experience a bunch of packet drops, and 
in the other case, everything is fine. Why? Because the number of
packets you send per second and their timing are important. The goal is to
apply some timing rules for TCP, to increase the factor of successful 
transmission, and to reduce the amount of data loss. For high throughput 
applications we want to do this by means of hardware.
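
To put numbers on the 20ms versus 40ms example: at a fixed bit rate, the packet interval determines both the packet rate and the payload carried per packet. The helper below is a hypothetical illustration (the 64 kbit/s figure used in the note afterwards is an assumed example stream, not a value from this thread), and it uses integer arithmetic, so it is exact only for intervals that divide 1000 ms evenly.

```c
#include <assert.h>
#include <stdint.h>

struct sk_pacing {
	uint32_t packets_per_sec;	/* packets sent each second */
	uint32_t payload_bytes;		/* payload carried by each packet */
};

/*
 * For a constant bit rate stream sent as one packet every interval_ms
 * milliseconds, compute the packet rate and the per-packet payload.
 */
static struct sk_pacing
sk_pacing_for(uint32_t bits_per_sec, uint32_t interval_ms)
{
	struct sk_pacing p;

	assert(interval_ms > 0);
	p.packets_per_sec = 1000 / interval_ms;
	/* bits/s * ms / 1000 gives bits per packet; / 8 gives bytes. */
	p.payload_bytes =
	    (uint32_t)(((uint64_t)bits_per_sec * interval_ms) / 8000);
	return (p);
}
```

For an assumed 64 kbit/s stream this gives 50 packets of 160 bytes per second at 20ms, and 25 packets of 320 bytes per second at 40ms: the same rate, but half as many packets for the network to handle.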



While at it I would like to typedef the flowid used by mbufs, struct 
inpcb and many more places.  Where would the right place be to put such 
a definition? In sys/mbuf.h?



Comments are appreciated!

--HPS
=== sys/netinet/in_pcb.c
==================================================================
--- sys/netinet/in_pcb.c	(revision 268358)
+++ sys/netinet/in_pcb.c	(local)
@@ -1173,6 +1173,100 @@
 }
 
 /*
+ * in_pcb_handle_ratectlreq - this function sets the hardware flow ID
+ * for a given IPv4 connection based on the input arguments.
+ *
+ * Return values:
+ * 0: Success
+ * Non-zero: Failure
+ */
+int
+in_pcb_handle_ratectlreq(struct ifnet *ifp, struct in_ratectlreq *req,
+    const struct in_flowid_methods *mtod, void *arg)
+{
+	struct inpcb *inp;
+	int error;
+
+	if (ifp == NULL || req == NULL || mtod == NULL ||
+	    mtod->inf_alloc == NULL || mtod->inf_rateset == NULL ||
+	    mtod->inf_free == NULL)
+		return (EINVAL);
+
+	inp = in_pcblookup(&V_tcbinfo,
+	    req->ifreq_dst.sin_addr, req->ifreq_dst.sin_port,
+	    req->ifreq_src.sin_addr, req->ifreq_src.sin_port,
+	    INPLOOKUP_WLOCKPCB, ifp);
+	if (inp == NULL)
+		return (ENOENT);
+
+	INP_WLOCK_ASSERT(inp);
+
+	if (inp->inp_flowid_mtod == NULL) {
+		error = mtod->inf_alloc(arg, &inp->inp_flowid);
+		if (error != 0)
+			goto done;
+		inp->inp_flowid_mtod = mtod;
+		inp->inp_flowid_arg = arg;
+		/* ensure that the flow ID is not overwritten */
+		inp->inp_flags |= INP_HW_FLOWID;
+		inp->inp_flags &= ~INP_SW_FLOWID;
+		inp->inp_flowtype = M_HASHTYPE_NONE;
+	}
+	error = inp->inp_flowid_mtod->inf_rateset(inp->inp_flowid_arg,
+	    inp->inp_flowid, req->ifreq_baudrate);
+done:
+	INP_WUNLOCK(inp);
+	return (error);
+}
+
+/*
+ * in6_pcb_handle_ratectlreq - this function sets the hardware flow ID
+ * for a given IPv6 connection based on the input arguments.
+ *
+ * Return values:
+ * 0: Success
+ * Non-zero: Failure
+ */
+int
+in6_pcb_handle_ratectlreq(struct ifnet *ifp, struct in6_ratectlreq *req,
+    const struct in_flowid_methods *mtod, void *arg)
+{
+	struct inpcb *inp;
+	int error;
+
+	if (ifp == NULL || req == NULL || mtod == NULL ||
+	    mtod->inf_alloc == NULL || mtod->inf_rateset == NULL ||
+	    mtod->inf_free == NULL)
+		return (EINVAL);
+
+	inp = in6_pcblookup(&V_tcbinfo,
+	    &req->ifreq_dst.sin6_addr, req->ifreq_dst.sin6_port,
+	    &req->ifreq_src.sin6_addr, req->ifreq_src.sin6_port,
+	    INPLOOKUP_WLOCKPCB, ifp);
+	if (inp == NULL)
+		return (ENOENT);
+
+	INP_WLOCK_ASSERT(inp);
+
+	if (inp->inp_flowid_mtod == NULL) {
+		error = mtod->inf_alloc(arg, &inp->inp_flowid);
+		if (error != 0)
+			goto done;
+		inp->inp_flowid_mtod = mtod;
+		inp->inp_flowid_arg = arg;
+		/* ensure that the flow ID is not overwritten */
+		inp->inp_flags |= INP_HW_FLOWID;
+		inp->inp_flags &= ~INP_SW_FLOWID;
+		inp->inp_flowtype = M_HASHTYPE_NONE;
+	}
+	error = inp->inp_flowid_mtod->inf_rateset(inp->inp_flowid_arg,
+	    inp->inp_flowid, req->ifreq_baudrate);
+done:
+	INP_WUNLOCK(inp);
+	return (error);
+}
+
+/*
  * Unconditionally schedule an inpcb to be freed by decrementing its
  * reference count, which should occur only after the inpcb 

Re: ATTN: [zfs boot]: ZFS: unsupported compression algorithm 15

2014-07-08 Thread Boris Samorodov
08.07.2014 19:25, Allan Jude wrote:
> On 07/08/2014 10:47, Boris Samorodov wrote:
>> Hi All,
>>
>> Just FYI, since nothing relevant was found on Google.
>>
>> I was upgrading my CURRENT system to rev r268233 from a one-or-two
>> weeks old system. The system was created years ago and had rather
>> old zfsboot code. So, after upgrading and rebooting I got the error
>> right after BIOS POST...:
>> -
>> ZFS: unsupported compression algorithm 15
>> -
>>
>> ... and instant reboot.
>>
>> OK, I've booted from a USB stick, run the command (the system in
>> question is at /dev/ada1 and the system at USB stick is rather new
>> CURRENT also):
>> -
>> # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
>> -
>>
>> Now my system is fine again:
>> -
>> % uname -a
>>
>> FreeBSD bsam.int.wart.ru 11.0-CURRENT FreeBSD 11.0-CURRENT #73 r268233:
>> Fri Jul  4 06:41:28 SAMT 2014
>> b...@bsam.int.wart.ru:/usr/obj/usr/src/sys/BB64X  amd64
>> -
>>
>
> Did you do a 'zpool upgrade' when you updated your system? The new
> features shouldn't be enabled without you having done that.

Nope. My commands (from remote console):
-
# make -C /usr/src installkernel
# make -C /usr/src installworld
# mergemaster
# make -C /usr/src delete-old
# make -C /usr/src delete-old-libs
# shutdown -r now && exit
BOOM!
-

> When you DO 'zpool upgrade' it specifically warns you to update the boot
> code for this reason.

-- 
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve

Re: [RFC] Add support for changing the flow ID of TCP connections

2014-07-08 Thread Navdeep Parhar
On 07/08/14 10:46, Hans Petter Selasky wrote:
> Hi,
>
> I'm working on a new feature which will allow TCP connections to be
> timing controlled by the ethernet hardware driver, actually the mlxen
> driver. The main missing piece in the kernel is to allow the mbuf's
> flowid value to be overwritten in struct inpcb once the connection is
> established and to have a callback once the TCP connection is gone so
> that the assigned flowid can be freed by the ethernet hardware driver.
>
> The flowid will be used to assign the outgoing data traffic of a
> specific TCP connection to a hardware controlled queue, which in
> advance contains certain parameters about the timing for the transmitted
> packets.
>
> To be able to set the flowid I'm using existing functions in the kernel
> TCP code to look up the inpcb structure based on the 4-tuple, via the
> ifp->if_ioctl() callback of the network adapter. I'm also registering
> a function method table so that I get a callback when the TCP connection
> is gone.
>
> At this point of development I would like to get some feedback from
> FreeBSD network guys about my attached patch proposal.
>
> The motivation for this work is to have more reliable TCP
> transmissions, typically for fixed-rate media content going some
> distance. To illustrate this I will give you an example from the world
> of VoIP, which is using UDP. When doing long-distance VoIP calls through
> various unknown networks and routers it makes a very big difference if
> you are sending data 20ms apart or 40ms apart, even at the exact same
> rate. In the one case you might experience a bunch of packet drops, and
> in the other case, everything is fine. Why? Because the number of
> packets you send per second and their timing are important. The goal is to
> apply some timing rules for TCP, to increase the factor of successful
> transmission, and to reduce the amount of data loss. For high throughput
> applications we want to do this by means of hardware.
>
>
> While at it I would like to typedef the flowid used by mbufs, struct
> inpcb and many more places.  Where would the right place be to put such
> a definition? In sys/mbuf.h?
>
>
> Comments are appreciated!

I think we need to design this to be as generic as possible.  I have
quite a bit of code that does this stuff but I haven't pushed it
upstream or even offered it for review (yet).

cxgbe(4) hardware does throttling and traffic pacing too, but it's not
limited to TCP, and it can do it per queue or per flow -- you can
limit a tx queue or an individual flow to a packet-per-second limit or
a bandwidth ceiling; this works for plain NIC traffic (TCP, UDP, whatever)
as well as stateful TCP offload.  For TCP (NIC or TOE) the chip can
even rewrite the TCP timestamp to account for the extra time that the
chip/driver held the packet because it was asked to slow down a flow.

The per queue stuff is handled via a driver-specific tool (cxgbetool).

For per-flow throttling my implementation adds a new sockopt
(SO_TX_THROTTLE) that lets an application specify a throttle rate for a
socket.  The kernel allocates a flow identifier for each such socket
and tcp_output (or udp_output, ..) will attach an mbuf tag containing
this identifier and throttling parameters to each mbuf that it pushes
out.  Drivers for hardware that can throttle traffic look for this tag,
the rest ignore it.

- cxgbe(4) registers itself as a flow throttling provider with the
  kernel when it attaches to the chip.  It tells the kernel how many
  flows it can handle and the range of rates it can handle.
- setsockopt(SO_TX_THROTTLE, rate) makes the kernel allocate a unique
  identifier for the socket.  This is *not* related to the RSS flowid at
  all.  If a listening socket has SO_TX_THROTTLE, all its children will
  inherit the rate limiting parameters but will each get its own unique
  identifier.  The setsockopt fails if there aren't any flow throttling
  providers registered.
- tcp_output (and other proto_output) routines look for SO_TX_THROTTLE
  and attach extra metadata, in the form of a tag, to the outgoing
  frames.
- cxgbe(4) reads this metadata and acts on it.
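
The registration and allocation steps in the list above can be sketched as a small userland model. Everything here is hypothetical: `sk_registry`, `sk_provider_register()` and `sk_flow_alloc()` are illustration names, the error codes are assumptions, and the model leaves out inheritance by child sockets and the mbuf tagging done in tcp_output.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_MAX_PROVIDERS 4

/* What a driver announces when it attaches: capacity and rate range. */
struct sk_provider {
	unsigned max_flows;
	unsigned used;
	uint64_t min_rate;
	uint64_t max_rate;
};

struct sk_registry {
	struct sk_provider prov[SK_MAX_PROVIDERS];
	unsigned nprov;
	uint32_t next_id;	/* source of unique flow identifiers */
};

static void
sk_registry_init(struct sk_registry *r)
{
	memset(r, 0, sizeof(*r));
}

/* Models a driver registering itself as a flow throttling provider. */
static int
sk_provider_register(struct sk_registry *r, unsigned max_flows,
    uint64_t min_rate, uint64_t max_rate)
{
	struct sk_provider *p;

	if (r->nprov == SK_MAX_PROVIDERS)
		return (ENOSPC);
	p = &r->prov[r->nprov++];
	p->max_flows = max_flows;
	p->used = 0;
	p->min_rate = min_rate;
	p->max_rate = max_rate;
	return (0);
}

/*
 * Models the setsockopt step: fail when no provider is registered,
 * otherwise hand out a unique identifier from the first provider
 * that supports the rate and still has a free flow slot.
 */
static int
sk_flow_alloc(struct sk_registry *r, uint64_t rate, uint32_t *id)
{
	unsigned i;

	assert(r != NULL && id != NULL);
	if (r->nprov == 0)
		return (ENODEV);
	for (i = 0; i < r->nprov; i++) {
		struct sk_provider *p = &r->prov[i];

		if (rate >= p->min_rate && rate <= p->max_rate &&
		    p->used < p->max_flows) {
			p->used++;
			*id = ++r->next_id;
			return (0);
		}
	}
	return (ENOSPC);
}
```

In this model the identifier comes from a plain counter, which keeps it independent of any RSS flowid, in line with the description above.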

Regards,
Navdeep



Re: [RFC] Add support for changing the flow ID of TCP connections

2014-07-08 Thread Adrian Chadd
Hi!

The flowid value has way, way too many possible meanings but it's
always been a mostly-static value. I'm worried about overriding it
with multiple meanings that cause features to not work at all
together.

So I'd rather leave the flowid/flowtype as it currently is so it
doesn't upset packet reordering and can be used by things like RSS for
scaling, and instead introduce a new connection ID to be used for your
purpose. That way the existing use of flowid for packet ordering and
flowid/flowtype for doing network scaling and netisr selection can
work together with your connection id requirements.

Having stack support for hardware/firmware packet scheduling is cool.
It seems to somewhat overlap with other parts of the TCP offload
though and I'm concerned about bloating out inpcb by 3 pointers for
each connection where lots of connections on the same NIC will point
to the same function set or NULL.

I'd hit up what others in this space are doing. There's pacing support
in the chelsio NIC for example and I'm not sure what Navdeep's plans
are for that in upstream FreeBSD.

Other than that, cool!



-a




Re: [RFC] Allow m_dup() to use JUMBO clusters

2014-07-08 Thread Rick Macklem
Hans Petter Selasky wrote:
>
> > Hi,
> >
> > Would it be better if my patch used the PAGE_SIZE clusters instead
> > of the 16K ones? Then it should not be affected by memory
> > defragmentation.
> > Thanks for shedding some light on this area.
>
Well, I ran into the threads stuck on btalloc when I used PAGE_SIZE
clusters mixed with MCLBYTES clusters and from what I could figure, it
was a kernel address space fragmentation issue.

I would guess that PAGE_SIZE clusters aren't as bad as 16K clusters w.r.t.
fragmentation, but I believe that they could still be an issue. (My testing
was on a 256Mbyte i386, so I can't say if amd64 systems will have a problem,
just that small 32bit arches will.)

rick



Re: [RFC] Allow m_dup() to use JUMBO clusters

2014-07-08 Thread Rick Macklem
John-Mark Gurney wrote:
> Hans Petter Selasky wrote this message on Mon, Jul 07, 2014 at 10:12 +0200:
> > I'm asking for some input on the attached m_dup() patch, so that
> > existing functionality or dependencies are not broken. The background
> > for the change is to allow m_dup() to defrag long mbuf chains that
> > don't fit into a specific hardware's scatter gather entries, typically
> > when doing TSO.
> >
> > In my case the HW limit is 16 entries of length 4K for doing a 64KByte
> > TSO packet. Currently m_dup() is at best producing 32 entries of 2K
> > each for a 64Kbytes TSO packet.
> >
> > By allowing m_dup() to get JUMBO clusters when allocating mbufs, we
> > avoid creating a new function, specific to the hardware, to defrag some
> > rarely occurring very long mbuf chains into a mbuf chain below 16
> > entries.
>
> Please no... Until we get a better allocator, we should not use jumbo
> (page sized) mbufs, otherwise we will quickly fail to allocate mbufs
> after a machine has been up for a long while, causing other
> failures...
>
> Unless of course, if the code fails to allocate the largest cluster, it
> falls through to trying to allocate the next smaller size; that might
> be better...
>
 
Unfortunately, for the "can't allocate boundary tags" case, the allocation
request with M_NOWAIT loops instead of failing.

I tried:
  m = m_getjcl(M_NOWAIT, ..., MJUMPAGESIZE);
  if (m == NULL)
   m = m_getjcl(M_WAITOK, ..., MCLBYTES);
when I was experimenting with MJUMPAGESIZE clusters for NFS and what happened
was the thread looped in the first m_getjcl() instead of returning NULL.
It is about 12 layers of function calls deep and most fail/return NULL, but
somewhere one of them decides to try again. I didn't find where that
happens and don't know if it would be safe to change it so that m_getjcl()
returns NULL for this case.
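
For reference, the fallback pattern being attempted above looks like this when the non-blocking allocation does return NULL on failure. This is a userland sketch only: `sk_alloc_fn` and the toy allocators are hypothetical stand-ins for m_getjcl() with M_NOWAIT and M_WAITOK, and the behaviour reported above (the first call looping instead of returning NULL) is exactly the case this model does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_MCLBYTES     2048U	/* standard cluster size */
#define SK_MJUMPAGESIZE 4096U	/* page-sized cluster, 4K pages assumed */

/* Hypothetical allocator hook: returns NULL when it cannot allocate. */
typedef void *(*sk_alloc_fn)(size_t size);

/*
 * Try the large buffer with the non-blocking allocator first and fall
 * back to the smaller size with the blocking one. The size actually
 * obtained is stored in *nsize (0 if both attempts failed).
 */
static void *
sk_alloc_with_fallback(sk_alloc_fn alloc_nowait, sk_alloc_fn alloc_wait,
    size_t *nsize)
{
	void *buf;

	assert(nsize != NULL);
	buf = alloc_nowait(SK_MJUMPAGESIZE);
	if (buf != NULL) {
		*nsize = SK_MJUMPAGESIZE;
		return (buf);
	}
	buf = alloc_wait(SK_MCLBYTES);
	*nsize = (buf != NULL) ? SK_MCLBYTES : 0;
	return (buf);
}

/* Toy allocator that always fails, to exercise the fallback path. */
static void *
sk_alloc_always_fails(size_t size)
{
	(void)size;
	return (NULL);
}

/* Toy allocator backed by malloc(3). */
static void *
sk_alloc_malloc(size_t size)
{
	return (malloc(size));
}
```

With `sk_alloc_always_fails` as the first allocator the caller ends up with a 2048 byte buffer; with `sk_alloc_malloc` in both positions it gets the 4096 byte one.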

rick

