On 17.03.2014 02:33, Antti Kantee wrote:
> On 16/03/14 22:07, Martin Unzner wrote:
>> It is great that if_virt is getting a speedup; it was not excessively
>> fast the last time I tried it.
> There are two sides to if_virt. Arguably the interface is rather poorly
> named, but then again I didn't see these kinds of uses in 2008 when I
> wrote it as a way to shovel packets via /dev/tap. Using tap is
> understandably slow.
>
> These days if_virt serves more as a way to attach a hypercall
> implementation as an interface. That's what has been my target for
> recent improvements, and therefore transitively improving the
> performance of e.g. dpdk-rumptcpip and netmap-rumptcpip (which both
> attach via if_virt).
OK, great -- then netperf will probably work a bit better with the new BSD
release, too.
>> Also, thank you for forcing me to look up jumbo frames and TSO again, I
>> had them completely confused. Neither was enabled, though, which is why
>> I suspect a netmap bug has caused the trouble.
> Not sure how missing TSO would cause bugs -- the stack should just do
> segmentation to (path)mtu-sized chunks itself in that case -- but of
> course trying to process jumbo frames without telling the stack that the
> interface layer is capable of jumbo frames will lead to discrepancies.
I will try without those extensions; that should also work. But that is
a netmap problem.
>> I have another question: is it OK to use the rump_sys_ methods, or would
>> it be faster to fork and schedule manually? You write that the
>> rumpns_sendto method handles curlwp itself, so it should actually not
>> matter that I simply replaced the normal system call sendto with
>> rump_sys_sendto, should it? In your dissertation, you mention that
>> scheduling manually is only necessary if you are missing the wrapper, or
>> have I missed something there?
> I'm not sure I understand the question, but not deterred by that I'll
> answer anyway ;)
You got me right, although it was rather late when I wrote that.
>
> There's two things you need for a thread to run correctly in a rump
> kernel: curlwp and curcpu (it's pretty natural: you need to know what
> you're running and where you're running it). If the host thread you use
> to call the schedule operation of a rump kernel has curlwp set,
> scheduling is a matter of picking curcpu. If there is no curlwp set
> when rump_schedule() is called, the scheduling routine allocates a
> temporary one for the duration of the call. The purpose of this
> curlwp-creating dynamicity is to make it as simple as possible to call a
> rump kernel from any host thread context. The former path is optimized
> to be fast, the latter is not.
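If I understand the two paths correctly, they could be sketched roughly
like this (a hedged sketch on my part, not from the thesis; names are from
the public rump API, error checks omitted, and it assumes rump_init() has
already run):

```c
/* Sketch of the two scheduling paths described above.
 * Assumes the rump kernel has been bootstrapped with rump_init(). */
#include <rump/rump.h>
#include <rump/rump_syscalls.h>

void
scheduling_sketch(void)
{
	/* Slow path: no curlwp is bound to this host thread, so the
	 * syscall wrapper implicitly creates a temporary lwp for the
	 * duration of the call. */
	rump_sys_getpid();

	/* Fast path: bind this host thread to a rump lwp first
	 * (RUMP_RFCFDG comes from the rump headers) ... */
	rump_pub_lwproc_rfork(RUMP_RFCFDG);

	/* ... now each call only needs to pick curcpu. */
	rump_sys_getpid();

	/* Detach the host thread from the rump lwp again. */
	rump_pub_lwproc_releaselwp();
}
```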
>
> The fast path is two atomic memory operations per schedule+unschedule
> pair, and locks and releases curcpu. So, theoretically, assuming you
> have a dedicated core and would want to call sendto() a billion times,
> it would be faster to bypass rump_sys_sendto() interface by calling
> schedule manually, looping a billion times calling rumpns_sendto() and
> unscheduling. However, note that the rump kernel will still unschedule
> internally whenever it needs to block, and messing with that behaviour
> is seriously asking for deadlocks, so I'm not sure I'd start to optimize
> anything from that angle.
>
> So, yes, rump_sys_sendto() is designed to be a
> drop-in-replacement-with-no-strings-attached for sendto() -- at least
> assuming you have the corresponding socket opened ;). Calling
> rump_sys_sendto() will be significantly faster if you have curlwp set
> (what I call a "bound" thread), but it will work correctly either way.
OK, so it would probably not be bad to do something like this in netperf
(from your thesis):
rump_pub_lwproc_rfork(RUMP_RFCFDG);
rump_schedule();
for (...) {
        rump_sys_sendto(...);
}
rump_unschedule();
If I got it right, rump_pub_lwproc_rfork does not perform a fork
operation in the host program, but only in the rump kernel? Do I need to
make sure that there are no non-rump operations before
rump_unschedule(), and if so, why? For instance, in your thesis you have:
if (mylwp->l_dupfd < 0) {
        rump_unschedule();
        errx(1, "open failed");
}
If you are exiting anyway, would rump_sys_reboot() work instead of
rump_unschedule()?
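To make my reading of the rfork semantics concrete, here is how I picture
it (again a hedged sketch of mine; the names are from the public rump
lwproc API, and the helper name is my own):

```c
/* Hedged sketch: rump_pub_lwproc_rfork() creates a process context only
 * inside the rump kernel -- no host-side fork() happens.  Assumes
 * rump_init() has already been called. */
#include <rump/rump.h>
#include <rump/rump_syscalls.h>

void
with_rump_process(void)	/* hypothetical helper name */
{
	/* Create a fresh rump process (RUMP_RFCFDG: no inherited file
	 * descriptors) and bind the calling host thread to its lwp. */
	rump_pub_lwproc_rfork(RUMP_RFCFDG);

	/* ... rump_sys_*() calls here run in that process context ... */

	/* Detach the host thread from the rump lwp again. */
	rump_pub_lwproc_releaselwp();
}
```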
And another, rather basic question: is there a prominent use case where
I would need multiple lwps in a rump kernel?
Thanks!
Martin
_______________________________________________
rumpkernel-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rumpkernel-users