Yes, quite substantial. On Firecracker, ZFS needs at least 50-60 ms to 
initialize on my machine, whereas a RoFS image takes about 1 ms. The smallest 
native example takes 5-6 ms to boot, including the RoFS mount, and ~10 ms in 
total to execute (the 10 ms includes the 5-6 ms of boot time). 

Sent from my iPhone

> On Sep 20, 2019, at 15:53, zhiting zhu <[email protected]> wrote:
> 
> Is there any difference in boot time between ZFS and RoFS? 
> 
>> On Fri, Sep 20, 2019 at 2:45 PM Henrique Fingler <[email protected]> wrote:
>>  I'll check that out.
>> 
>>> Instead of detecting what hypervisor we are dealing with, we should simply 
>>> act accordingly based on what features have been negotiated and agreed
>> 
>>  Yep, you're right. Five minutes after I hit Post I remembered what 
>> "negotiate" means. Whoops.
>> 
>>> Also, I have noticed with my simple patch OSv ends up allocating 256 
>>> buffers on Firecracker
>> 
>>  That's why I was trying to force the size of the recv queue to one. But 
>> this can be done in a smarter way in net::fill_rx_ring() as you said. I'll 
>> hack around and see what comes up.
>>  It also seems that Firecracker has the machinery to implement 
>> VIRTIO_NET_F_MRG_RXBUF, but I don't know how complicated it would be to 
>> finish it. I might check that out in a few weeks when I have some free time.
>> 
>>  Thanks for all the pointers!
>> 
>> 
>>> On Friday, September 20, 2019 at 1:58:42 PM UTC-5, Waldek Kozaczuk wrote:
>>> 
>>> 
>>>> On Friday, September 20, 2019 at 8:56:35 AM UTC-4, Waldek Kozaczuk wrote:
>>>> See my answers below.
>>>> 
>>>>> On Thursday, September 19, 2019 at 11:34:56 PM UTC-4, Henrique Fingler 
>>>>> wrote:
>>>>>  I agree that this is mostly something that should be fixed on the 
>>>>> Firecracker side. For now, if there's a way to detect the hypervisor, we 
>>>>> can switch on that. Personally I'm only using Firecracker, so I'll leave 
>>>>> this in.
>>>> Instead of detecting which hypervisor we are dealing with, we should 
>>>> simply act according to the features that have been negotiated between 
>>>> OSv (the driver) and the hypervisor (the device). We should simply follow 
>>>> the VirtIO spec, which says:
>>>> "5.1.6.3.1 Driver Requirements: Setting Up Receive Buffers
>>>> If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
>>>> If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or 
>>>> VIRTIO_NET_F_GUEST_UFO are negotiated, the driver SHOULD populate the 
>>>> receive queue(s) with buffers of at least 65562 bytes.
>>>> Otherwise, the driver SHOULD populate the receive queue(s) with buffers of 
>>>> at least 1526 bytes.
>>>> If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST be at least 
>>>> the size of the struct virtio_net_hdr."
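A minimal sketch of that rule (illustrative only: the feature-bit positions follow the virtio-net spec, but this is not OSv's actual driver code):

```cpp
#include <cstddef>
#include <cstdint>

// Feature bit positions per the virtio-net spec.
constexpr uint64_t F_GUEST_TSO4 = 1ull << 7;
constexpr uint64_t F_GUEST_TSO6 = 1ull << 8;
constexpr uint64_t F_GUEST_UFO  = 1ull << 10;
constexpr uint64_t F_MRG_RXBUF  = 1ull << 15;

// Minimum receive-buffer size mandated by spec section 5.1.6.3.1,
// given the negotiated feature bits.
size_t min_rx_buf_size(uint64_t negotiated) {
    if (negotiated & F_MRG_RXBUF)
        return 4096;   // page-sized buffers suffice; the device merges them
    if (negotiated & (F_GUEST_TSO4 | F_GUEST_TSO6 | F_GUEST_UFO))
        return 65562;  // a full GSO frame plus virtio_net_hdr
    return 1526;       // one Ethernet frame plus virtio_net_hdr
}
```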
>>>> Something similar to what Linux does here - 
>>>> https://github.com/torvalds/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/drivers/net/virtio_net.c#L3075-L3080.
>>>>  So we should only use 17-page buffers when we have to. One outstanding 
>>>> question: should we allocate a single contiguous block of 17 pages as one 
>>>> slot in the vring, or a chain of 17 single-page buffers (the latter is 
>>>> what Linux seems to be doing)? The slight advantage of the chained 
>>>> approach is that, under memory pressure, it is easier to find 17 
>>>> individual pages than one contiguous 68K block. But handling a chained 
>>>> buffer is going to be more complicated. I think the memory waste is the 
>>>> same either way.
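To make the two options concrete, here is a rough sketch, with a hypothetical sg_entry type standing in for a vring descriptor (this is not OSv's actual vring API):

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

constexpr size_t PAGE_SIZE = 4096;
constexpr size_t RX_PAGES  = 17;   // ceil(65562 / 4096)

// Hypothetical stand-in for one descriptor in the vring.
struct sg_entry { void* addr; size_t len; };

// Option 1: one contiguous 17-page block occupying a single descriptor.
// Simpler to consume, but a 68K contiguous allocation may fail under
// memory pressure.
std::vector<sg_entry> contiguous_rx_buf() {
    return { { std::malloc(RX_PAGES * PAGE_SIZE), RX_PAGES * PAGE_SIZE } };
}

// Option 2: 17 single pages chained as 17 descriptors (what Linux appears
// to do). Easier to allocate, but the driver must walk the chain when
// extracting a received frame.
std::vector<sg_entry> chained_rx_buf() {
    std::vector<sg_entry> chain;
    for (size_t i = 0; i < RX_PAGES; i++)
        chain.push_back({ std::malloc(PAGE_SIZE), PAGE_SIZE });
    return chain;
}
```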
>>>> 
>>>> Pekka, Nadav,
>>>> What do you think we should do?
>>>> 
>>>> 
>>>>> 
>>>>>  I wrote pretty much the same code, but instead of malloc I used 
>>>>> memory::alloc_hugepage(). However, the build got stuck when QEMU was 
>>>>> started during compilation - do you happen to know why? I thought we 
>>>>> also had to force the length of the receive queue to one; maybe that 
>>>>> part was what was breaking OSv under QEMU.
>>>>  
>>>> Most likely you built the image with the ZFS filesystem, which (at least 
>>>> for now) requires OSv to boot during the build so that files can be 
>>>> uploaded to it. You can avoid that by using the Read-Only FS (fs=rofs). 
>>>> Either way, we should use VIRTIO_NET_F_MRG_RXBUF if QEMU offers it 
>>>> (which it does right now), and your patch should not affect this.
>>> 
>>> Here is a capstan doc that should somewhat explain all 3 filesystems OSv 
>>> offers - 
>>> https://github.com/cloudius-systems/capstan/blob/master/Documentation/OsvFilesystem.md.
>>>  
>>> 
>>>> 
>>>>  
>>>> 
>>>>>  And the size I was allocating was 17 pages because the spec says 65562, 
>>>>> which is 16 pages plus 26 bytes.
>>>> You are right about 17 pages. 
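For the record, the page arithmetic can be checked at compile time (the constant names here are made up for illustration):

```cpp
#include <cstddef>

constexpr size_t PAGE      = 4096;
constexpr size_t MAX_FRAME = 65562;  // per spec section 5.1.6.3.1
constexpr size_t PAGES     = (MAX_FRAME + PAGE - 1) / PAGE;  // round up

static_assert(16 * PAGE == 65536, "16 pages hold 65536 bytes");
static_assert(MAX_FRAME - 16 * PAGE == 26, "65562 is 16 pages + 26 bytes");
static_assert(PAGES == 17, "so a full frame needs 17 pages");
```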
>>>>>  Did you also disable VIRTIO_NET_F_MRG_RXBUF in the feature mask or no, 
>>>>> since Firecracker just ignores it?
>>>> Firecracker "ignores" it in the sense that the feature is never offered 
>>>> by the device, so it simply does not get negotiated.
>>>>> 
>>>>>  I'll patch that in and test it out.
>>>>> 
>>>>>  Thanks!
>>>>> 
>>>>> 
>>>>>> On Thursday, September 19, 2019 at 9:58:49 PM UTC-5, Waldek Kozaczuk 
>>>>>> wrote:
>>>>>> This patch seems to do the job:
>>>>>> 
>>>>>> diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc
>>>>>> index e78fb3af..fe5f1ae0 100644
>>>>>> --- a/drivers/virtio-net.cc
>>>>>> +++ b/drivers/virtio-net.cc
>>>>>> @@ -375,6 +375,8 @@ void net::read_config()
>>>>>>      net_i("Features: %s=%d,%s=%d", "Host TSO ECN", _host_tso_ecn, 
>>>>>> "CSUM", _csum);
>>>>>>      net_i("Features: %s=%d,%s=%d", "Guest_csum", _guest_csum, "guest 
>>>>>> tso4", _guest_tso4);
>>>>>>      net_i("Features: %s=%d", "host tso4", _host_tso4);
>>>>>> +
>>>>>> +    printf("VIRTIO_NET_F_MRG_RXBUF: %d\n", _mergeable_bufs);
>>>>>>  }
>>>>>>  
>>>>>>  /**
>>>>>> @@ -591,16 +593,19 @@ void net::fill_rx_ring()
>>>>>>      vring* vq = _rxq.vqueue;
>>>>>>  
>>>>>>      while (vq->avail_ring_not_empty()) {
>>>>>> -        auto page = memory::alloc_page();
>>>>>> +        //auto page = memory::alloc_page();
>>>>>> +        auto page = malloc(16 * memory::page_size);
>>>>>>  
>>>>>>          vq->init_sg();
>>>>>> -        vq->add_in_sg(page, memory::page_size);
>>>>>> +        vq->add_in_sg(page, memory::page_size * 16);
>>>>>>          if (!vq->add_buf(page)) {
>>>>>> -            memory::free_page(page);
>>>>>> +            //memory::free_page(page);
>>>>>> +            free(page);
>>>>>>              break;
>>>>>>          }
>>>>>>          added++;
>>>>>>      }
>>>>>> +    printf("net: Allocated %d pages\n", added * 16);
>>>>>>  
>>>>>>      trace_virtio_net_fill_rx_ring_added(_ifn->if_index, added);
>>>>>>  
>>>>>> 
>>>>>> But for sure it is just a hack. I am not sure whether we should 
>>>>>> allocate 16 pages in one shot (which I am doing here) or create a 
>>>>>> single chained buffer made of 16 pages. I am also not sure how we 
>>>>>> would extract the data if chained. 
>>>>>> 
>>>>>> I have also found this (based on a comment in the firecracker code) - 
>>>>>> https://bugs.chromium.org/p/chromium/issues/detail?id=753630. As you 
>>>>>> can see, VIRTIO_NET_F_MRG_RXBUF is much more memory-efficient and 
>>>>>> flexible, which is what QEMU implements.
>>>>>> 
>>>>>> I am interested in what others think about how we should handle this 
>>>>>> properly.
>>>>>> 
>>>>>> Either way, I think it would not hurt to file an issue against 
>>>>>> Firecracker asking it to support VIRTIO_NET_F_MRG_RXBUF.
>>>>>> 
>>>>>> Waldek
>>>>>> 
>>>>>>> On Thursday, September 19, 2019 at 6:59:22 PM UTC-4, Henrique Fingler 
>>>>>>> wrote:
>>>>>>>  I'm trying to check whether it works on QEMU, but scripts/run and 
>>>>>>> capstan run set up the network differently than Firecracker's script.
>>>>>>>  With regular user networking (no "-n") it works. When I try running 
>>>>>>> it with "-n -b br0" or just "-n", the execution hangs after printing 
>>>>>>> the OSv version.
>>>>>>> 
>>>>>>>  I'm trying to manually hack in a single, larger buffer size for the 
>>>>>>> receive queue and disable VIRTIO_NET_F_MRG_RXBUF in the driver, just 
>>>>>>> to check what Firecracker does. But it seems that during compilation 
>>>>>>> a QEMU instance of the unikernel is launched. Is this a test? Can it 
>>>>>>> be disabled?
>>>>>>> 
>>>>>>>  Also, is there a way to find out which hypervisor OSv is running on 
>>>>>>> top of? That would help with switching between feature sets in 
>>>>>>> virtio-net.
>>>>>>> 
>>>>>>>  Thanks!
>>>>>>> 
>>>>>>> 
>>>>>>>> On Thursday, September 19, 2019 at 11:02:53 AM UTC-5, Waldek Kozaczuk 
>>>>>>>> wrote:
>>>>>>>> Most likely it is a bug on OSv side. It could be in the virtio-net 
>>>>>>>> features negotiation logic - 
>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L351-L378
>>>>>>>>  or 
>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L283-L297.
>>>>>>>>  
>>>>>>>> 
>>>>>>>> I also saw this comment in firecracker code - 
>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/master/devices/src/virtio/net.rs#L153-L154
>>>>>>>>  - which seems to indicate that VIRTIO_NET_F_MRG_RXBUF is NOT 
>>>>>>>> supported by firecracker - 
>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/f123988affa8f25683a7c26f7a48dd76e839a796/devices/src/virtio/net.rs#L705-L711?
>>>>>>>> 
>>>>>>>> This section of VirtIO spec would apply then:
>>>>>>>> 
>>>>>>>> "5.1.6.3.1 Driver Requirements: Setting Up Receive Buffers
>>>>>>>> If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
>>>>>>>> If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or 
>>>>>>>> VIRTIO_NET_F_GUEST_UFO are negotiated, the driver SHOULD populate the 
>>>>>>>> receive queue(s) with buffers of at least 65562 bytes.
>>>>>>>> Otherwise, the driver SHOULD populate the receive queue(s) with 
>>>>>>>> buffers of at least 1526 bytes.
>>>>>>>> If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST be at 
>>>>>>>> least the size of the struct virtio_net_hdr."
>>>>>>>> 
>>>>>>>> This makes me think that our receive buffers are only 1 page (4096 
>>>>>>>> bytes) large, so whenever Firecracker tries to send a buffer bigger 
>>>>>>>> than that, OSv bounces it. I think this OSv code applies - 
>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L587-L609.
>>>>>>>>  It seems the virtio ring buffers are always 1 page big - see the 
>>>>>>>> alloc_page call. 
>>>>>>>> 
>>>>>>>> So maybe on the OSv side we need to allow for bigger buffers (64K) 
>>>>>>>> when VIRTIO_NET_F_MRG_RXBUF is off, which would require changes to 
>>>>>>>> drivers/virtio-vring.cc. I wonder whether this feature is on under 
>>>>>>>> QEMU and that is why we never see this issue there - do we? It would 
>>>>>>>> be nice to run the same Python program under QEMU and see whether 
>>>>>>>> VIRTIO_NET_F_MRG_RXBUF is on or off.
>>>>>>>> 
>>>>>>>> This is all my speculation and I might be off so maybe others can shed 
>>>>>>>> more light on it.
>>>>>>>> 
>>>>>>>> Waldek
>>>>>>>> 
>>>>>>>>> On Thursday, September 19, 2019 at 12:09:19 AM UTC-4, Henrique 
>>>>>>>>> Fingler wrote:
>>>>>>>>>  How do I go about disabling GSO?
>>>>>>>>>  I think I found how to disable TSO (diff below), but I can't find 
>>>>>>>>> where to disable GSO. Disabling just TSO didn't fix it.
>>>>>>>>>  
>>>>>>>>>  The loop where Firecracker gets stuck (fn rx_single_frame) tries 
>>>>>>>>> to write an entire frame (7318 bytes) and notices that it doesn't 
>>>>>>>>> fit into the guest's descriptors.
>>>>>>>>>  It seems that when it fails to write the entire frame, it marks 
>>>>>>>>> the descriptors as used but then retries delivering the whole frame 
>>>>>>>>> again. Maybe the OSv buffer isn't big enough, so FC just loops 
>>>>>>>>> forever?
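A toy model of that suspected failure mode (this is a guess at the behavior, not Firecracker's actual code):

```cpp
// If the guest only ever posts 4096-byte rx buffers and mergeable buffers
// are off, a 7318-byte frame can never be written, so the device keeps
// retrying the same frame on fresh descriptors - forever in practice.
int tries_to_deliver(int frame_len, int buf_len, int max_tries) {
    for (int i = 1; i <= max_tries; i++) {
        if (frame_len <= buf_len)
            return i;   // frame fits: delivered on try i
        // otherwise: descriptor consumed, frame re-queued, try again
    }
    return -1;          // never fit within max_tries
}
```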
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> virtio-net.cc:
>>>>>>>>> 
>>>>>>>>>                    | (1 << VIRTIO_NET_F_STATUS)     \
>>>>>>>>>                   | (1 << VIRTIO_NET_F_CSUM)       \
>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_CSUM) \
>>>>>>>>> -                 | (1 << VIRTIO_NET_F_GUEST_TSO4) \
>>>>>>>>> +                 | (0 << VIRTIO_NET_F_GUEST_TSO4) \
>>>>>>>>>                   | (1 << VIRTIO_NET_F_HOST_ECN)   \
>>>>>>>>> -                 | (1 << VIRTIO_NET_F_HOST_TSO4)  \
>>>>>>>>> +                 | (0 << VIRTIO_NET_F_HOST_TSO4)  \
>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_ECN)
>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_UFO)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>  Thanks!
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Wednesday, September 18, 2019 at 8:23:21 PM UTC-5, Asias He wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Thu, Sep 19, 2019 at 7:06 AM Henrique Fingler 
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>  First of all, thank you for being active and helping out users!
>>>>>>>>>>> 
>>>>>>>>>>>  Here's my setup: I'm building a python3 image, with a script that 
>>>>>>>>>>> does
>>>>>>>>>>> 
>>>>>>>>>>>  response = urllib.request.urlopen("http://<a 1mb file>")
>>>>>>>>>>> 
>>>>>>>>>>>  The execution just hangs for a few seconds, then a storm of 
>>>>>>>>>>> warnings from Firecracker show up:
>>>>>>>>>>> 
>>>>>>>>>>> <A lot of the same warning>
>>>>>>>>>>> 2019-09-18T17:50:36.841517975 
>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257] Receiving 
>>>>>>>>>>> buffer is too small to hold frame of current size
>>>>>>>>>>> 2019-09-18T17:50:36.841529410 
>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257] Receiving 
>>>>>>>>>>> buffer is too small to hold frame of current size
>>>>>>>>>>> 2019-09-18T17:50:36.841569665 
>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257] Receiving 
>>>>>>>>>>> buffer is too small to hold frame of current size
>>>>>>>>>>> 2019-09-18T17:50:36.841584097 
>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257] Receiving 
>>>>>>>>>>> buffer is too small to hold frame of current size
>>>>>>>>>>> 2019-09-18T17:50:36.841656060 
>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257] Receiving 
>>>>>>>>>>> buffer is too small to hold frame of current size
>>>>>>>>>>> 
>>>>>>>>>>>  This is coming from here:   
>>>>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/master/devices/src/virtio/net.rs
>>>>>>>>>>> 
>>>>>>>>>>>  If the file is smaller, say 256 B, it works fine.
>>>>>>>>>>> 
>>>>>>>>>>>  Could this be a bug in the virtio implementation of OSv, or is 
>>>>>>>>>>> it a Firecracker thing?
>>>>>>>>>>>  I'll start to investigate the issue. I'm asking because you might 
>>>>>>>>>>> have seen this problem.
>>>>>>>>>> 
>>>>>>>>>> Try disabling GSO/TSO in the OSv virtio-net driver.
>>>>>>>>>> 
>>>>>>>>>>  
>>>>>>>>>>> 
>>>>>>>>>>>  Thanks!
>>>>>>>>>>> 
>>>>>>>>>>> -- 
>>>>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>>>>> Groups "OSv Development" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>>> send an email to [email protected].
>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>> https://groups.google.com/d/msgid/osv-dev/965f0cad-d074-4b18-b998-ffe5777851a2%40googlegroups.com.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> -- 
>>>>>>>>>> Asias
>> 
> 

