Did you try my latest patch from Friday that also frees memory?

On Sun, Oct 6, 2019 at 14:14 Henrique Fingler <[email protected]> wrote:

>  That makes sense. I did try using the hugepage alloc that I had before
> but it still crashed. Thank you for the response!
>  I'll try it out with *alloc_phys_contiguous_aligned.*
>
>
> On Friday, October 4, 2019 at 2:28:01 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> Hi,
>>
>> So there were a couple of issues with my and your patch:
>> 1) We should NOT be using straight *malloc*() to allocate 17 pages of
>> memory. It needs to be page-aligned, so *aligned_alloc()* is the correct
>> choice.
>> 2) I missed that we also need to call *free()* instead of *free_page()*
>> in the right place.
>>
>> Please see the improved patch - a better hack ;-). I still think we might
>> be allocating too many of these 17-page buffers. But at least we now
>> recognize whether VIRTIO_NET_F_MRG_RXBUF is on or off and choose the
>> buffer size accordingly.
>>
>> diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc
>> index e78fb3af..0df45dce 100644
>> --- a/drivers/virtio-net.cc
>> +++ b/drivers/virtio-net.cc
>> @@ -63,6 +63,7 @@ extern int maxnic;
>>  namespace virtio {
>>
>>  int net::_instance = 0;
>> +bool net::use_large_buffer = false;
>>
>>  #define net_tag "virtio-net"
>>  #define net_d(...)   tprintf_d(net_tag, __VA_ARGS__)
>> @@ -375,6 +376,9 @@ void net::read_config()
>>      net_i("Features: %s=%d,%s=%d", "Host TSO ECN", _host_tso_ecn,
>> "CSUM", _csum);
>>      net_i("Features: %s=%d,%s=%d", "Guest_csum", _guest_csum, "guest
>> tso4", _guest_tso4);
>>      net_i("Features: %s=%d", "host tso4", _host_tso4);
>> +
>> +    printf("VIRTIO_NET_F_MRG_RXBUF: %d\n", _mergeable_bufs);
>> +    use_large_buffer = !_mergeable_bufs;
>>  }
>>
>>  /**
>> @@ -473,7 +477,10 @@ void net::receiver()
>>              // Bad packet/buffer - discard and continue to the next one
>>              if (len < _hdr_size + ETHER_HDR_LEN) {
>>                  rx_drops++;
>> -                memory::free_page(page);
>> +                if (use_large_buffer)
>> +                   free(page);
>> +                else
>> +                   memory::free_page(page);
>>
>>                  continue;
>>              }
>> @@ -581,7 +588,13 @@ void net::free_buffer_and_refcnt(void* buffer, void*
>> refcnt)
>>  void net::do_free_buffer(void* buffer)
>>  {
>>      buffer = align_down(buffer, page_size);
>> -    memory::free_page(buffer);
>> +    if (use_large_buffer) {
>> +        printf("--> Freeing 17 pages: %p\n", buffer);
>> +        free(buffer);
>> +    } else {
>> +        printf("--> Freeing single page: %p\n", buffer);
>> +        memory::free_page(buffer);
>> +    }
>>  }
>>
>>  void net::fill_rx_ring()
>> @@ -591,12 +604,23 @@ void net::fill_rx_ring()
>>      vring* vq = _rxq.vqueue;
>>
>>      while (vq->avail_ring_not_empty()) {
>> -        auto page = memory::alloc_page();
>> +        void *page;
>> +        int pages_num = use_large_buffer ? 17 : 1;
>> +        if (use_large_buffer) {
>> +            page = aligned_alloc(memory::page_size, pages_num *
>> memory::page_size);
>> +            printf("--> Allocated 17 pages: %p\n", page);
>> +        } else {
>> +            page = memory::alloc_page();
>> +            printf("--> Allocated single page: %p\n", page);
>> +        }
>>
>>          vq->init_sg();
>> -        vq->add_in_sg(page, memory::page_size);
>> +        vq->add_in_sg(page, pages_num * memory::page_size);
>>          if (!vq->add_buf(page)) {
>> -            memory::free_page(page);
>> +            if (use_large_buffer)
>> +               free(page);
>> +            else
>> +               memory::free_page(page);
>>              break;
>>          }
>>          added++;
>> diff --git a/drivers/virtio-net.hh b/drivers/virtio-net.hh
>> index adc93b39..e6725231 100644
>> --- a/drivers/virtio-net.hh
>> +++ b/drivers/virtio-net.hh
>> @@ -220,6 +220,7 @@ public:
>>      static void free_buffer_and_refcnt(void* buffer, void* refcnt);
>>      static void free_buffer(iovec iov) { do_free_buffer(iov.iov_base); }
>>      static void do_free_buffer(void* buffer);
>> +    static bool use_large_buffer;
>>
>>      bool ack_irq();
>>
>>
>> I have tested it with your python example downloading a 5MB file 100
>> times, and all seems to be working fine. Feel free to remove the debug
>> statements :-)
>>
>> Waldek
>>
>> On Thursday, September 26, 2019 at 6:12:27 PM UTC-4, Henrique Fingler
>> wrote:
>>>
>>>  Waldek, I'm getting a general protection fault when doing some HTTP
>>> requests from OSv, do you think it might be related to the hack to make it
>>> work on Firecracker?
>>>
>>>  Here's the MWE; a few requests go through, then it faults.
>>>
>>> import urllib.request
>>> for i in range(10):
>>>     response = urllib.request.urlopen("http://192.168.0.20:9999/1.bin")
>>>     response.read()
>>>
>>>  Here's the trace:
>>>
>>> [registers]
>>> RIP: 0x00000000403e9fd6 <memory::page_range_allocator::remove_huge(memory::page_range&)+38>
>>> RFL: 0x0000000000010202  CS:  0x0000000000000008  SS:  0x0000000000000010
>>> RAX: 0x000000000000000d  RBX: 0xffff800003f2f000  RCX: 0x6d314e7578313731  RDX: 0x6270415369447065
>>> RSI: 0xffff800003f2f000  RDI: 0xffff800003f2f008  RBP: 0xffff800000074e20  R8:  0x00000000000000fc
>>> R9:  0xffff80000094d7e8  R10: 0x0000000000003f40  R11: 0x1144029210842110  R12: 0x0000000040911300
>>> R13: 0xffff80000094d7e8  R14: 0x0000000000003f40  R15: 0x1144029210842110  RSP: 0xffff800000074e00
>>> general protection fault
>>>
>>>
>>> [backtrace]
>>> 0x000000004039cabc <general_protection+140>
>>> 0x0000000040399fa2 <???+1077518242>
>>> 0x00000000403e4d66 <memory::page_range_allocator::free(memory::page_range*)+166>
>>> 0x00000000403e4ecb <memory::page_pool::l2::free_batch(memory::page_pool::page_batch&)+91>
>>> 0x00000000403e5118 <memory::page_pool::l2::unfill()+504>
>>> 0x00000000403e6776 <memory::page_pool::l2::fill_thread()+358>
>>> 0x00000000403ea7db <std::_Function_handler<void (), memory::page_pool::l2::l2()::{lambda()#1}>::_M_invoke(std::_Any_data const&)+11>
>>> 0x00000000403f9746 <thread_main_c+38>
>>> 0x000000004039af62 <???+1077522274>
>>>
>>>
>>>  The hack in virtio-net.cc is similar to yours:
>>>
>>> diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc
>>> index e78fb3af..0065e8d7 100644
>>> --- a/drivers/virtio-net.cc
>>> +++ b/drivers/virtio-net.cc
>>> @@ -590,13 +590,31 @@ void net::fill_rx_ring()
>>>      int added = 0;
>>>      vring* vq = _rxq.vqueue;
>>>
>>> +#define HACKQ 1
>>> +
>>>      while (vq->avail_ring_not_empty()) {
>>> -        auto page = memory::alloc_page();
>>> +
>>> +        #if HACKQ
>>> +            auto page = malloc(17 * memory::page_size);
>>> +        #else
>>> +            auto page = memory::alloc_page();
>>> +        #endif
>>>
>>>          vq->init_sg();
>>> -        vq->add_in_sg(page, memory::page_size);
>>> +
>>> +        #if HACKQ
>>> +            vq->add_in_sg(page, 17 * memory::page_size);
>>> +        #else
>>> +            vq->add_in_sg(page, memory::page_size);
>>> +        #endif
>>> +
>>>          if (!vq->add_buf(page)) {
>>> -            memory::free_page(page);
>>> +            #if HACKQ
>>> +                free(page);
>>> +            #else
>>> +                memory::free_page(page);
>>> +            #endif
>>> +
>>>              break;
>>>          }
>>>          added++;
>>>
>>>
>>>  Maybe it's related to the size of the buffer allocated for virtio? I'll
>>> try to force it to size one and see what happens.
>>>
>>>  Best.
>>>
>>>
>>> On Friday, September 20, 2019 at 4:01:45 PM UTC-5, Waldek Kozaczuk wrote:
>>>>
>>>> Yes, quite substantial. On Firecracker, ZFS needs at least 50-60 ms to
>>>> initialize on my machine, whereas a RoFS image takes 1 millisecond - the
>>>> smallest native example takes 5-6 ms to boot including the RoFS mount and
>>>> ~10 ms in total to execute (the 10 ms includes that 5-6 ms of boot time).
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Sep 20, 2019, at 15:53, zhiting zhu <[email protected]> wrote:
>>>>
>>>> Is there any difference on boot time between zfs and rofs?
>>>>
>>>> On Fri, Sep 20, 2019 at 2:45 PM Henrique Fingler <[email protected]>
>>>> wrote:
>>>>
>>>>>  I'll check that out.
>>>>>
>>>>> Instead of detecting what hypervisor we are dealing with, we should
>>>>>> simply act accordingly based on what features have been negotiated and
>>>>>> agreed
>>>>>
>>>>>
>>>>>  Yep, you're right. Five minutes after I hit Post I remembered what
>>>>> "negotiate" means. Whoops.
>>>>>
>>>>> Also, I have noticed with my simple patch OSv ends up allocating 256
>>>>>> buffers on Firecracker
>>>>>
>>>>>
>>>>>  That's why I was trying to force the size of the recv queue to one.
>>>>> But this can be done in a smarter way in net::fill_rx_ring() as you said.
>>>>> I'll hack around and see what comes up.
>>>>>  It also seems that Firecracker has the machinery to implement 
>>>>> VIRTIO_NET_F_MRG_RXBUF,
>>>>> but I don't know how complicated it would be to finish it. I might check
>>>>> that out in a few weeks when I have some free time.
>>>>>
>>>>>  Thanks for all the pointers!
>>>>>
>>>>>
>>>>> On Friday, September 20, 2019 at 1:58:42 PM UTC-5, Waldek Kozaczuk
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Friday, September 20, 2019 at 8:56:35 AM UTC-4, Waldek Kozaczuk
>>>>>> wrote:
>>>>>>>
>>>>>>> See my answers below.
>>>>>>>
>>>>>>> On Thursday, September 19, 2019 at 11:34:56 PM UTC-4, Henrique
>>>>>>> Fingler wrote:
>>>>>>>>
>>>>>>>>  I agree that this is mostly a thing that should be done on
>>>>>>>> Firecracker. For now, if there's a way to detect the hypervisor, we can
>>>>>>>> switch on that. Personally I'm only using Firecracker, so I'll leave
>>>>>>>> this in.
>>>>>>>>
>>>>>>> Instead of detecting what hypervisor we are dealing with, we should
>>>>>>> simply act accordingly based on what features have been negotiated and
>>>>>>> agreed between OSv (driver) and hypervisor (device). We should simply
>>>>>>> follow the VirtIo spec as it says here:
>>>>>>> "5.1.6.3.1 Driver Requirements: Setting Up Receive Buffers
>>>>>>>
>>>>>>>    - If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
>>>>>>>       - If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or
>>>>>>>       VIRTIO_NET_F_GUEST_UFO are negotiated, the driver SHOULD populate 
>>>>>>> the
>>>>>>>       receive queue(s) with buffers of at least 65562 bytes.
>>>>>>>       - Otherwise, the driver SHOULD populate the receive queue(s)
>>>>>>>       with buffers of at least 1526 bytes.
>>>>>>>    - If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST be
>>>>>>>    at least the size of the struct virtio_net_hdr."
>>>>>>>
>>>>>>> Something similar to what Linux does here -
>>>>>>> https://github.com/torvalds/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/drivers/net/virtio_net.c#L3075-L3080.
>>>>>>> So only use 17-page-long buffers when we have to. One outstanding
>>>>>>> question remains: shall we allocate a single contiguous block of 17
>>>>>>> pages of memory as a *single slot* in the vring, or a *chain of 17
>>>>>>> single-page buffers* for each large buffer? (The latter is what Linux
>>>>>>> seems to be doing.) The slight advantage of the chained approach is
>>>>>>> that under memory pressure it is easier to find 17 separate pages than
>>>>>>> one contiguous 68K block. But handling a chained buffer is going to be
>>>>>>> more complicated. I think the memory waste is the same.
>>>>>>>
>>>>>>> Pekka, Nadav,
>>>>>>> What do you think we should do?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>  I wrote pretty much the same code, but instead of *malloc* I used
>>>>>>>> *memory::alloc_hugepage()*, and the build got stuck when qemu was
>>>>>>>> started - do you happen to know the reason? I thought we also had to
>>>>>>>> force the length of the receive queue to one; maybe that part was what
>>>>>>>> was breaking OSv under qemu.
>>>>>>>>
>>>>>>>
>>>>>>> Most likely you built the image with the ZFS filesystem, which at
>>>>>>> least for now requires OSv to boot so that files can be uploaded to it.
>>>>>>> You can avoid it by using the Read-Only FS (fs=rofs). Either way, we
>>>>>>> should use VIRTIO_NET_F_MRG_RXBUF if QEMU offers it (which happens
>>>>>>> right now), and your patch should not affect this.
>>>>>>>
>>>>>>
>>>>>> Here is a capstan doc that should somewhat explain all 3 filesystems
>>>>>> OSv offers -
>>>>>> https://github.com/cloudius-systems/capstan/blob/master/Documentation/OsvFilesystem.md
>>>>>> .
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  And the size I was allocating was 17 pages because the spec says
>>>>>>>> 65562, which is 16 pages plus 26 bytes.
>>>>>>>>
>>>>>>> You are right about 17 pages.
>>>>>>>
>>>>>>>>  Did you also disable VIRTIO_NET_F_MRG_RXBUF in the feature mask or
>>>>>>>> no, since Firecracker just ignores it?
>>>>>>>>
>>>>>>> Firecracker "ignores" it in the sense that it never offers the
>>>>>>> feature, so it is simply not negotiated.
>>>>>>>
>>>>>>>>
>>>>>>>>  I'll patch that in and test it out.
>>>>>>>>
>>>>>>>>  Thanks!
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thursday, September 19, 2019 at 9:58:49 PM UTC-5, Waldek
>>>>>>>> Kozaczuk wrote:
>>>>>>>>>
>>>>>>>>> This patch seems to do the job:
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc
>>>>>>>>> index e78fb3af..fe5f1ae0 100644
>>>>>>>>> --- a/drivers/virtio-net.cc
>>>>>>>>> +++ b/drivers/virtio-net.cc
>>>>>>>>> @@ -375,6 +375,8 @@ void net::read_config()
>>>>>>>>>      net_i("Features: %s=%d,%s=%d", "Host TSO ECN", _host_tso_ecn,
>>>>>>>>> "CSUM", _csum);
>>>>>>>>>      net_i("Features: %s=%d,%s=%d", "Guest_csum", _guest_csum,
>>>>>>>>> "guest tso4", _guest_tso4);
>>>>>>>>>      net_i("Features: %s=%d", "host tso4", _host_tso4);
>>>>>>>>> +
>>>>>>>>> +    printf("VIRTIO_NET_F_MRG_RXBUF: %d\n", _mergeable_bufs);
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  /**
>>>>>>>>> @@ -591,16 +593,19 @@ void net::fill_rx_ring()
>>>>>>>>>      vring* vq = _rxq.vqueue;
>>>>>>>>>
>>>>>>>>>      while (vq->avail_ring_not_empty()) {
>>>>>>>>> -        auto page = memory::alloc_page();
>>>>>>>>> +        //auto page = memory::alloc_page();
>>>>>>>>> +        auto page = malloc(16 * memory::page_size);
>>>>>>>>>
>>>>>>>>>          vq->init_sg();
>>>>>>>>> -        vq->add_in_sg(page, memory::page_size);
>>>>>>>>> +        vq->add_in_sg(page, memory::page_size * 16);
>>>>>>>>>          if (!vq->add_buf(page)) {
>>>>>>>>> -            memory::free_page(page);
>>>>>>>>> +            //memory::free_page(page);
>>>>>>>>> +            free(page);
>>>>>>>>>              break;
>>>>>>>>>          }
>>>>>>>>>          added++;
>>>>>>>>>      }
>>>>>>>>> +    printf("net: Allocated %d pages\n", added * 16);
>>>>>>>>>
>>>>>>>>>      trace_virtio_net_fill_rx_ring_added(_ifn->if_index, added);
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> But for sure it is just a hack. I am not sure if we should actually
>>>>>>>>> allocate 16 pages in one shot (which I am doing here) vs. create a
>>>>>>>>> single chained buffer made of 16 pages. I am also not sure how we
>>>>>>>>> should extract the data if chained.
>>>>>>>>>
>>>>>>>>> I have also found this (based on a comment in the firecracker code) -
>>>>>>>>> https://bugs.chromium.org/p/chromium/issues/detail?id=753630. As
>>>>>>>>> you can see, VIRTIO_NET_F_MRG_RXBUF is much more memory-efficient
>>>>>>>>> and flexible, which is what QEMU implements.
>>>>>>>>>
>>>>>>>>> I am interested in what others think how we should handle this
>>>>>>>>> properly.
>>>>>>>>>
>>>>>>>>> Either way, I think it would not hurt to create an issue against
>>>>>>>>> Firecracker asking for VIRTIO_NET_F_MRG_RXBUF support.
>>>>>>>>>
>>>>>>>>> Waldek
>>>>>>>>>
>>>>>>>>> On Thursday, September 19, 2019 at 6:59:22 PM UTC-4, Henrique
>>>>>>>>> Fingler wrote:
>>>>>>>>>>
>>>>>>>>>>  I'm trying to check if it works on qemu, but scripts/run and
>>>>>>>>>> capstan run set up the network differently than Firecracker's script.
>>>>>>>>>>  With the regular user networking (no "-n") it works. When I try
>>>>>>>>>> running it with "-n -b br0" or just "-n", the execution hangs after
>>>>>>>>>> printing the OSv version.
>>>>>>>>>>
>>>>>>>>>>  I'm trying to manually hack in the allocation of a single but
>>>>>>>>>> larger buffer for the receive queue and disable VIRTIO_NET_F_MRG_RXBUF
>>>>>>>>>> in the driver just to check what Firecracker does, but it seems that
>>>>>>>>>> during compilation a qemu instance of the unikernel is launched. Is
>>>>>>>>>> this a test? Can it be disabled?
>>>>>>>>>>
>>>>>>>>>>  Also, is there a way to find which hypervisor OSv is running on
>>>>>>>>>> top of? This would help switching between feature sets in virtio-net.
>>>>>>>>>>
>>>>>>>>>>  Thanks!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thursday, September 19, 2019 at 11:02:53 AM UTC-5, Waldek
>>>>>>>>>> Kozaczuk wrote:
>>>>>>>>>>>
>>>>>>>>>>> Most likely it is a bug on the OSv side. It could be in the
>>>>>>>>>>> virtio-net feature negotiation logic -
>>>>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L351-L378
>>>>>>>>>>>  or
>>>>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L283-L297
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>> I also saw this comment in firecracker code -
>>>>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/master/devices/src/virtio/net.rs#L153-L154
>>>>>>>>>>>  -
>>>>>>>>>>> which seems to indicate that VIRTIO_NET_F_MRG_RXBUF is NOT
>>>>>>>>>>> supported by firecracker -
>>>>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/f123988affa8f25683a7c26f7a48dd76e839a796/devices/src/virtio/net.rs#L705-L711
>>>>>>>>>>> ?
>>>>>>>>>>>
>>>>>>>>>>> This section of VirtIO spec would apply then:
>>>>>>>>>>>
>>>>>>>>>>> "5.1.6.3.1 Driver Requirements: Setting Up Receive Buffers
>>>>>>>>>>>
>>>>>>>>>>>    - If VIRTIO_NET_F_MRG_RXBUF is not negotiated:
>>>>>>>>>>>       - If VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or
>>>>>>>>>>>       VIRTIO_NET_F_GUEST_UFO are negotiated, the driver SHOULD 
>>>>>>>>>>> populate the
>>>>>>>>>>>       receive queue(s) with buffers of at least 65562 bytes.
>>>>>>>>>>>       - Otherwise, the driver SHOULD populate the receive
>>>>>>>>>>>       queue(s) with buffers of at least 1526 bytes.
>>>>>>>>>>>    - If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer MUST
>>>>>>>>>>>    be at least the size of the struct virtio_net_hdr."
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This makes me think that our receive buffers are only 1 page
>>>>>>>>>>> (4096 bytes) large, so whenever Firecracker tries to send a buffer
>>>>>>>>>>> bigger than that, OSv bounces it. I think this OSv code applies -
>>>>>>>>>>> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-net.cc#L587-L609.
>>>>>>>>>>> It seems the virtio ring buffers are always 1 page big - see the
>>>>>>>>>>> alloc_page call.
>>>>>>>>>>>
>>>>>>>>>>> So maybe on the OSv side we need to allow for bigger buffers (64K)
>>>>>>>>>>> when VIRTIO_NET_F_MRG_RXBUF is off, which would require changes
>>>>>>>>>>> to drivers/virtio-vring.cc. I wonder if on QEMU this feature is
>>>>>>>>>>> on and that is why we never see this issue on QEMU - do we? It
>>>>>>>>>>> would be nice to run the same Python program on qemu and see
>>>>>>>>>>> whether VIRTIO_NET_F_MRG_RXBUF is on or off.
>>>>>>>>>>>
>>>>>>>>>>> This is all my speculation and I might be off so maybe others
>>>>>>>>>>> can shed more light on it.
>>>>>>>>>>>
>>>>>>>>>>> Waldek
>>>>>>>>>>>
>>>>>>>>>>> On Thursday, September 19, 2019 at 12:09:19 AM UTC-4, Henrique
>>>>>>>>>>> Fingler wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>  How do I go about disabling GSO?
>>>>>>>>>>>>  I think I found how to disable TSO (diff below), but I can't
>>>>>>>>>>>> find where to disable GSO. Disabling just TSO didn't fix it.
>>>>>>>>>>>>
>>>>>>>>>>>>  The loop where Firecracker gets stuck (fn rx_single_frame)
>>>>>>>>>>>> tries to write an entire frame (7318 bytes) and notices it doesn't
>>>>>>>>>>>> fit into all the descriptors of the guest.
>>>>>>>>>>>>  It seems that if it fails to write the entire frame, it marks the
>>>>>>>>>>>> descriptors as used but then retries delivering the whole frame
>>>>>>>>>>>> again. Maybe the OSv buffer isn't big enough and FC just loops
>>>>>>>>>>>> forever?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> virtio-net.cc:
>>>>>>>>>>>>
>>>>>>>>>>>>                    | (1 << VIRTIO_NET_F_STATUS)     \
>>>>>>>>>>>>                   | (1 << VIRTIO_NET_F_CSUM)       \
>>>>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_CSUM) \
>>>>>>>>>>>> -                 | (1 << VIRTIO_NET_F_GUEST_TSO4) \
>>>>>>>>>>>> +                 | (0 << VIRTIO_NET_F_GUEST_TSO4) \
>>>>>>>>>>>>                   | (1 << VIRTIO_NET_F_HOST_ECN)   \
>>>>>>>>>>>> -                 | (1 << VIRTIO_NET_F_HOST_TSO4)  \
>>>>>>>>>>>> +                 | (0 << VIRTIO_NET_F_HOST_TSO4)  \
>>>>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_ECN)   \
>>>>>>>>>>>>                   | (1 << VIRTIO_NET_F_GUEST_UFO)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wednesday, September 18, 2019 at 8:23:21 PM UTC-5, Asias He
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Sep 19, 2019 at 7:06 AM Henrique Fingler <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>  First of all, thank you for being active and helping out
>>>>>>>>>>>>>> users!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Here's my setup: I'm building a python3 image, with a script
>>>>>>>>>>>>>> that does
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * response = urllib.request.urlopen("http://<a 1mb file>")*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  The execution just hangs for a few seconds, then a storm of
>>>>>>>>>>>>>> warnings from Firecracker shows up:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <A lot of the same warning>
>>>>>>>>>>>>>> 2019-09-18T17:50:36.841517975
>>>>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257]
>>>>>>>>>>>>>> Receiving buffer is too small to hold frame of current size
>>>>>>>>>>>>>> 2019-09-18T17:50:36.841529410
>>>>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257]
>>>>>>>>>>>>>> Receiving buffer is too small to hold frame of current size
>>>>>>>>>>>>>> 2019-09-18T17:50:36.841569665
>>>>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257]
>>>>>>>>>>>>>> Receiving buffer is too small to hold frame of current size
>>>>>>>>>>>>>> 2019-09-18T17:50:36.841584097
>>>>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257]
>>>>>>>>>>>>>> Receiving buffer is too small to hold frame of current size
>>>>>>>>>>>>>> 2019-09-18T17:50:36.841656060
>>>>>>>>>>>>>> [anonymous-instance:WARN:devices/src/virtio/net.rs:257]
>>>>>>>>>>>>>> Receiving buffer is too small to hold frame of current size
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  This is coming from here:
>>>>>>>>>>>>>> https://github.com/firecracker-microvm/firecracker/blob/master/devices/src/virtio/net.rs
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  If the file is smaller, let's say 256B, it works fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Could this be a bug in the virtio implementation of OSv, or
>>>>>>>>>>>>>> is it a Firecracker thing?
>>>>>>>>>>>>>>  I'll start to investigate the issue. I'm asking because you
>>>>>>>>>>>>>> might have seen this problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Try disabling GSO/TSO in the OSv virtio-net driver.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>>>> Google Groups "OSv Development" group.
>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>>>> https://groups.google.com/d/msgid/osv-dev/965f0cad-d074-4b18-b998-ffe5777851a2%40googlegroups.com
>>>>>>>>>>>>>> <https://groups.google.com/d/msgid/osv-dev/965f0cad-d074-4b18-b998-ffe5777851a2%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Asias
>>>>>>>>>>>>>

