By the way, I wonder if this also fixes
https://github.com/cloudius-systems/osv/issues/755, which reports a crash
in a similar area of the code.

Justin (CC'ed) might be able to confirm whether he can still reproduce issue 755.


--
Nadav Har'El
[email protected]

On Thu, Sep 15, 2016 at 2:18 PM, Player, Timmons <[email protected]> wrote:

> The l1::refill function can call into the l2::alloc_page_batch function,
> which drops the preempt lock if it needs to call the l2::refill
> function.  If multiple threads call into l1::refill and get blocked there,
> then they all push their new pages into the l1 page array when they resume
> and may overflow it.
>
> There are no sanity checks on the value of l1::nr, so once it overflows,
> things continue as if nothing were wrong.  The problem I observed was that
> the page array would overflow, and then many memory pages would be freed
> via l1::free_page_local and pushed into the page array before l1::unfill
> was called.  This caused the page array to spill over into the next page
> of memory.
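The race described above comes down to an unchecked array push. A minimal sketch of the difference between the old unchecked push and a bounds-checked push (the names here are illustrative, not the actual OSv code):

```cpp
#include <array>
#include <cstddef>

// Hypothetical miniature of the l1 pool's page array.
struct tiny_pool {
    static constexpr std::size_t max = 4;
    std::array<void*, max> pages{};
    std::size_t nr = 0;

    // Unchecked push, as in the original array-based pool: nothing stops
    // nr from running past max, so a late batch of pushes writes into
    // whatever memory sits after the array.
    void push_unchecked(void* p) { pages[nr++] = p; }

    // Checked push, the contract a fixed-size ring provides: refuse the
    // page instead of overflowing.
    bool push_checked(void* p) {
        if (nr >= max) {
            return false;
        }
        pages[nr++] = p;
        return true;
    }
};
```

With the checked variant, a caller that resumes after a lock drop and finds the pool already full gets `false` back and must return the pages elsewhere, rather than silently corrupting adjacent memory.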
>
> Timmons
>
> From: Paweł Dziepak <[email protected]>
> Date: Wednesday, September 14, 2016 at 7:15 PM
> To: Timmons Player <[email protected]>
> Cc: Osv Dev <[email protected]>
> Subject: Re: [PATCH] memory: Prevent array overflow in l1 page buffer
> pool.
>
>
>
> On 9 September 2016 at 18:12, Timmons C. Player <[email protected]> wrote:
>
>> Under high levels of memory page churn, the array of pages in the
>> l1 page buffer pool can overflow and overwrite the data in adjacent
>> memory pages.  This change replaces the array with a fixed sized ring.
>>
>> It also removes the looping behavior found in the refill and unfill
>> functions.  Those functions are called by multiple threads and now
>> only do the minimum amount of work necessary to allow the callers to
>> continue.  Previously, the looping behavior in the refill thread caused
>> the array overflow due to a race on the array size variable.
>>
>
> Could you elaborate more on that race condition? I don't really understand
> why it can happen. L1 page pool is per-CPU and all accesses to it are done
> with preemption disabled so it appears to be sufficiently protected.
>
>
>>
>> The pool fill thread is now solely responsible for maintaining
>> the target pool size.
>>
>> Signed-off-by: Timmons C. Player <[email protected]>
>> ---
>>  core/mempool.cc | 67 ++++++++++++++++++++++++++++++++-------------------------
>>  1 file changed, 38 insertions(+), 29 deletions(-)
>>
>> diff --git a/core/mempool.cc b/core/mempool.cc
>> index 50c938f..fad046d 100644
>> --- a/core/mempool.cc
>> +++ b/core/mempool.cc
>> @@ -1099,9 +1099,12 @@ struct l1 {
>>      }
>>      static void* alloc_page_local();
>>      static bool free_page_local(void* v);
>> -    void* pop() { return _pages[--nr]; }
>> -    void push(void* page) { _pages[nr++] = page; }
>> -    void* top() { return _pages[nr - 1]; }
>> +    void* pop() {
>> +        void *page = nullptr;
>> +        return (_pages.pop(page) ? page : nullptr);
>> +    }
>> +    bool push(void *page) { return _pages.push(page); }
>> +    unsigned size() { return _pages.size(); }
>>      void wake_thread() { _fill_thread.wake(); }
>>      static void fill_thread();
>>      static void refill();
>> @@ -1110,11 +1113,12 @@ struct l1 {
>>      static constexpr size_t max = 512;
>>      static constexpr size_t watermark_lo = max * 1 / 4;
>>      static constexpr size_t watermark_hi = max * 3 / 4;
>> -    size_t nr = 0;
>>
>>  private:
>>      sched::thread _fill_thread;
>> -    void* _pages[max];
>> +
>> +    typedef ring_spsc<void *, max> page_ring;
>> +    page_ring _pages;
>>  };
>>
>>  struct page_batch {
>> @@ -1156,7 +1160,7 @@ public:
>>          page_batch* pb;
>>          while (!(pb = try_alloc_page_batch()) &&
>>                 // Check again since someone else might change pbuf.nr when we sleep
>> -                (pbuf.nr + page_batch::nr_pages < pbuf.max / 2)) {
>> +                (pbuf.size() + page_batch::nr_pages < pbuf.max / 2)) {
>>              WITH_LOCK(migration_lock) {
>>                  DROP_LOCK(preempt_lock) {
>>                      refill();
>> @@ -1243,14 +1247,19 @@ void l1::fill_thread()
>>      for (;;) {
>>          sched::thread::wait_until([&] {
>>                  WITH_LOCK(preempt_lock) {
>> -                    return pbuf.nr < pbuf.watermark_lo || pbuf.nr > pbuf.watermark_hi;
>> +                    auto nr = pbuf.size();
>> +                    return nr < pbuf.watermark_lo || nr > pbuf.watermark_hi;
>>                  }
>>          });
>> -        if (pbuf.nr < pbuf.watermark_lo) {
>> -            refill();
>> +        if (pbuf.size() < pbuf.watermark_lo) {
>> +            while (pbuf.size() < pbuf.max / 2) {
>> +                refill();
>> +            }
>>          }
>> -        if (pbuf.nr > pbuf.watermark_hi) {
>> -            unfill();
>> +        if (pbuf.size() > pbuf.watermark_hi) {
>> +            while (pbuf.size() > pbuf.max / 2) {
>> +                unfill();
>> +            }
>>          }
>>      }
>>  }
>> @@ -1259,12 +1268,17 @@ void l1::refill()
>>  {
>>      SCOPE_LOCK(preempt_lock);
>>      auto& pbuf = get_l1();
>> -    while (pbuf.nr + page_batch::nr_pages < pbuf.max / 2) {
>> -        auto* pb = global_l2.alloc_page_batch(pbuf);
>> -        if (pb) {
>> +    auto* pb = global_l2.alloc_page_batch(pbuf);
>> +    if (pb) {
>> +        // Another thread might have filled the ring while we waited for
>> +        // the page batch.  Make sure there is enough room to add the pages
>> +        // we just acquired, otherwise return them.
>> +        if (pbuf.size() + page_batch::nr_pages <= pbuf.max) {
>>              for (auto& page : pb->pages) {
>> -                pbuf.push(page);
>> +                assert(pbuf.push(page));
>>              }
>> +        } else {
>> +            global_l2.free_page_batch(pb);
>>          }
>>      }
>>  }
>> @@ -1273,10 +1287,12 @@ void l1::unfill()
>>  {
>>      SCOPE_LOCK(preempt_lock);
>>      auto& pbuf = get_l1();
>> -    while (pbuf.nr > page_batch::nr_pages + pbuf.max / 2) {
>> -        auto* pb = static_cast<page_batch*>(pbuf.top());
>> -        for (size_t i = 0 ; i < page_batch::nr_pages; i++) {
>> -            pb->pages[i] = pbuf.pop();
>> +    // Don't bother unfilling unless we are actually above our target size.
>> +    if (pbuf.size() > pbuf.max / 2) {
>> +        auto* pb = static_cast<page_batch*>(pbuf.pop());
>> +        pb->pages[0] = pb;
>> +        for (size_t i = 1 ; i < page_batch::nr_pages; i++) {
>> +            assert(pb->pages[i] = pbuf.pop());
>>          }
>>          global_l2.free_page_batch(pb);
>>      }
>> @@ -1286,12 +1302,9 @@ void* l1::alloc_page_local()
>>  {
>>      SCOPE_LOCK(preempt_lock);
>>      auto& pbuf = get_l1();
>> -    if (pbuf.nr < pbuf.watermark_lo) {
>> +    if (pbuf.size() < pbuf.watermark_lo) {
>>          pbuf.wake_thread();
>>      }
>> -    if (pbuf.nr == 0) {
>> -        return nullptr;
>> -    }
>>      return pbuf.pop();
>>  }
>>
>> @@ -1299,14 +1312,10 @@ bool l1::free_page_local(void* v)
>>  {
>>      SCOPE_LOCK(preempt_lock);
>>      auto& pbuf = get_l1();
>> -    if (pbuf.nr > pbuf.watermark_hi) {
>> +    if (pbuf.size() > pbuf.watermark_hi) {
>>          pbuf.wake_thread();
>>      }
>> -    if (pbuf.nr == pbuf.max) {
>> -        return false;
>> -    }
>> -    pbuf.push(v);
>> -    return true;
>> +    return pbuf.push(v);
>>  }
>>
>>  // Global thread for L2 page pool
>> --
>> 2.7.4
>>
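For readers unfamiliar with the ring_spsc type the patch switches to, here is a single-threaded sketch of the bounded push/pop/size contract the patch relies on. This is an illustration only: OSv's actual ring_spsc is a lock-free single-producer/single-consumer ring built on atomics, which this sketch omits.

```cpp
#include <cstddef>

// Fixed-capacity FIFO ring: push() fails instead of overflowing and
// pop() fails instead of underflowing -- the property that prevents the
// l1 page array spill described in this thread.
template <typename T, std::size_t N>
class bounded_ring {
public:
    bool push(T item) {
        if (_count == N) {
            return false;              // full: caller must dispose of item
        }
        _buf[(_head + _count) % N] = item;
        ++_count;
        return true;
    }
    bool pop(T& out) {
        if (_count == 0) {
            return false;              // empty: nothing to hand out
        }
        out = _buf[_head];
        _head = (_head + 1) % N;
        --_count;
        return true;
    }
    std::size_t size() const { return _count; }

private:
    T _buf[N] = {};
    std::size_t _head = 0;
    std::size_t _count = 0;
};
```

Because push() reports failure, free_page_local can simply return its result, and refill can check size() before committing a whole page batch, exactly as the patch does.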
>> --
>> You received this message because you are subscribed to the Google Groups
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
