Re: [PATCH v19 3/7] xbitmap: add more operations

2017-12-13 Thread Wei Wang

On 12/13/2017 10:16 PM, Tetsuo Handa wrote:

Wei Wang wrote:

On 12/12/2017 09:20 PM, Tetsuo Handa wrote:

Wei Wang wrote:

+void xb_clear_bit_range(struct xb *xb, unsigned long start, unsigned long end)
+{
+   struct radix_tree_root *root = &xb->xbrt;
+   struct radix_tree_node *node;
+   void **slot;
+   struct ida_bitmap *bitmap;
+   unsigned int nbits;
+
+   for (; start < end; start = (start | (IDA_BITMAP_BITS - 1)) + 1) {
+   unsigned long index = start / IDA_BITMAP_BITS;
+   unsigned long bit = start % IDA_BITMAP_BITS;
+
+   bitmap = __radix_tree_lookup(root, index, &node, &slot);
+   if (radix_tree_exception(bitmap)) {
+   unsigned long ebit = bit + 2;
+   unsigned long tmp = (unsigned long)bitmap;
+
+   nbits = min(end - start + 1, BITS_PER_LONG - ebit);
+
+   if (ebit >= BITS_PER_LONG)

What happens if we hit this "continue;" when "index == ULONG_MAX / 
IDA_BITMAP_BITS" ?

Thanks. I also improved the test case for this. I plan to change the
implementation a little to avoid such an overflow (it has passed the test
cases that I have; just posting it out for another set of eyes):

{
...
  unsigned long idx = start / IDA_BITMAP_BITS;
  unsigned long bit = start % IDA_BITMAP_BITS;
  unsigned long idx_end = end / IDA_BITMAP_BITS;
  unsigned long ret;

  for (idx = start / IDA_BITMAP_BITS; idx <= idx_end; idx++) {
  unsigned long ida_start = idx * IDA_BITMAP_BITS;

  bitmap = __radix_tree_lookup(root, idx, &node, &slot);
  if (radix_tree_exception(bitmap)) {
  unsigned long tmp = (unsigned long)bitmap;
  unsigned long ebit = bit + 2;

  if (ebit >= BITS_PER_LONG)
  continue;

Will you please eliminate the exception path?


Please first see my explanation below; I'll try to help you understand 
it thoroughly. If it really turns out to be too complex to understand, 
then I think we can start from the fundamental part by removing the 
exceptional path, if there are no objections from others.



I can't interpret what "ebit >= BITS_PER_LONG" means.
The reason you "continue;" is that all bits beyond are "0", isn't it?
Then, it would make sense to "continue;" when finding next "1" because
all bits beyond are "0". But how does it make sense to "continue;" when
finding next "0" despite all bits beyond are "0"?



Not the case actually. Please see this example:
1) xb_set_bit(10); // bit 10 is set, so an exceptional entry (i.e. 
[0:62]) is used

2) xb_clear_bit_range(66, 2048);
- One ida bitmap covers 1024 bits, so this clear is performed in 
two loop iterations: the first clears [66, 1024), the second clears [1024, 2048)
- The first iteration, while clearing [66, 1024), finds an 
exceptional entry (because only bit 10 is set, the 62-bit inline entry is 
enough to cover it). Another point we have to remember is that an exceptional 
entry implies that the rest of the bits [63, 1024) are all 0s.
- The starting bit 66 already exceeds the exceptional entry bit 
range [0, 62], and given that the rest of the bits are all 0s, it 
is time to just "continue", which moves on to the second range [1024, 2048)


I used xb_clear_bit_range() as the example; xb_find_next_bit() is 
fundamentally the same. Please let me know if anything still looks fuzzy.






  if (set)
  ret = find_next_bit(&tmp,
BITS_PER_LONG, ebit);
  else
  ret = find_next_zero_bit(&tmp,
BITS_PER_LONG,
   ebit);
  if (ret < BITS_PER_LONG)
  return ret - 2 + ida_start;
  } else if (bitmap) {
  if (set)
  ret = find_next_bit(bitmap->bitmap,
  IDA_BITMAP_BITS, bit);
  else
  ret = find_next_zero_bit(bitmap->bitmap,
IDA_BITMAP_BITS, bit);

"bit" may not be 0 for the first round and "bit" is always 0 afterwords.
But where is the guaranteed that "end" is a multiple of IDA_BITMAP_BITS ?
Please explain why it is correct to use IDA_BITMAP_BITS unconditionally
for the last round.


Something was missing here; it should be:

nbits = min(end - ida_start + 1, IDA_BITMAP_BITS - bit);
if (set)
ret = find_next_bit(bitmap->bitmap, nbits, bit);
else
ret = find_next_zero_bit(bitmap->bitmap,
   nbits, bit);
if (ret < nbits)
return ret + ida_start;



+/**
+ * xb_find_next_set_bit - find the next set bit in a range
+ * @xb: the xbitmap to search
+ * @start: the start of the range, inclusive
+ * @end: the end of the range, exclusive
+ *

Re: [PATCH] ptr_ring: add barriers

2017-12-13 Thread George Cherian

Hi David,

On 12/11/2017 09:23 PM, David Miller wrote:

From: "Michael S. Tsirkin" 
Date: Tue, 5 Dec 2017 21:29:37 +0200


Users of ptr_ring expect that it's safe to give the
data structure a pointer and have it be available
to consumers, but that actually requires an smp_wmb
or a stronger barrier.

In absence of such barriers and on architectures that reorder writes,
consumer might read an uninitialized value from an skb pointer stored
in the skb array.  This was observed causing crashes.

To fix, add memory barriers.  The barrier we use is a wmb, the
assumption being that producers do not need to read the value so we do
not need to order these reads.

Reported-by: George Cherian 
Suggested-by: Jason Wang 
Signed-off-by: Michael S. Tsirkin 


I asked for testing feedback and did not get it in a
reasonable amount of time.


The tests have now completed more than 48 hours without any failures.
I will leave them running uninterrupted for a longer time.
In case of any issue I will report it.

So I'm applying this as-is, and queueing it up for -stable.

Thank you.


Regards,
-George



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v19 1/7] xbitmap: Introduce xbitmap

2017-12-13 Thread Philippe Ombredanne
Matthew, Wei,

On Tue, Dec 12, 2017 at 12:55 PM, Wei Wang  wrote:
> From: Matthew Wilcox 
>
> The eXtensible Bitmap is a sparse bitmap representation which is
> efficient for set bits which tend to cluster.  It supports up to
> 'unsigned long' worth of bits, and this commit adds the bare bones --
> xb_set_bit(), xb_clear_bit() and xb_test_bit().
>
> Signed-off-by: Wei Wang 
> Cc: Matthew Wilcox 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Michael S. Tsirkin 
> Cc: Tetsuo Handa 

[...]

> --- /dev/null
> +++ b/include/linux/xbitmap.h
> @@ -0,0 +1,52 @@
> +/*
> + * eXtensible Bitmaps
> + * Copyright (c) 2017 Microsoft Corporation 
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of the
> + * License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * eXtensible Bitmaps provide an unlimited-size sparse bitmap facility.
> + * All bits are initially zero.
> + */

Have you considered using the new SPDX ids here instead? eg. this
would come out as a top line this way:

> +/* SPDX-License-Identifier: GPL-2.0+ */

Overall you get less boilerplate comment and more code, so this is a
win-win for everyone. This would nicely remove the legalese
boilerplate with the same effect, unless you are a legalese lover of
course. See Thomas's doc patches for extra details, and while you are at
it, if you could spread the word in your team and use it for all past,
present and future contributions, that would be much appreciated.

And as a side benefit to me, it will help me save on paper and ink
ribbons supplies for my dot matrix line printer each time I print the
source code of the whole kernel. ;)

Thanks for your consideration there.
-- 
Cordially
Philippe Ombredanne


[PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2017-12-13 Thread Vincent Legoll
No need to enter the submenu to disable all VIRTIO-related
config entries.

This makes it easier to disable all VIRTIO config options
without entering the submenu. It will also make the
enabled/disabled state visible from the outer menu.

This is only intended to change the menuconfig UI, not to change
the config dependencies.

Signed-off-by: Vincent Legoll 
---
 drivers/virtio/Kconfig | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index cff773f15b7e..d485a63a8233 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -5,7 +5,10 @@ config VIRTIO
  bus, such as CONFIG_VIRTIO_PCI, CONFIG_VIRTIO_MMIO, CONFIG_RPMSG
  or CONFIG_S390_GUEST.
 
-menu "Virtio drivers"
+menuconfig VIRTIO_MENU
+   bool "Virtio drivers"
+
+if VIRTIO_MENU
 
 config VIRTIO_PCI
tristate "PCI driver for virtio devices"
@@ -79,4 +82,4 @@ config VIRTIO_MMIO_CMDLINE_DEVICES
 
 If unsure, say 'N'.
 
-endmenu
+endif # VIRTIO_MENU
-- 
2.14.1



Re: [PATCH RFC 4/4] crypto: stm32: convert to the new crypto engine API

2017-12-13 Thread Fabien DESSENNE
Hi,


On 29/11/17 09:41, Corentin Labbe wrote:
> This patch convert the driver to the new crypto engine API.
>
> Signed-off-by: Corentin Labbe 

Tested-by: Fabien Dessenne 

> ---
>   drivers/crypto/stm32/stm32-hash.c | 22 +++---
>   1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/crypto/stm32/stm32-hash.c 
> b/drivers/crypto/stm32/stm32-hash.c
> index 4ca4a264a833..e3f9f7b04ce2 100644
> --- a/drivers/crypto/stm32/stm32-hash.c
> +++ b/drivers/crypto/stm32/stm32-hash.c
> @@ -122,6 +122,7 @@ enum stm32_hash_data_format {
>   #define HASH_DMA_THRESHOLD  50
>   
>   struct stm32_hash_ctx {
> + struct crypto_engine_reqctx enginectx;
>   struct stm32_hash_dev   *hdev;
>   unsigned long   flags;
>   
> @@ -811,7 +812,7 @@ static void stm32_hash_finish_req(struct ahash_request 
> *req, int err)
>   rctx->flags |= HASH_FLAGS_ERRORS;
>   }
>   
> - crypto_finalize_hash_request(hdev->engine, req, err);
> - crypto_finalize_request(hdev->engine, &req->base, err);
>   }
>   
>   static int stm32_hash_hw_init(struct stm32_hash_dev *hdev,
> @@ -828,15 +829,21 @@ static int stm32_hash_hw_init(struct stm32_hash_dev 
> *hdev,
>   return 0;
>   }
>   
> +static int stm32_hash_one_request(struct crypto_engine *engine,
> +   struct crypto_async_request *areq);
> +static int stm32_hash_prepare_req(struct crypto_engine *engine,
> +   struct crypto_async_request *areq);
> +
>   static int stm32_hash_handle_queue(struct stm32_hash_dev *hdev,
>  struct ahash_request *req)
>   {
> - return crypto_transfer_hash_request_to_engine(hdev->engine, req);
> - return crypto_transfer_request_to_engine(hdev->engine, &req->base);
>   }
>   
>   static int stm32_hash_prepare_req(struct crypto_engine *engine,
> -   struct ahash_request *req)
> +   struct crypto_async_request *areq)
>   {
> + struct ahash_request *req = ahash_request_cast(areq);
>   struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
>   struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
>   struct stm32_hash_request_ctx *rctx;
> @@ -855,8 +862,9 @@ static int stm32_hash_prepare_req(struct crypto_engine 
> *engine,
>   }
>   
>   static int stm32_hash_one_request(struct crypto_engine *engine,
> -   struct ahash_request *req)
> +   struct crypto_async_request *areq)
>   {
> + struct ahash_request *req = ahash_request_cast(areq);
>   struct stm32_hash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(req));
>   struct stm32_hash_dev *hdev = stm32_hash_find_dev(ctx);
>   struct stm32_hash_request_ctx *rctx;
> @@ -1033,6 +1041,9 @@ static int stm32_hash_cra_init_algs(struct crypto_tfm 
> *tfm,
>   if (algs_hmac_name)
>   ctx->flags |= HASH_FLAGS_HMAC;
>   
> + ctx->enginectx.op.do_one_request = stm32_hash_one_request;
> + ctx->enginectx.op.prepare_request = stm32_hash_prepare_req;
> + ctx->enginectx.op.unprepare_request = NULL;
>   return 0;
>   }
>   
> @@ -1493,9 +1504,6 @@ static int stm32_hash_probe(struct platform_device 
> *pdev)
>   goto err_engine;
>   }
>   
> - hdev->engine->prepare_hash_request = stm32_hash_prepare_req;
> - hdev->engine->hash_one_request = stm32_hash_one_request;
> -
>   ret = crypto_engine_start(hdev->engine);
>   if (ret)
>   goto err_engine_start;


Re: [RFC] virtio-net: help live migrate SR-IOV devices

2017-12-13 Thread Alexander Duyck
On Wed, Dec 6, 2017 at 11:28 PM, achiad shochat
 wrote:
> On 5 December 2017 at 21:20, Michael S. Tsirkin  wrote:
>> On Tue, Dec 05, 2017 at 11:59:17AM +0200, achiad shochat wrote:
>>> Then we'll have a single solution for both netvsc and virtio (and any
>>> other PV device).
>>> And we could handle the VF DMA dirt issue agnostically.
>>
>> For the record, I won't block patches adding this just to virtio
>> on the basis that they must be generic. It's not a lot
>> of code; implementation can come first, prettify later.
>
> It's not a lot of code either way.
> So I fail to understand why not to do it right from the beginning.
> For the record...

What isn't a lot of code? If you are talking about the DMA dirtying
then I would have to disagree. The big problem with the DMA is that we
have to mark a page as dirty and non-migratable as soon as it is
mapped for Rx DMA. It isn't until the driver has either unmapped the
page or the device has been disabled that we can then allow the page
to be migrated for being dirty. That ends up being the way we have to
support this if we don't have the bonding solution.

With the bonding solution we could look at doing a lightweight DMA
dirtying which would just require flagging pages as dirty after an
unmap or sync call is performed. However it requires that we shut down
the driver/device before we can complete the migration which means we
have to have the paravirtualized fail-over approach.

As far as indicating that the interfaces are meant to be enslaved I
wonder if we couldn't look at tweaking the PCI layout of the guest and
use that to indicate that a given set of interfaces are meant to be
bonded. For example the VFs are all meant to work as a part of a
multi-function device. What if we were to make virtio-net function 0
of a PCI/PCIe device, and then place any direct assigned VFs that are
meant to be a part of the bond in functions 1-7 of the device? Then it
isn't too far off from the model we have on the host where if the VF
goes away we would expect to see the traffic on the PF that is usually
occupying function 0 of a given device.


Re: [PATCH] ptr_ring: Add barriers to fix NULL-pointer exception

2017-12-13 Thread Cherian, George
Hi Michael,

_
From: Michael S. Tsirkin 
Sent: Wednesday, December 6, 2017 7:05 PM
Subject: Re: [PATCH] ptr_ring: Add barriers to fix NULL-pointer exception
To: Cherian, George 
Cc: , , 
, , , 



On Wed, Dec 06, 2017 at 09:57:41AM +, George Cherian wrote:
> While running a multiple-VM test case, with each VM running iperf
> traffic to the others, the following kernel NULL pointer exception
> was seen.
>
> The race appears when the tun driver instance of one VM calls skb_array_produce
> (from tun_net_xmit) and the destined VM's skb_array_consume
> (from tun_ring_recv) runs concurrently on another core. Due to
> this, sock_wfree gets called again from the tun_ring_recv context.
>
> The fix is to add write/read barrier calls to be sure that we get proper
> values in the tun_ring_recv context.
>
> Crash log
> [35321.580227] Unable to handle kernel NULL pointer dereference at virtual 
> address 0060
> [35321.596720] pgd = 809ee552f000
> [35321.603723] [0060] *pgd=009f514ac003, *pud=009f54f7c003, 
> *pmd=
> [35321.620588] Internal error: Oops: 9606 1 SMP
> [35321.630461] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
> nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
> nf_defrag_ipv4 xt_4
> [35321.728501] CPU: 114 PID: 5560 Comm: qemu-system-aar Not tainted 
> 4.11.8-4k-vhe-lse+ #3
> [35321.744510] Hardware name: Cavium Inc. Unknown/Unknown, BIOS 1.0 07/24/2017
> [35321.758602] task: 80bed7fca880 task.stack: 80beb5128000
> [35321.770600] PC is at sock_wfree+0x24/0x80
> [35321.778746] LR is at skb_release_head_state+0x68/0xf8
> [35321.788979] pc : [] lr : [] pstate: 
> 40400149
> [35321.803930] sp : 80beb512bc30
> [35321.810648] x29: 80beb512bc30 x28: 80bed7fca880
> [35321.821400] x27: 004e x26: 
> [35321.832156] x25: 000c x24: 
> [35321.842947] x23: 809ece3e4900 x22: 80beb512be00
> [35321.853729] x21: 09138000 x20: 0144
> [35321.864507] x19:  x18: 0014
> [35321.875276] x17: 9d9689a0 x16: 0828b3f8
> [35321.886048] x15: 504d7b00 x14: e90ab50c48680a08
> [35321.896824] x13: 010117773f52 x12: 1080d422c00e5db6
> [35321.907595] x11: 68c322bd3930cf7a x10: a8c0d07aa8c0ad16
> [35321.918352] x9 : 1da4ed90 x8 : b50c48680a080101
> [35321.929099] x7 : 17770c521080 x6 : 1d6c13f2
> [35321.939865] x5 : 1d6c13f2 x4 : 000e
> [35321.950619] x3 : 00085ff97d82 x2 : 
> [35321.961376] x1 : 08a772d8 x0 : 0500
> [35321.975193] Process qemu-system-aar (pid: 5560, stack limit = 
> 0x80beb5128000)
> [35321.990347] Stack: (0x80beb512bc30 to 0x80beb512c000)
> [35322.001982] bc20: 80beb512bc50 08a79238
> [35322.017817] bc40: 809e8fd7be00 004e 80beb512bc70 
> 08a79488
> [35322.033651] bc60: 809e8fd7be00 0881307c 80beb512bc90 
> 08a79678
> [35322.049489] bc80: 809e8fd7be00 80beb512be00 80beb512bcb0 
> 08812f24
> [35322.065321] bca0: 809e8fd7be00 004e 80beb512bd50 
> 088133f0
> [35322.081165] bcc0: 809ece3e4900 00011000 80beb512bdd8 
> 80beb512be00
> [35322.097001] bce0: 1d6c13a4 0015 0124 
> 003f
> [35322.112866] bd00: 08bc2000 0847b5ac 0002 
> 80be0008
> [35322.128701] bd20: 00220001 80bed7fc0010 08100c38 
> 
> [35322.144539] bd40:  00040b08 80beb512bd80 
> 08288f80
> [35322.160395] bd60: 09138000 809ee7cd3500 00011000 
> 80beb512beb0
> [35322.176255] bd80: 80beb512be30 0828a224 00011000 
> 809ee7cd3500
> [35322.192109] bda0: 1d6c13a4 80beb512beb0 00011000 
> 
> [35322.207974] bdc0:  1d6c13a4 00011000 
> 809ee7cd3500
> [35322.223822] bde0: 004e   
> 
> [35322.239661] be00: 80be 004e 00010fb2 
> 80beb512bdc8
> [35322.255519] be20: 0001 00040b08 80beb512be70 
> 0828b464
> [35322.271392] be40: 09138000 809ee7cd3501 809ee7cd3500 
> 1d6c13a4
> [35322.287255] be60: 00011000 0015  
> 080833f0
> [35322.303090] be80:  80bef0071000  
> 9d9689cc
> [35322.318951] bea0: 8000 80bef0071000 004e 
> 00040b08
> 

Re: [PATCH RFC 1/4] crypto: engine - Permit to enqueue all async requests

2017-12-13 Thread Fabien DESSENNE


On 29/11/17 09:41, Corentin Labbe wrote:
> The crypto engine could previously only enqueue hash and ablkcipher requests.
> This patch permits it to enqueue any type of crypto_async_request.
>
> Signed-off-by: Corentin Labbe 
> ---
>   crypto/crypto_engine.c  | 188 
> +++-
>   include/crypto/engine.h |  46 +---
>   2 files changed, 60 insertions(+), 174 deletions(-)
>
> diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
> index 61e7c4e02fd2..f7c4c4c1f41b 100644
> --- a/crypto/crypto_engine.c
> +++ b/crypto/crypto_engine.c
> @@ -34,11 +34,10 @@ static void crypto_pump_requests(struct crypto_engine 
> *engine,
>bool in_kthread)
>   {
>   struct crypto_async_request *async_req, *backlog;
> - struct ahash_request *hreq;
> - struct ablkcipher_request *breq;
>   unsigned long flags;
>   bool was_busy = false;
> - int ret, rtype;
> + int ret;
> + struct crypto_engine_reqctx *enginectx;
>   
>   spin_lock_irqsave(&engine->queue_lock, flags);
>   
> @@ -94,7 +93,6 @@ static void crypto_pump_requests(struct crypto_engine 
> *engine,
>   
>   spin_unlock_irqrestore(&engine->queue_lock, flags);
>   
> - rtype = crypto_tfm_alg_type(engine->cur_req->tfm);
>   /* Until here we get the request need to be encrypted successfully */
>   if (!was_busy && engine->prepare_crypt_hardware) {
>   ret = engine->prepare_crypt_hardware(engine);
> @@ -104,57 +102,31 @@ static void crypto_pump_requests(struct crypto_engine 
> *engine,
>   }
>   }
>   
> - switch (rtype) {
> - case CRYPTO_ALG_TYPE_AHASH:
> - hreq = ahash_request_cast(engine->cur_req);
> - if (engine->prepare_hash_request) {
> - ret = engine->prepare_hash_request(engine, hreq);
> - if (ret) {
> - dev_err(engine->dev, "failed to prepare 
> request: %d\n",
> - ret);
> - goto req_err;
> - }
> - engine->cur_req_prepared = true;
> - }
> - ret = engine->hash_one_request(engine, hreq);
> - if (ret) {
> - dev_err(engine->dev, "failed to hash one request from 
> queue\n");
> - goto req_err;
> - }
> - return;
> - case CRYPTO_ALG_TYPE_ABLKCIPHER:
> - breq = ablkcipher_request_cast(engine->cur_req);
> - if (engine->prepare_cipher_request) {
> - ret = engine->prepare_cipher_request(engine, breq);
> - if (ret) {
> - dev_err(engine->dev, "failed to prepare 
> request: %d\n",
> - ret);
> - goto req_err;
> - }
> - engine->cur_req_prepared = true;
> - }
> - ret = engine->cipher_one_request(engine, breq);
> + enginectx = crypto_tfm_ctx(async_req->tfm);
> +
> + if (enginectx->op.prepare_request) {
> + ret = enginectx->op.prepare_request(engine, async_req);
>   if (ret) {
> - dev_err(engine->dev, "failed to cipher one request from 
> queue\n");
> + dev_err(engine->dev, "failed to prepare request: %d\n",
> + ret);
>   goto req_err;
>   }
> - return;
> - default:
> - dev_err(engine->dev, "failed to prepare request of unknown 
> type\n");
> - return;
> + engine->cur_req_prepared = true;
> + }
> + if (!enginectx->op.do_one_request) {
> + dev_err(engine->dev, "failed to do request\n");
> + ret = -EINVAL;
> + goto req_err;
> + }
> + ret = enginectx->op.do_one_request(engine, async_req);
> + if (ret) {
> + dev_err(engine->dev, "failed to hash one request from queue\n");
> + goto req_err;
>   }
> + return;
>   
>   req_err:
> - switch (rtype) {
> - case CRYPTO_ALG_TYPE_AHASH:
> - hreq = ahash_request_cast(engine->cur_req);
> - crypto_finalize_hash_request(engine, hreq, ret);
> - break;
> - case CRYPTO_ALG_TYPE_ABLKCIPHER:
> - breq = ablkcipher_request_cast(engine->cur_req);
> - crypto_finalize_cipher_request(engine, breq, ret);
> - break;
> - }
> + crypto_finalize_request(engine, async_req, ret);
>   return;
>   
>   out:
> @@ -170,59 +142,16 @@ static void crypto_pump_work(struct kthread_work *work)
>   }
>   
>   /**
> - * crypto_transfer_cipher_request - transfer the new request into the
> - * enginequeue
> + * crypto_transfer_request - transfer the new request into the engine queue
>* @engine: the hardware engine
>* @req: the 

Re: [RFC] virtio-net: help live migrate SR-IOV devices

2017-12-13 Thread Jakub Kicinski
On Tue, 5 Dec 2017 11:59:17 +0200, achiad shochat wrote:
>  I second Jacob - having a netdev of one device driver enslave a netdev
>  of another device driver is an awkward a-symmetric model.
>  Regardless of whether they share the same backend device.
>  Only I am not sure the Linux Bond is the right choice.
>  e.g one may well want to use the virtio device also when the
>  pass-through device is available, e.g for multicasts, east-west
>  traffic, etc.
>  I'm not sure the Linux Bond fits that functionality.
>  And, as I hear in this thread, it is hard to make it work out of the box.
>  So I think the right thing would be to write a new dedicated module
>  for this purpose.  
> >
> > This part I can sort of agree with. What if we were to look at
> > providing a way to somehow advertise that the two devices were meant
> > to be bonded for virtualization purposes? For now let's call it a
> > "virt-bond". Basically we could look at providing a means for virtio
> > and VF drivers to advertise that they want this sort of bond. Then it
> > would just be a matter of providing some sort of side channel to
> > indicate where you want things like multicast/broadcast/east-west
> > traffic to go.
> 
> I like this approach.

+1 on a separate driver; just enslaving devices to virtio may break
existing setups.  If people are bonding from user space today and they
update their kernel, it may surprise them how things get auto-mangled.

Is what Alex is suggesting a separate PV device that says "I would
like to be a bond of those two interfaces"?  That would make the HV
intent explicit and kernel decisions more understandable.


Re: [PATCH] ptr_ring: add barriers

2017-12-13 Thread George Cherian

Hi Michael,


On 12/06/2017 12:59 AM, Michael S. Tsirkin wrote:

Users of ptr_ring expect that it's safe to give the
data structure a pointer and have it be available
to consumers, but that actually requires an smp_wmb
or a stronger barrier.

This is not the exact situation we are seeing.
Let me try to explain the situation

This is seen on the ARM64 platform.
1) tun_net_xmit calls skb_array_produce, which pushes the skb to the 
ptr_ring; this push is protected by a producer_lock.


2) Prior to this call, tun_net_xmit calls skb_orphan, which invokes the 
skb->destructor and sets skb->destructor and skb->sk to NULL.


2.a) These two writes get reordered.

3) At the same time, the receive side (tun_ring_recv), which runs on 
another core, calls skb_array_consume, which pulls the skb from the 
ptr_ring; this pull is protected by a consumer_lock.


4) Eventually the skb->destructor (sock_wfree) is called with stale values.

Also note that this issue is reproducible only over a long run and doesn't 
happen immediately after the launch of multiple VMs (in fact the 
particular test case launches 56 VMs which run iperf back and forth)




In absence of such barriers and on architectures that reorder writes,
consumer might read an uninitialized value from an skb pointer stored
in the skb array.  This was observed causing crashes.

To fix, add memory barriers.  The barrier we use is a wmb, the
assumption being that producers do not need to read the value so we do
not need to order these reads.
It is not the producer reading the value that is the problem; the consumer 
reads a stale value. So we need to have a strict rmb in place.




Reported-by: George Cherian 
Suggested-by: Jason Wang 
Signed-off-by: Michael S. Tsirkin 
---

George, could you pls report whether this patch fixes
the issue for you?

This seems to be needed in stable as well.




  include/linux/ptr_ring.h | 9 +
  1 file changed, 9 insertions(+)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 37b4bb2..6866df4 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -101,12 +101,18 @@ static inline bool ptr_ring_full_bh(struct ptr_ring *r)
  
  /* Note: callers invoking this in a loop must use a compiler barrier,

   * for example cpu_relax(). Callers must hold producer_lock.
+ * Callers are responsible for making sure pointer that is being queued
+ * points to a valid data.
   */
  static inline int __ptr_ring_produce(struct ptr_ring *r, void *ptr)
  {
if (unlikely(!r->size) || r->queue[r->producer])
return -ENOSPC;
  
+	/* Make sure the pointer we are storing points to a valid data. */

+   /* Pairs with smp_read_barrier_depends in __ptr_ring_consume. */
+   smp_wmb();
+
r->queue[r->producer++] = ptr;
if (unlikely(r->producer >= r->size))
r->producer = 0;
@@ -275,6 +281,9 @@ static inline void *__ptr_ring_consume(struct ptr_ring *r)
if (ptr)
__ptr_ring_discard_one(r);
  
+	/* Make sure anyone accessing data through the pointer is up to date. */

+   /* Pairs with smp_wmb in __ptr_ring_produce. */
+   smp_read_barrier_depends();
return ptr;
  }
  




[PATCH] ptr_ring: Add barriers to fix NULL-pointer exception

2017-12-13 Thread George Cherian
While running a multiple-VM test case, with each VM running iperf
traffic to the others, the following kernel NULL pointer exception
was seen.

The race appears when the tun driver instance of one VM calls skb_array_produce
(from tun_net_xmit) and the destined VM's skb_array_consume
(from tun_ring_recv) runs concurrently on another core. Due to
this, sock_wfree gets called again from the tun_ring_recv context.

The fix is to add write/read barrier calls to be sure that we get proper
values in the tun_ring_recv context.

Crash log
[35321.580227] Unable to handle kernel NULL pointer dereference at virtual 
address 0060
[35321.596720] pgd = 809ee552f000
[35321.603723] [0060] *pgd=009f514ac003, *pud=009f54f7c003, 
*pmd=
[35321.620588] Internal error: Oops: 9606 1 SMP
[35321.630461] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_4
[35321.728501] CPU: 114 PID: 5560 Comm: qemu-system-aar Not tainted 
4.11.8-4k-vhe-lse+ #3
[35321.744510] Hardware name: Cavium Inc. Unknown/Unknown, BIOS 1.0 07/24/2017
[35321.758602] task: 80bed7fca880 task.stack: 80beb5128000
[35321.770600] PC is at sock_wfree+0x24/0x80
[35321.778746] LR is at skb_release_head_state+0x68/0xf8
[35321.788979] pc : [] lr : [] pstate: 
40400149
[35321.803930] sp : 80beb512bc30
[35321.810648] x29: 80beb512bc30 x28: 80bed7fca880
[35321.821400] x27: 004e x26: 
[35321.832156] x25: 000c x24: 
[35321.842947] x23: 809ece3e4900 x22: 80beb512be00
[35321.853729] x21: 09138000 x20: 0144
[35321.864507] x19:  x18: 0014
[35321.875276] x17: 9d9689a0 x16: 0828b3f8
[35321.886048] x15: 504d7b00 x14: e90ab50c48680a08
[35321.896824] x13: 010117773f52 x12: 1080d422c00e5db6
[35321.907595] x11: 68c322bd3930cf7a x10: a8c0d07aa8c0ad16
[35321.918352] x9 : 1da4ed90 x8 : b50c48680a080101
[35321.929099] x7 : 17770c521080 x6 : 1d6c13f2
[35321.939865] x5 : 1d6c13f2 x4 : 000e
[35321.950619] x3 : 00085ff97d82 x2 : 
[35321.961376] x1 : 08a772d8 x0 : 0500
[35321.975193] Process qemu-system-aar (pid: 5560, stack limit = 
0x80beb5128000)
[35321.990347] Stack: (0x80beb512bc30 to 0x80beb512c000)
[35322.001982] bc20: 80beb512bc50 08a79238
[35322.017817] bc40: 809e8fd7be00 004e 80beb512bc70 
08a79488
[35322.033651] bc60: 809e8fd7be00 0881307c 80beb512bc90 
08a79678
[35322.049489] bc80: 809e8fd7be00 80beb512be00 80beb512bcb0 
08812f24
[35322.065321] bca0: 809e8fd7be00 004e 80beb512bd50 
088133f0
[35322.081165] bcc0: 809ece3e4900 00011000 80beb512bdd8 
80beb512be00
[35322.097001] bce0: 1d6c13a4 0015 0124 
003f
[35322.112866] bd00: 08bc2000 0847b5ac 0002 
80be0008
[35322.128701] bd20: 00220001 80bed7fc0010 08100c38 

[35322.144539] bd40:  00040b08 80beb512bd80 
08288f80
[35322.160395] bd60: 09138000 809ee7cd3500 00011000 
80beb512beb0
[35322.176255] bd80: 80beb512be30 0828a224 00011000 
809ee7cd3500
[35322.192109] bda0: 1d6c13a4 80beb512beb0 00011000 

[35322.207974] bdc0:  1d6c13a4 00011000 
809ee7cd3500
[35322.223822] bde0: 004e   

[35322.239661] be00: 80be 004e 00010fb2 
80beb512bdc8
[35322.255519] be20: 0001 00040b08 80beb512be70 
0828b464
[35322.271392] be40: 09138000 809ee7cd3501 809ee7cd3500 
1d6c13a4
[35322.287255] be60: 00011000 0015  
080833f0
[35322.303090] be80:  80bef0071000  
9d9689cc
[35322.318951] bea0: 8000 80bef0071000 004e 
00040b08
[35322.334771] bec0: 000e 1d6c13a4 00011000 
9cc89108
[35322.350640] bee0: 0002 9cc89000 9cc896f0 

[35322.366500] bf00: 003f 1da4ed90 a8c0d07aa8c0ad16 
68c322bd3930cf7a
[35322.382358] bf20: 1080d422c00e5db6 010117773f52 e90ab50c48680a08 
504d7b00
[35322.398209] bf40:  9d9689a0 0014 
0030
[35322.414063] bf60: 1d6c13a4 1d6c0db0 1d6d1db0 
006fbbc8
[35322.429901] bf80: 00011000 1d6ab3e8 1d6ab3a4 
007ee4c8
[35322.445751] bfa0:  

Re: [RFC] virtio-net: help live migrate SR-IOV devices

2017-12-13 Thread Alexander Duyck
On Mon, Dec 4, 2017 at 1:51 AM, achiad shochat
 wrote:
> On 3 December 2017 at 19:35, Stephen Hemminger
>  wrote:
>> On Sun, 3 Dec 2017 11:14:37 +0200
>> achiad shochat  wrote:
>>
>>> On 3 December 2017 at 07:05, Michael S. Tsirkin  wrote:
>>> > On Fri, Dec 01, 2017 at 12:08:59PM -0800, Shannon Nelson wrote:
>>> >> On 11/30/2017 6:11 AM, Michael S. Tsirkin wrote:
>>> >> > On Thu, Nov 30, 2017 at 10:08:45AM +0200, achiad shochat wrote:
>>> >> > > Re. problem #2:
>>> >> > > Indeed the best way to address it seems to be to enslave the VF 
>>> >> > > driver
>>> >> > > netdev under a persistent anchor netdev.
>>> >> > > And it's indeed desired to allow (but not enforce) PV netdev and VF
>>> >> > > netdev to work in conjunction.
>>> >> > > And it's indeed desired that this enslavement logic work out-of-the 
>>> >> > > box.
>>> >> > > But in case of PV+VF some configurable policies must be in place (and
>>> >> > > they'd better be generic rather than differ per PV technology).
>>> >> > > For example - based on which characteristics should the PV+VF 
>>> >> > > coupling
>>> >> > > be done? netvsc uses MAC address, but that might not always be the
>>> >> > > desire.
>>> >> >
>>> >> > It's a policy but not guest userspace policy.
>>> >> >
>>> >> > The hypervisor certainly knows.
>>> >> >
>>> >> > Are you concerned that someone might want to create two devices with 
>>> >> > the
>>> >> > same MAC for an unrelated reason?  If so, hypervisor could easily set a
>>> >> > flag in the virtio device to say "this is a backup, use MAC to find
>>> >> > another device".
>>> >>
>>> >> This is something I was going to suggest: a flag or other configuration 
>>> >> on
>>> >> the virtio device to help control how this new feature is used.  I can
>>> >> imagine this might be useful to control from either the hypervisor side 
>>> >> or
>>> >> the VM side.
>>> >>
>>> >> The hypervisor might want to (1) disable it (force it off), (2) enable it
>>> >> for VM choice, or (3) force it on for the VM.  In case (2), the VM might 
>>> >> be
>>> >> able to chose whether it wants to make use of the feature, or stick with 
>>> >> the
>>> >> bonding solution.
>>> >>
>>> >> Either way, the kernel is making a feature available, and the user (VM or
>>> >> hypervisor) is able to control it by selecting the feature based on the
>>> >> policy desired.
>>> >>
>>> >> sln
>>> >
>>> > I'm not sure what's the feature that is available here.
>>> >
>>> > I saw this as a flag that says "this device shares backend with another
>>> > network device which can be found using MAC, and that backend should be
>>> > preferred".  kernel then forces configuration which uses that other
>>> > backend - as long as it exists.
>>> >
>>> > However, please Cc virtio-dev mailing list if we are doing this since
>>> > this is a spec extension.
>>> >
>>> > --
>>> > MST
>>>
>>>
>>> Can someone please explain why assume a virtio device is there at all??
>>> I specified a case where there isn't any.

Migrating without any virtual device is going to be extremely
challenging, especially in any kind of virtualization setup where the
hosts are not homogeneous. By providing a virtio interface you can
guarantee that at least 1 network interface is available on any given
host, and then fail over to that as the least common denominator for
any migration.

>>> I second Jacob - having a netdev of one device driver enslave a netdev
>>> of another device driver is an awkward, asymmetric model.
>>> Regardless of whether they share the same backend device.
>>> Only I am not sure the Linux Bond is the right choice.
>>> e.g. one may well want to use the virtio device also when the
>>> pass-through device is available, e.g. for multicasts, east-west
>>> traffic, etc.
>>> I'm not sure the Linux Bond fits that functionality.
>>> And, as I hear in this thread, it is hard to make it work out of the box.
>>> So I think the right thing would be to write a new dedicated module
>>> for this purpose.

This part I can sort of agree with. What if we were to look at
providing a way to somehow advertise that the two devices were meant
to be bonded for virtualization purposes? For now let's call it a
"virt-bond". Basically we could look at providing a means for virtio
and VF drivers to advertise that they want this sort of bond. Then it
would just be a matter of providing some sort of side channel to
indicate where you want things like multicast/broadcast/east-west
traffic to go.

>>> Re policy -
>>> Indeed the HV can request a policy from the guest but that's not a
>>> claim for the virtio device enslaving the pass-through device.
>>> Any policy can be queried by the upper enslaving device.
>>>
>>> Bottom line - I do not see a single reason to have the virtio netdev
>>> (nor netvsc or any other PV netdev) enslave another netdev by itself.
>>> If we'd do it right with netvsc from the beginning we wouldn't need
>>> this 

Re: [PATCHv2] virtio_mmio: fix devm cleanup

2017-12-13 Thread Mark Rutland
On Tue, Dec 12, 2017 at 06:02:23PM +0100, Cornelia Huck wrote:
> On Tue, 12 Dec 2017 13:45:50 +
> Mark Rutland  wrote:
> 
> > Recent rework of the virtio_mmio probe/remove paths balanced a
> > devm_ioremap() with an iounmap() rather than its devm variant. This ends
> > up corrupting the devm datastructures, and results in the following
> > boot-time splat on arm64 under QEMU 2.9.0:
> > 
> > [3.450397] [ cut here ]
> > [3.453822] Trying to vfree() nonexistent vm area (c05b4844)
> > [3.460534] WARNING: CPU: 1 PID: 1 at mm/vmalloc.c:1525 __vunmap+0x1b8/0x220
> > [3.475898] Kernel panic - not syncing: panic_on_warn set ...
> > [3.475898]
> > [3.493933] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc3 #1
> > [3.513109] Hardware name: linux,dummy-virt (DT)
> > [3.525382] Call trace:
> > [3.531683]  dump_backtrace+0x0/0x368
> > [3.543921]  show_stack+0x20/0x30
> > [3.547767]  dump_stack+0x108/0x164
> > [3.559584]  panic+0x25c/0x51c
> > [3.569184]  __warn+0x29c/0x31c
> > [3.576023]  report_bug+0x1d4/0x290
> > [3.586069]  bug_handler.part.2+0x40/0x100
> > [3.597820]  bug_handler+0x4c/0x88
> > [3.608400]  brk_handler+0x11c/0x218
> > [3.613430]  do_debug_exception+0xe8/0x318
> > [3.627370]  el1_dbg+0x18/0x78
> > [3.634037]  __vunmap+0x1b8/0x220
> > [3.648747]  vunmap+0x6c/0xc0
> > [3.653864]  __iounmap+0x44/0x58
> > [3.659771]  devm_ioremap_release+0x34/0x68
> > [3.672983]  release_nodes+0x404/0x880
> > [3.683543]  devres_release_all+0x6c/0xe8
> > [3.695692]  driver_probe_device+0x250/0x828
> > [3.706187]  __driver_attach+0x190/0x210
> > [3.717645]  bus_for_each_dev+0x14c/0x1f0
> > [3.728633]  driver_attach+0x48/0x78
> > [3.740249]  bus_add_driver+0x26c/0x5b8
> > [3.752248]  driver_register+0x16c/0x398
> > [3.757211]  __platform_driver_register+0xd8/0x128
> > [3.770860]  virtio_mmio_init+0x1c/0x24
> > [3.782671]  do_one_initcall+0xe0/0x398
> > [3.791890]  kernel_init_freeable+0x594/0x660
> > [3.798514]  kernel_init+0x18/0x190
> > [3.810220]  ret_from_fork+0x10/0x18
> > 
> > To fix this, we can simply rip out the explicit cleanup that the devm
> > infrastructure will do for us when our probe function returns an error
> > code, or when our remove function returns.
> > 
> > We only need to ensure that we call put_device() if a call to
> > register_virtio_device() fails in the probe path.
> > 
> > Signed-off-by: Mark Rutland 
> > Fixes: 7eb781b1bbb7136f ("virtio_mmio: add cleanup for virtio_mmio_probe")
> > Fixes: 25f32223bce5c580 ("virtio_mmio: add cleanup for virtio_mmio_remove")
> > Cc: Cornelia Huck 
> > Cc: Michael S. Tsirkin 
> > Cc: weiping zhang 
> > Cc: virtualization@lists.linux-foundation.org
> > ---
> >  drivers/virtio/virtio_mmio.c | 43 +--
> >  1 file changed, 9 insertions(+), 34 deletions(-)
> 
> In the hope that I have grokked the devm_* interface by now,
> 
> Reviewed-by: Cornelia Huck 

Thanks!

Michael, could you please queue this as a fix for v4.15?

This regressed arm64 VMs booting between v4.15-rc1 and v4.15-rc2,
impacting our automated regression testing, and I'd very much like to
get back to testing pure mainline rather than mainline + local fixes.

Thanks,
Mark.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v19 3/7] xbitmap: add more operations

2017-12-13 Thread Wei Wang

On 12/12/2017 09:20 PM, Tetsuo Handa wrote:

Wei Wang wrote:

+void xb_clear_bit_range(struct xb *xb, unsigned long start, unsigned long end)
+{
+	struct radix_tree_root *root = &xb->xbrt;
+	struct radix_tree_node *node;
+	void **slot;
+	struct ida_bitmap *bitmap;
+	unsigned int nbits;
+
+	for (; start < end; start = (start | (IDA_BITMAP_BITS - 1)) + 1) {
+		unsigned long index = start / IDA_BITMAP_BITS;
+		unsigned long bit = start % IDA_BITMAP_BITS;
+
+		bitmap = __radix_tree_lookup(root, index, &node, &slot);
+		if (radix_tree_exception(bitmap)) {
+			unsigned long ebit = bit + 2;
+			unsigned long tmp = (unsigned long)bitmap;
+
+			nbits = min(end - start + 1, BITS_PER_LONG - ebit);
+
+			if (ebit >= BITS_PER_LONG)
+				continue;

What happens if we hit this "continue;" when "index == ULONG_MAX / IDA_BITMAP_BITS"?


Thanks. I also improved the test case for this. I plan to change the
implementation a little bit to avoid such an overflow (it has passed the
test cases that I have; just posting it here for another set of eyes):


{
...
	unsigned long idx = start / IDA_BITMAP_BITS;
	unsigned long bit = start % IDA_BITMAP_BITS;
	unsigned long idx_end = end / IDA_BITMAP_BITS;
	unsigned long ret;

	for (idx = start / IDA_BITMAP_BITS; idx <= idx_end; idx++) {
		unsigned long ida_start = idx * IDA_BITMAP_BITS;

		bitmap = __radix_tree_lookup(root, idx, &node, &slot);
		if (radix_tree_exception(bitmap)) {
			unsigned long tmp = (unsigned long)bitmap;
			unsigned long ebit = bit + 2;

			if (ebit >= BITS_PER_LONG)
				continue;
			if (set)
				ret = find_next_bit(&tmp, BITS_PER_LONG, ebit);
			else
				ret = find_next_zero_bit(&tmp, BITS_PER_LONG,
							 ebit);
			if (ret < BITS_PER_LONG)
				return ret - 2 + ida_start;
		} else if (bitmap) {
			if (set)
				ret = find_next_bit(bitmap->bitmap,
						    IDA_BITMAP_BITS, bit);
			else
				ret = find_next_zero_bit(bitmap->bitmap,
							 IDA_BITMAP_BITS, bit);
			if (ret < IDA_BITMAP_BITS)
				return ret + ida_start;
		} else if (!bitmap && !set) {
			return bit + IDA_BITMAP_BITS * idx;
		}
		bit = 0;
	}

	return end;
}




Can you eliminate the exception path and fold all xbitmap patches into one, and
post only one xbitmap patch without the virtio-balloon changes? If the exception
path is valuable, you can add it after the minimum version is merged. This
series is too difficult for me to check for corner cases.


That exception path is claimed to save memory, and I don't have a strong 
reason to remove that part.

Matthew, could we get your feedback on this?






+/**
+ * xb_find_next_set_bit - find the next set bit in a range
+ * @xb: the xbitmap to search
+ * @start: the start of the range, inclusive
+ * @end: the end of the range, exclusive
+ *
+ * Returns: the index of the found bit, or @end + 1 if no such bit is found.
+ */
+unsigned long xb_find_next_set_bit(struct xb *xb, unsigned long start,
+  unsigned long end)
+{
+   return xb_find_next_bit(xb, start, end, 1);
+}

Won't "exclusive" lose the ability to handle ULONG_MAX? Since this is a
library module, missing ability to handle ULONG_MAX sounds like an omission.
Shouldn't we pass (or return) whether "found or not" flag (e.g. strtoul() in
C library function)?

   bool xb_find_next_set_bit(struct xb *xb, unsigned long start, unsigned long end, unsigned long *result);
   unsigned long xb_find_next_set_bit(struct xb *xb, unsigned long start, unsigned long end, bool *found);


Yes, ULONG_MAX needs to be tested by xb_test_bit(). Compared to checking 
the return value, would it be the same to let the caller check for the 
ULONG_MAX boundary?


Best,
Wei




[bpf-next V1-RFC PATCH 11/14] virtio_net: setup xdp_rxq_info

2017-12-13 Thread Jesper Dangaard Brouer
The virtio_net driver doesn't dynamically change the RX-ring queue
layout and backing pages, but instead rejects XDP setup if the
conditions for XDP are not met.  Thus, the xdp_rxq_info also remains
fairly static.  This allows us to simply add the init+reg/unreg calls
to the net_device open/close functions.

Driver hook points for xdp_rxq_info:
 * init+reg: virtnet_open
 * unreg   : virtnet_close

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Jesper Dangaard Brouer 
---
 drivers/net/virtio_net.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 19a985ef9104..9792183befbf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include <net/xdp.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -115,6 +116,8 @@ struct receive_queue {
 
/* Name of this receive queue: input.$index */
char name[40];
+
+   struct xdp_rxq_info xdp_rxq;
 };
 
 struct virtnet_info {
@@ -556,6 +559,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
xdp.data = xdp.data_hard_start + xdp_headroom;
	xdp_set_data_meta_invalid(&xdp);
	xdp.data_end = xdp.data + len;
+	xdp.rxq = &rq->xdp_rxq;
	orig_data = xdp.data;
	act = bpf_prog_run_xdp(xdp_prog, &xdp);
 
@@ -1229,6 +1233,13 @@ static int virtnet_open(struct net_device *dev)
/* Make sure we have some buffers: if oom use wq. */
	if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
		schedule_delayed_work(&vi->refill, 0);
+
+   /* XDP RX queue info */
+   xdp_rxq_info_init(&vi->rq[i].xdp_rxq);
+   vi->rq[i].xdp_rxq.dev = dev;
+   vi->rq[i].xdp_rxq.queue_index = i;
+   xdp_rxq_info_reg(&vi->rq[i].xdp_rxq);
+
	virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
	virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
}
@@ -1557,6 +1568,7 @@ static int virtnet_close(struct net_device *dev)
	cancel_delayed_work_sync(&vi->refill);
 
for (i = 0; i < vi->max_queue_pairs; i++) {
+   xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
	napi_disable(&vi->rq[i].napi);
	virtnet_napi_tx_disable(&vi->sq[i].napi);
}



Re: [PATCH v2 2/3] virtio: use put_device instead of kfree

2017-12-13 Thread Cornelia Huck
On Tue, 12 Dec 2017 21:24:33 +0800
weiping zhang  wrote:

> As mentioned at drivers/base/core.c:
> /*
>  * NOTE: _Never_ directly free @dev after calling this function, even
>  * if it returned an error! Always use put_device() to give up the
>  * reference initialized in this function instead.
>  */
> so we don't free vp_vdev until vp_vdev.dev.release is called.
> 
> Signed-off-by: weiping zhang 
> ---
>  drivers/misc/mic/vop/vop_main.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)

Reviewed-by: Cornelia Huck 