Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Chao Leng




On 2021/2/1 18:45, Hannes Reinecke wrote:

On 2/1/21 10:40 AM, Chao Leng wrote:



On 2021/2/1 16:57, Hannes Reinecke wrote:

On 2/1/21 9:47 AM, Chao Leng wrote:



On 2021/2/1 15:29, Hannes Reinecke wrote:[ .. ]

Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the current 
(and my fixed) versions is?
I'm still not clear where the problem is once we applied both patches.

For example assume the list has three path, and all path is not 
NVME_ANA_OPTIMIZED:
head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;


And this is where I have issues with.
Where does 'old' come from?
Clearly it was part of the list at one point; so what happened to it?

I explained this earlier.
In nvme_ns_remove, there is a hole between list_del_rcu and
nvme_mpath_clear_current_path. If head->current_path is the "old", and
the "old" is removing. The "old" is already removed from the list by
list_del_rcu, but head->current_path is not clear to NULL by
nvme_mpath_clear_current_path.
Find path is race with nvme_ns_remove, use the "old" pass to
nvme_round_robin_path to find path.


Ah. So this should be better:

@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
  static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
     struct nvme_ns *ns)
  {
-   ns = list_next_or_null_rcu(>list, >siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
+   if (ns && !test_bit(NVME_NS_REMOVING, >flags)) {
+   ns = list_next_or_null_rcu(>list, >siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }
     return list_first_or_null_rcu(>list, struct nvme_ns, siblings);
  }

The 'NVME_NS_REMOVING' bit is set before list_del_rcu() is called, so it should 
guard against the issue you mentioned.

Looks useless, it is still infinite loop. You can check the workflow for the 
scenario I mentioned.


Cheers,

Hannes


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Hannes Reinecke

On 2/1/21 10:40 AM, Chao Leng wrote:



On 2021/2/1 16:57, Hannes Reinecke wrote:

On 2/1/21 9:47 AM, Chao Leng wrote:



On 2021/2/1 15:29, Hannes Reinecke wrote:[ .. ]

Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to 
the current (and my fixed) versions is?

I'm still not clear where the problem is once we applied both patches.
For example assume the list has three path, and all path is not 
NVME_ANA_OPTIMIZED:

head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;


And this is where I have issues with.
Where does 'old' come from?
Clearly it was part of the list at one point; so what happened to it?

I explained this earlier.
In nvme_ns_remove, there is a hole between list_del_rcu and
nvme_mpath_clear_current_path. If head->current_path is the "old", and
the "old" is removing. The "old" is already removed from the list by
list_del_rcu, but head->current_path is not clear to NULL by
nvme_mpath_clear_current_path.
Find path is race with nvme_ns_remove, use the "old" pass to
nvme_round_robin_path to find path.


Ah. So this should be better:

@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)

 static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
struct nvme_ns *ns)
 {
-   ns = list_next_or_null_rcu(>list, >siblings, struct 
nvme_ns,

-   siblings);
-   if (ns)
-   return ns;
+   if (ns && !test_bit(NVME_NS_REMOVING, >flags)) {
+   ns = list_next_or_null_rcu(>list, >siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }
return list_first_or_null_rcu(>list, struct nvme_ns, 
siblings);

 }

The 'NVME_NS_REMOVING' bit is set before list_del_rcu() is called, so it 
should guard against the issue you mentioned.


Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Chao Leng




On 2021/2/1 16:57, Hannes Reinecke wrote:

On 2/1/21 9:47 AM, Chao Leng wrote:



On 2021/2/1 15:29, Hannes Reinecke wrote:[ .. ]

Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the current 
(and my fixed) versions is?
I'm still not clear where the problem is once we applied both patches.

For example assume the list has three path, and all path is not 
NVME_ANA_OPTIMIZED:
head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;


And this is where I have issues with.
Where does 'old' come from?
Clearly it was part of the list at one point; so what happened to it?

I explained this earlier.
In nvme_ns_remove, there is a hole between list_del_rcu and
nvme_mpath_clear_current_path. If head->current_path is the "old", and
the "old" is removing. The "old" is already removed from the list by
list_del_rcu, but head->current_path is not clear to NULL by
nvme_mpath_clear_current_path.
Find path is race with nvme_ns_remove, use the "old" pass to
nvme_round_robin_path to find path.


Cheers,

Hannes


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Hannes Reinecke

On 2/1/21 9:47 AM, Chao Leng wrote:



On 2021/2/1 15:29, Hannes Reinecke wrote:[ .. ]

Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the 
current (and my fixed) versions is?

I'm still not clear where the problem is once we applied both patches.
For example assume the list has three path, and all path is not 
NVME_ANA_OPTIMIZED:

head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;


And this is where I have issues with.
Where does 'old' come from?
Clearly it was part of the list at one point; so what happened to it?

Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-02-01 Thread Chao Leng




On 2021/2/1 15:29, Hannes Reinecke wrote:

On 2/1/21 3:16 AM, Chao Leng wrote:



On 2021/1/29 17:20, Hannes Reinecke wrote:

On 1/29/21 9:46 AM, Chao Leng wrote:



On 2021/1/29 16:33, Hannes Reinecke wrote:

On 1/29/21 8:45 AM, Chao Leng wrote:



On 2021/1/29 15:06, Hannes Reinecke wrote:

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where does 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


No. nvme_next_ns() will return NULL.

If there is just one path(the "old") and the "old" is deleted,
nvme_next_ns() will return NULL.
The list like this:
head->next = head;
old->next = head;
If there is two or more path and the "old" is deleted,
"for" will be infinite loop. because nvme_next_ns() will return
the path which in the list except the "old", check condition will
be true for ever.


But that will be caught by the statement above:

if (list_is_singular(>list))

no?

Two path just a sample example.
If there is just two path, will enter it, may cause no path but there is
actually one path. It is falsely assumed that the "old" must be not deleted.
If there is more than two path, will cause infinite loop.

So you mean we'll need something like this?

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 71696819c228..8ffccaf9c19a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
  static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
 struct nvme_ns *ns)
  {
-   ns = list_next_or_null_rcu(>list, >siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
+   if (ns) {
+   ns = list_next_or_null_rcu(>list, >siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }

No, in the scenario, ns should not be NULL.


Why not? 'ns == NULL' is precisely the corner-case this is trying to fix...


May be we can do like this:

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 282b7a4ea9a9..b895011a2cbd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -199,30 +199,24 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
 return found;
  }

-static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
-   struct nvme_ns *ns)
-{
-   ns = list_next_or_null_rcu(>list, >siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
-   return list_first_or_null_rcu(>list, struct nvme_ns, siblings);
-}
+#define nvme_next_ns_condition(head, current, condition) \
+({ \
+   struct nvme_ns *__ptr = list_next_or_null_rcu(&(head)->list, \
+   &(current)->siblings, struct nvme_ns, siblings); \
+   __ptr ? __ptr : (condition) ? (condition) = false, \
+   list_first_or_null_rcu(&(head)->list, struct nvme_ns, \
+   siblings) : NULL; \
+})


Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the current 
(and my fixed) versions is?
I'm still not clear where the problem is once we applied both patches.

For example assume the list has three path, and all path is not 
NVME_ANA_OPTIMIZED:
head->next = ns1;
ns1->next = ns2;
ns2->next = head;
old->next = ns2;

My patch work flow:
nvme_next_ns_condition(head, old, true) return ns2;

Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-31 Thread Hannes Reinecke

On 2/1/21 3:16 AM, Chao Leng wrote:



On 2021/1/29 17:20, Hannes Reinecke wrote:

On 1/29/21 9:46 AM, Chao Leng wrote:



On 2021/1/29 16:33, Hannes Reinecke wrote:

On 1/29/21 8:45 AM, Chao Leng wrote:



On 2021/1/29 15:06, Hannes Reinecke wrote:

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:


You can't see exactly where it dies but I followed the 
assembly to
nvme_round_robin_path(). Maybe it's not the initial 
nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL 
eventually

(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I 
find 2 bugs

may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/ 


Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted 
from the

"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by 
synchronizing

the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where does 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


No. nvme_next_ns() will return NULL.

If there is just one path(the "old") and the "old" is deleted,
nvme_next_ns() will return NULL.
The list like this:
head->next = head;
old->next = head;
If there is two or more path and the "old" is deleted,
"for" will be infinite loop. because nvme_next_ns() will return
the path which in the list except the "old", check condition will
be true for ever.


But that will be caught by the statement above:

if (list_is_singular(>list))

no?

Two path just a sample example.
If there is just two path, will enter it, may cause no path but there is
actually one path. It is falsely assumed that the "old" must be not 
deleted.

If there is more than two path, will cause infinite loop.

So you mean we'll need something like this?

diff --git a/drivers/nvme/host/multipath.c 
b/drivers/nvme/host/multipath.c

index 71696819c228..8ffccaf9c19a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)

  static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
 struct nvme_ns *ns)
  {
-   ns = list_next_or_null_rcu(>list, >siblings, struct 
nvme_ns,

-   siblings);
-   if (ns)
-   return ns;
+   if (ns) {
+   ns = list_next_or_null_rcu(>list, >siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }

No, in the scenario, ns should not be NULL.


Why not? 'ns == NULL' is precisely the corner-case this is trying to fix...


May be we can do like this:

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 282b7a4ea9a9..b895011a2cbd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -199,30 +199,24 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)

     return found;
  }

-static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
-   struct nvme_ns *ns)
-{
-   ns = list_next_or_null_rcu(>list, >siblings, struct 
nvme_ns,

-   siblings);
-   if (ns)
-   return ns;
-   return list_first_or_null_rcu(>list, struct nvme_ns, 
siblings);

-}
+#define nvme_next_ns_condition(head, current, condition) \
+({ \
+   struct nvme_ns *__ptr = list_next_or_null_rcu(&(head)->list, \
+   &(current)->siblings, struct nvme_ns, siblings); \
+   __ptr ? __ptr : (condition) ? (condition) = false, \
+   list_first_or_null_rcu(&(head)->list, struct nvme_ns, \
+   siblings) : NULL; \
+})


Urgh. Please, no. That is well impossible to debug.
Can you please open-code it to demonstrate where the difference to the 
current (and my fixed) versions is?

I'm still not clear where the problem is once we applied both patches.

Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-31 Thread Chao Leng




On 2021/1/29 17:20, Hannes Reinecke wrote:

On 1/29/21 9:46 AM, Chao Leng wrote:



On 2021/1/29 16:33, Hannes Reinecke wrote:

On 1/29/21 8:45 AM, Chao Leng wrote:



On 2021/1/29 15:06, Hannes Reinecke wrote:

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where does 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


No. nvme_next_ns() will return NULL.

If there is just one path(the "old") and the "old" is deleted,
nvme_next_ns() will return NULL.
The list like this:
head->next = head;
old->next = head;
If there is two or more path and the "old" is deleted,
"for" will be infinite loop. because nvme_next_ns() will return
the path which in the list except the "old", check condition will
be true for ever.


But that will be caught by the statement above:

if (list_is_singular(>list))

no?

Two path just a sample example.
If there is just two path, will enter it, may cause no path but there is
actually one path. It is falsely assumed that the "old" must be not deleted.
If there is more than two path, will cause infinite loop.

So you mean we'll need something like this?

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 71696819c228..8ffccaf9c19a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
  static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
     struct nvme_ns *ns)
  {
-   ns = list_next_or_null_rcu(>list, >siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
+   if (ns) {
+   ns = list_next_or_null_rcu(>list, >siblings,
+  struct nvme_ns, siblings);
+   if (ns)
+   return ns;
+   }

No, in the scenario, ns should not be NULL.
May be we can do like this:

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 282b7a4ea9a9..b895011a2cbd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -199,30 +199,24 @@ static struct nvme_ns *__nvme_find_path(struct 
nvme_ns_head *head, int node)
return found;
 }

-static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
-   struct nvme_ns *ns)
-{
-   ns = list_next_or_null_rcu(>list, >siblings, struct nvme_ns,
-   siblings);
-   if (ns)
-   return ns;
-   return list_first_or_null_rcu(>list, struct nvme_ns, siblings);
-}
+#define nvme_next_ns_condition(head, current, condition) \
+({ \
+   struct nvme_ns *__ptr = list_next_or_null_rcu(&(head)->list, \
+   &(current)->siblings, struct nvme_ns, siblings); \
+   __ptr ? __ptr : (condition) ? (condition) = false, \
+   list_first_or_null_rcu(&(head)->list, struct nvme_ns, \
+   siblings) : NULL; \
+})

 static struct nvme_ns *nvme_round_robin_path(struct nvme_ns_head *head,
int node, struct nvme_ns *old)
 {
struct nvme_ns *ns, *found = NULL;
+   bool first_half = true;

-   if (list_is_singular(>list)) {
-   if (nvme_path_is_disabled(old))
-   return NULL;
-   return old;
-   }
-
-   for (ns = nvme_next_ns(head, old);
+   for (ns = nvme_next_ns_condition(head, old, first_half);
 ns && ns != old;
-ns = nvme_next_ns(head, ns)) {
+ns = nvme_next_ns_condition(head, ns, first_half)) {
if 

Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Hannes Reinecke

On 1/29/21 4:07 AM, Chao Leng wrote:



On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/ 


Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;


Where does 'old' pointing to?


This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


No. nvme_next_ns() will return NULL.

Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Chao Leng




On 2021/1/29 11:30, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;
This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


Who is being removed? I'm not following

The "old" is being removed path.
Daniel Wagner report crash like this:
head->next = head;
old->next = head;
So nvme_next_ns(head, old) will return NULL, and then crash.
Although check ns can avoid crash, but can not avoid infinite loop.

Similar reason, The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;
ns1 is other path.


.


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Sagi Grimberg




You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/ 


Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;
This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
 ns != old;
 ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.


Who is being removed? I'm not following


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Chao Leng




On 2021/1/29 9:42, Sagi Grimberg wrote:



You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

ok, I see.


nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.

The list will be like this:
head->next = ns1;
ns1->next = head;
old->next = ns1;
This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
ns != old;
ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.

.


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Sagi Grimberg




You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/ 


Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).

nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Chao Leng




On 2021/1/28 17:40, Daniel Wagner wrote:

On Thu, Jan 28, 2021 at 05:18:56PM +0800, Chao Leng wrote:

It is impossible to return NULL for nvme_next_ns(head, old).


  block nvme0n1: no usable path - requeuing I/O
  block nvme0n1: no usable path - requeuing I/O
  block nvme0n1: no usable path - requeuing I/O
  block nvme0n1: no usable path - requeuing I/O
  block nvme0n1: no usable path - requeuing I/O
  block nvme0n1: no usable path - requeuing I/O
  BUG: kernel NULL pointer dereference, address: 0068
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x) - not-present page
  PGD 800ff67bc067 P4D 800ff67bc067 PUD ff9ac9067 PMD 0
  Oops:  [#1] SMP PTI
  CPU: 23 PID: 15759 Comm: dt.21.15 Kdump: loaded Tainted: GE   
5.3.18-0.gc9fe679-default #1 SLE15-SP2 (unreleased)
  Hardware name: FUJITSU PRIMERGY RX2540 M2/D3289-B1, BIOS V5.0.0.11 R1.18.0 
for D3289-B1x02/06/2018
  RIP: 0010:nvme_ns_head_make_request+0x1d1/0x430 [nvme_core]
  Code: 54 24 10 0f 84 c9 01 00 00 48 8b 54 24 10 48 83 ea 30 0f 84 ba 01 00 00 48 39 
d0 0f 84 01 02 00 00 31 ff eb 05 48 39 d0 74 67 <48> 8b 72 68 83 e6 04 75 13 48 
8b 72 68 83 e6 01 75 0a 48 8b 72 10
  RSP: 0018:a69d08017af8 EFLAGS: 00010246
  RAX: 92f261d87800 RBX: 92fa555b0010 RCX: 92fa555bc570
  RDX:  RSI:  RDI: 
  RBP: 0001 R08: 1000 R09: 1000
  R10: a69d080179a8 R11: 92f264f0c1c0 R12: 92f264f7f000
  R13: 92fa555b R14: 0001 R15: 
  FS:  7f3962bae700() GS:92f29ffc() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 0068 CR3: 000fd69a2002 CR4: 003606e0
  DR0:  DR1:  DR2: 
  DR3:  DR6: fffe0ff0 DR7: 0400
  Call Trace:
   generic_make_request+0x121/0x300
   ? submit_bio+0x42/0x1c0
   submit_bio+0x42/0x1c0
   ext4_io_submit+0x49/0x60 [ext4]
   ext4_writepages+0x625/0xe90 [ext4]
   ? do_writepages+0x4b/0xe0
   ? ext4_mark_inode_dirty+0x1d0/0x1d0 [ext4]
   do_writepages+0x4b/0xe0
   ? __generic_file_write_iter+0x192/0x1c0
   ? __filemap_fdatawrite_range+0xcb/0x100
   __filemap_fdatawrite_range+0xcb/0x100
   ? ext4_file_write_iter+0x128/0x3c0 [ext4]
   file_write_and_wait_range+0x5e/0xb0
   __generic_file_fsync+0x22/0xb0
   ext4_sync_file+0x1f7/0x3c0 [ext4]
   do_fsync+0x38/0x60
   __x64_sys_fsync+0x10/0x20
   do_syscall_64+0x5b/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9

You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

So there is other bug cause nvme_next_ns abormal.
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


And I have positive feedback, this patch fixes the above problem.

Although adding check ns can fix the crash. There may still be other problems
such as in __nvme_find_path which use list_for_each_entry_rcu.


.



Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Chao Leng




On 2021/1/28 17:23, Hannes Reinecke wrote:

On 1/28/21 10:18 AM, Chao Leng wrote:



On 2021/1/28 15:58, Daniel Wagner wrote:

On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:

--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
   }
   for (ns = nvme_next_ns(head, old);
- ns != old;
+ ns && ns != old;

nvme_round_robin_path just be called when !"old".
nvme_next_ns should not return NULL when !"old".
It seems unnecessary to add checking "ns".


The problem is when we enter nvme_round_robin_path() and there is no
path available. In this case the initialization ns = nvme_next_ns(head,
old) could return a NULL pointer."old" should not be NULL, so there is at least one path 
that is "old".

It is impossible to return NULL for nvme_next_ns(head, old).


No. list_next_or_null_rcu()/list_first_or_null_rcu() will return NULL when then 
end of the list is reached.

Although list_next_or_null_rcu()/list_first_or_null_rcu() may return
NULL, but nvme_next_ns(head, old) assume that the "old" is in the "head",
so nvme_next_ns(head, old) should not return NULL. If the "old" is not
in the "head", nvme_next_ns(head, old) will run abnormal.
So there is other bug which cause nvme_next_ns(head, old).

I review the code about head->list and head->current_path, I find 2 bugs
may cause nvme_next_ns(head, old) abnormal:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengc...@huawei.com/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.


Cheers,

Hannes


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Daniel Wagner
On Thu, Jan 28, 2021 at 05:18:56PM +0800, Chao Leng wrote:
> It is impossible to return NULL for nvme_next_ns(head, old).

 block nvme0n1: no usable path - requeuing I/O
 block nvme0n1: no usable path - requeuing I/O
 block nvme0n1: no usable path - requeuing I/O
 block nvme0n1: no usable path - requeuing I/O
 block nvme0n1: no usable path - requeuing I/O
 block nvme0n1: no usable path - requeuing I/O
 BUG: kernel NULL pointer dereference, address: 0068
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x) - not-present page
 PGD 800ff67bc067 P4D 800ff67bc067 PUD ff9ac9067 PMD 0 
 Oops:  [#1] SMP PTI
 CPU: 23 PID: 15759 Comm: dt.21.15 Kdump: loaded Tainted: GE   
5.3.18-0.gc9fe679-default #1 SLE15-SP2 (unreleased)
 Hardware name: FUJITSU PRIMERGY RX2540 M2/D3289-B1, BIOS V5.0.0.11 R1.18.0 for 
D3289-B1x02/06/2018
 RIP: 0010:nvme_ns_head_make_request+0x1d1/0x430 [nvme_core]
 Code: 54 24 10 0f 84 c9 01 00 00 48 8b 54 24 10 48 83 ea 30 0f 84 ba 01 00 00 
48 39 d0 0f 84 01 02 00 00 31 ff eb 05 48 39 d0 74 67 <48> 8b 72 68 83 e6 04 75 
13 48 8b 72 68 83 e6 01 75 0a 48 8b 72 10
 RSP: 0018:a69d08017af8 EFLAGS: 00010246
 RAX: 92f261d87800 RBX: 92fa555b0010 RCX: 92fa555bc570
 RDX:  RSI:  RDI: 
 RBP: 0001 R08: 1000 R09: 1000
 R10: a69d080179a8 R11: 92f264f0c1c0 R12: 92f264f7f000
 R13: 92fa555b R14: 0001 R15: 
 FS:  7f3962bae700() GS:92f29ffc() knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: 0068 CR3: 000fd69a2002 CR4: 003606e0
 DR0:  DR1:  DR2: 
 DR3:  DR6: fffe0ff0 DR7: 0400
 Call Trace:
  generic_make_request+0x121/0x300
  ? submit_bio+0x42/0x1c0
  submit_bio+0x42/0x1c0
  ext4_io_submit+0x49/0x60 [ext4]
  ext4_writepages+0x625/0xe90 [ext4]
  ? do_writepages+0x4b/0xe0
  ? ext4_mark_inode_dirty+0x1d0/0x1d0 [ext4]
  do_writepages+0x4b/0xe0
  ? __generic_file_write_iter+0x192/0x1c0
  ? __filemap_fdatawrite_range+0xcb/0x100
  __filemap_fdatawrite_range+0xcb/0x100
  ? ext4_file_write_iter+0x128/0x3c0 [ext4]
  file_write_and_wait_range+0x5e/0xb0
  __generic_file_fsync+0x22/0xb0
  ext4_sync_file+0x1f7/0x3c0 [ext4]
  do_fsync+0x38/0x60
  __x64_sys_fsync+0x10/0x20
  do_syscall_64+0x5b/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9

You can't see exactly where it dies but I followed the assembly to
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).

And I have positive feedback, this patch fixes the above problem.



Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Hannes Reinecke

On 1/28/21 10:18 AM, Chao Leng wrote:



On 2021/1/28 15:58, Daniel Wagner wrote:

On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:

--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns 
*nvme_round_robin_path(struct nvme_ns_head *head,

   }
   for (ns = nvme_next_ns(head, old);
- ns != old;
+ ns && ns != old;

nvme_round_robin_path just be called when !"old".
nvme_next_ns should not return NULL when !"old".
It seems unnecessary to add checking "ns".


The problem is when we enter nvme_round_robin_path() and there is no
path available. In this case the initialization ns = nvme_next_ns(head,
old) could return a NULL pointer."old" should not be NULL, so there is 
at least one path that is "old".

It is impossible to return NULL for nvme_next_ns(head, old).


No. list_next_or_null_rcu()/list_first_or_null_rcu() will return NULL 
when then end of the list is reached.


Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Chao Leng




On 2021/1/28 15:58, Daniel Wagner wrote:

On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:

--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
}
for (ns = nvme_next_ns(head, old);
-ns != old;
+ns && ns != old;

nvme_round_robin_path just be called when !"old".
nvme_next_ns should not return NULL when !"old".
It seems unnecessary to add checking "ns".


The problem is when we enter nvme_round_robin_path() and there is no
path available. In this case the initialization ns = nvme_next_ns(head,
old) could return a NULL pointer."old" should not be NULL, so there is at least one path 
that is "old".

It is impossible to return NULL for nvme_next_ns(head, old).

.



Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-28 Thread Daniel Wagner
On Thu, Jan 28, 2021 at 09:31:30AM +0800, Chao Leng wrote:
> > --- a/drivers/nvme/host/multipath.c
> > +++ b/drivers/nvme/host/multipath.c
> > @@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
> > nvme_ns_head *head,
> > }
> > for (ns = nvme_next_ns(head, old);
> > -ns != old;
> > +ns && ns != old;
> nvme_round_robin_path just be called when !"old".
> nvme_next_ns should not return NULL when !"old".
> It seems unnecessary to add checking "ns".

The problem is when we enter nvme_round_robin_path() and there is no
path available. In this case the initialization ns = nvme_next_ns(head,
old) could return a NULL pointer.


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-27 Thread Chao Leng




On 2021/1/27 18:30, Daniel Wagner wrote:

nvme_round_robin_path() should test if the return ns pointer is
valid. nvme_next_ns() will return a NULL pointer if there is no path
left.

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Cc: Hannes Reinecke 
Signed-off-by: Daniel Wagner 
---
v2:
   - moved NULL test into the if conditional statement
   - added Fixes tag

  drivers/nvme/host/multipath.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 9ac762b28811..282b7a4ea9a9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
}
  
  	for (ns = nvme_next_ns(head, old);

-ns != old;
+ns && ns != old;

nvme_round_robin_path just be called when !"old".
nvme_next_ns should not return NULL when !"old".
It seems unnecessary to add checking "ns".
Is there a bug that "old" is not in "head" list?
If yes, we should fix it.

 ns = nvme_next_ns(head, ns)) {
if (nvme_path_is_disabled(ns))
continue;



Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-27 Thread Chao Leng




On 2021/1/27 18:30, Daniel Wagner wrote:

nvme_round_robin_path() should test if the return ns pointer is
valid. nvme_next_ns() will return a NULL pointer if there is no path
left.

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Cc: Hannes Reinecke 
Signed-off-by: Daniel Wagner 
---
v2:
   - moved NULL test into the if conditional statement
   - added Fixes tag

  drivers/nvme/host/multipath.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 9ac762b28811..282b7a4ea9a9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
}
  
  	for (ns = nvme_next_ns(head, old);

-ns != old;
+ns && ns != old;

nvme_round_robin_path just be called when !"old".
nvme_next_ns should not return NULL when !"old".
It seems unnecessary to add checking "ns".
Is there a bug that "old" in not in "head" list?
If yes, we should fix it.

 ns = nvme_next_ns(head, ns)) {
if (nvme_path_is_disabled(ns))
continue;



Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-27 Thread Christoph Hellwig
Thanks,

applied to nvme-5.11.


Re: [PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-27 Thread Hannes Reinecke

On 1/27/21 11:30 AM, Daniel Wagner wrote:

nvme_round_robin_path() should test if the return ns pointer is
valid. nvme_next_ns() will return a NULL pointer if there is no path
left.

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Cc: Hannes Reinecke 
Signed-off-by: Daniel Wagner 
---
v2:
   - moved NULL test into the if conditional statement
   - added Fixes tag

  drivers/nvme/host/multipath.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 9ac762b28811..282b7a4ea9a9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
}
  
  	for (ns = nvme_next_ns(head, old);

-ns != old;
+ns && ns != old;
 ns = nvme_next_ns(head, ns)) {
if (nvme_path_is_disabled(ns))
continue;


Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
--
Dr. Hannes ReineckeKernel Storage Architect
h...@suse.de  +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


[PATCH v2] nvme-multipath: Early exit if no path is available

2021-01-27 Thread Daniel Wagner
nvme_round_robin_path() should test if the return ns pointer is
valid. nvme_next_ns() will return a NULL pointer if there is no path
left.

Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Cc: Hannes Reinecke 
Signed-off-by: Daniel Wagner 
---
v2:
  - moved NULL test into the if conditional statement
  - added Fixes tag

 drivers/nvme/host/multipath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 9ac762b28811..282b7a4ea9a9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -221,7 +221,7 @@ static struct nvme_ns *nvme_round_robin_path(struct 
nvme_ns_head *head,
}
 
for (ns = nvme_next_ns(head, old);
-ns != old;
+ns && ns != old;
 ns = nvme_next_ns(head, ns)) {
if (nvme_path_is_disabled(ns))
continue;
-- 
2.29.2