Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-22 Thread dingxiang


Hi,All

Hello,

On Mon, Jun 20, 2016 at 06:46:55PM -0700, Dan Williams wrote:

On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
 wrote:

"Tejun" == Tejun Heo  writes:

In fact,we don't need libata to deal with hotplug in sas environment.
So we can't run ata hotplug task when ata port is sas host.

Tejun> Martin, can you please confirm whether the above is true.  If so,
Tejun> I'll route the patch through libata w/ stable cc'd.

Not exactly a libsas expert. James? Dan?

While it is true that libsas itself handles adding / removing devices
we have historically avoided this conflict because
ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
/ behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
first place.  Ignoring it is a band-aid / not the real fix afaics.

I see.  I'll hold off for now then.  Ding Xiang, can you find out
where that flag is getting set?

Thanks!

There are two places will set flag ATA_PFLAG_SCSI_HOTPLUG in libata-eh.c.
I think both places should  be protected. Here is the suggestion.
Thanks~

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 61dc7a9..2bee041 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1385,7 +1385,8 @@ void ata_eh_detach_dev(struct ata_device *dev)

if (ata_scsi_offline_dev(dev)) {
dev->flags |= ATA_DFLAG_DETACHED;
-   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
+   if (!(ap->pflags & ATA_FLAG_SAS_HOST))
+   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
}

/* clear per-dev EH info */
@@ -3299,7 +3300,8 @@ static int ata_eh_revalidate_and_attach(struct 
ata_link *link,

}

spin_lock_irqsave(ap->lock, flags);
-   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
+   if (!(ap->pflags & ATA_FLAG_SAS_HOST))
+   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
spin_unlock_irqrestore(ap->lock, flags);

/* new device discovered, configure xfermode */




Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-22 Thread dingxiang


Hi,All

Hello,

On Mon, Jun 20, 2016 at 06:46:55PM -0700, Dan Williams wrote:

On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
 wrote:

"Tejun" == Tejun Heo  writes:

In fact,we don't need libata to deal with hotplug in sas environment.
So we can't run ata hotplug task when ata port is sas host.

Tejun> Martin, can you please confirm whether the above is true.  If so,
Tejun> I'll route the patch through libata w/ stable cc'd.

Not exactly a libsas expert. James? Dan?

While it is true that libsas itself handles adding / removing devices
we have historically avoided this conflict because
ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
/ behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
first place.  Ignoring it is a band-aid / not the real fix afaics.

I see.  I'll hold off for now then.  Ding Xiang, can you find out
where that flag is getting set?

Thanks!

There are two places will set flag ATA_PFLAG_SCSI_HOTPLUG in libata-eh.c.
I think both places should  be protected. Here is the suggestion.
Thanks~

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 61dc7a9..2bee041 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1385,7 +1385,8 @@ void ata_eh_detach_dev(struct ata_device *dev)

if (ata_scsi_offline_dev(dev)) {
dev->flags |= ATA_DFLAG_DETACHED;
-   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
+   if (!(ap->pflags & ATA_FLAG_SAS_HOST))
+   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
}

/* clear per-dev EH info */
@@ -3299,7 +3300,8 @@ static int ata_eh_revalidate_and_attach(struct 
ata_link *link,

}

spin_lock_irqsave(ap->lock, flags);
-   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
+   if (!(ap->pflags & ATA_FLAG_SAS_HOST))
+   ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
spin_unlock_irqrestore(ap->lock, flags);

/* new device discovered, configure xfermode */




Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-21 Thread Tejun Heo
Hello,

On Mon, Jun 20, 2016 at 06:46:55PM -0700, Dan Williams wrote:
> On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
>  wrote:
> >> "Tejun" == Tejun Heo  writes:
> >
> >>> In fact,we don't need libata to deal with hotplug in sas environment.
> >>> So we can't run ata hotplug task when ata port is sas host.
> >
> > Tejun> Martin, can you please confirm whether the above is true.  If so,
> > Tejun> I'll route the patch through libata w/ stable cc'd.
> >
> > Not exactly a libsas expert. James? Dan?
> 
> While it is true that libsas itself handles adding / removing devices
> we have historically avoided this conflict because
> ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
> / behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
> first place.  Ignoring it is a band-aid / not the real fix afaics.

I see.  I'll hold off for now then.  Ding Xiang, can you find out
where that flag is getting set?

Thanks!

-- 
tejun


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-21 Thread Tejun Heo
Hello,

On Mon, Jun 20, 2016 at 06:46:55PM -0700, Dan Williams wrote:
> On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
>  wrote:
> >> "Tejun" == Tejun Heo  writes:
> >
> >>> In fact,we don't need libata to deal with hotplug in sas environment.
> >>> So we can't run ata hotplug task when ata port is sas host.
> >
> > Tejun> Martin, can you please confirm whether the above is true.  If so,
> > Tejun> I'll route the patch through libata w/ stable cc'd.
> >
> > Not exactly a libsas expert. James? Dan?
> 
> While it is true that libsas itself handles adding / removing devices
> we have historically avoided this conflict because
> ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
> / behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
> first place.  Ignoring it is a band-aid / not the real fix afaics.

I see.  I'll hold off for now then.  Ding Xiang, can you find out
where that flag is getting set?

Thanks!

-- 
tejun


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-20 Thread Dan Williams
On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
 wrote:
>> "Tejun" == Tejun Heo  writes:
>
>>> In fact,we don't need libata to deal with hotplug in sas environment.
>>> So we can't run ata hotplug task when ata port is sas host.
>
> Tejun> Martin, can you please confirm whether the above is true.  If so,
> Tejun> I'll route the patch through libata w/ stable cc'd.
>
> Not exactly a libsas expert. James? Dan?

While it is true that libsas itself handles adding / removing devices
we have historically avoided this conflict because
ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
/ behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
first place.  Ignoring it is a band-aid / not the real fix afaics.


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-20 Thread Dan Williams
On Mon, Jun 20, 2016 at 6:22 PM, Martin K. Petersen
 wrote:
>> "Tejun" == Tejun Heo  writes:
>
>>> In fact,we don't need libata to deal with hotplug in sas environment.
>>> So we can't run ata hotplug task when ata port is sas host.
>
> Tejun> Martin, can you please confirm whether the above is true.  If so,
> Tejun> I'll route the patch through libata w/ stable cc'd.
>
> Not exactly a libsas expert. James? Dan?

While it is true that libsas itself handles adding / removing devices
we have historically avoided this conflict because
ATA_PFLAG_SCSI_HOTPLUG is never set for libsas ata_ports.  So the bug
/ behavior change is that  ATA_PFLAG_SCSI_HOTPLUG gets set in the
first place.  Ignoring it is a band-aid / not the real fix afaics.


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-20 Thread Martin K. Petersen
> "Tejun" == Tejun Heo  writes:

>> In fact,we don't need libata to deal with hotplug in sas environment.
>> So we can't run ata hotplug task when ata port is sas host.

Tejun> Martin, can you please confirm whether the above is true.  If so,
Tejun> I'll route the patch through libata w/ stable cc'd.

Not exactly a libsas expert. James? Dan?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-20 Thread Martin K. Petersen
> "Tejun" == Tejun Heo  writes:

>> In fact,we don't need libata to deal with hotplug in sas environment.
>> So we can't run ata hotplug task when ata port is sas host.

Tejun> Martin, can you please confirm whether the above is true.  If so,
Tejun> I'll route the patch through libata w/ stable cc'd.

Not exactly a libsas expert. James? Dan?

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-16 Thread Tejun Heo
On Thu, Jun 16, 2016 at 12:45:40PM +0800, DingXiang wrote:
...
> In fact,we don't need libata to deal with hotplug in sas environment.
> So we can't run ata hotplug task when ata port is sas host.

Martin, can you please confirm whether the above is true.  If so, I'll
route the patch through libata w/ stable cc'd.

Thanks.

-- 
tejun


Re: [PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-16 Thread Tejun Heo
On Thu, Jun 16, 2016 at 12:45:40PM +0800, DingXiang wrote:
...
> In fact,we don't need libata to deal with hotplug in sas environment.
> So we can't run ata hotplug task when ata port is sas host.

Martin, can you please confirm whether the above is true.  If so, I'll
route the patch through libata w/ stable cc'd.

Thanks.

-- 
tejun


[PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-15 Thread DingXiang
In normal condition,if we use sas protocol and hotplug a
sata disk on a port,the sas driver will send event
"PORTE_BYTES_DMAED" and call function "sas_porte_bytes_dmaed".
But if a sata disk is run io and unplug it,then plug a new
sata disk,this operation may cause a kernel panic like this:

[ 2366.923208] Unable to handle kernel NULL pointer dereference
at virtual address 07b8
...
[ 2368.766334] Call trace:
[ 2368.781712] [] sas_find_dev_by_rphy+0x48/0x118
[ 2368.800394] [] sas_target_alloc+0x28/0x98
[ 2368.817975] [] scsi_alloc_target+0x248/0x308
[ 2368.835570] [] __scsi_add_device+0xb8/0x160
[ 2368.853034] [] ata_scsi_scan_host+0x190/0x230
[ 2368.871614] [] ata_scsi_hotplug+0xc8/0xe8
[ 2368.889152] [] process_one_work+0x164/0x438
[ 2368.908003] [] worker_thread+0x144/0x4b0
[ 2368.924613] [] kthread+0xfc/0x110

This because "dev_to_shost" in "sas_find_dev_by_rphy" return
a NULL point,and SHOST_TO_SAS_HA used it,so kernel panic happened.

why did dev_to_shost return a NULL point?
Because in "__scsi_add_device" ,
struct device *parent = >shost_gendev,
and in "scsi_alloc_target", "*parent" is assigned to
"starget->dev.parent",then "sas_target_alloc" will get
"struct sas_rphy" according "starget->dev.parent", and in
"sas_find_dev_by_rphy" , we will get "struct Scsi_Host *shost"
according "rphy->dev.parent",we will find that
rphy->dev.parent = shost->shost_gendev.parent, and shost_gendev.parent
is "ap->tdev",there is no parent any more,so "dev_to_shost"
return a NULL point.

when the panic will happen?
When libata is handling error,and add hotplug_task to workqueue,
if a new sata disk pluged at this moment,the libata hotplug task
will run and panic will happen.

In fact,we don't need libata to deal with hotplug in sas environment.
So we can't run ata hotplug task when ata port is sas host.

Signed-off-by: Ding Xiang 
---
 drivers/ata/libata-eh.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 61dc7a9..4428a7c 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -816,7 +816,8 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, 
struct ata_port *ap)
 
if (ap->pflags & ATA_PFLAG_LOADING)
ap->pflags &= ~ATA_PFLAG_LOADING;
-   else if (ap->pflags & ATA_PFLAG_SCSI_HOTPLUG)
+   else if ((ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) &&
+!(ap->pflags & ATA_FLAG_SAS_HOST))
schedule_delayed_work(>hotplug_task, 0);
 
if (ap->pflags & ATA_PFLAG_RECOVERED)
-- 
2.5.0



[PATCH V2 resend] libata:fix kernel panic when hotplug

2016-06-15 Thread DingXiang
In normal condition,if we use sas protocol and hotplug a
sata disk on a port,the sas driver will send event
"PORTE_BYTES_DMAED" and call function "sas_porte_bytes_dmaed".
But if a sata disk is run io and unplug it,then plug a new
sata disk,this operation may cause a kernel panic like this:

[ 2366.923208] Unable to handle kernel NULL pointer dereference
at virtual address 07b8
...
[ 2368.766334] Call trace:
[ 2368.781712] [] sas_find_dev_by_rphy+0x48/0x118
[ 2368.800394] [] sas_target_alloc+0x28/0x98
[ 2368.817975] [] scsi_alloc_target+0x248/0x308
[ 2368.835570] [] __scsi_add_device+0xb8/0x160
[ 2368.853034] [] ata_scsi_scan_host+0x190/0x230
[ 2368.871614] [] ata_scsi_hotplug+0xc8/0xe8
[ 2368.889152] [] process_one_work+0x164/0x438
[ 2368.908003] [] worker_thread+0x144/0x4b0
[ 2368.924613] [] kthread+0xfc/0x110

This because "dev_to_shost" in "sas_find_dev_by_rphy" return
a NULL point,and SHOST_TO_SAS_HA used it,so kernel panic happened.

why did dev_to_shost return a NULL point?
Because in "__scsi_add_device" ,
struct device *parent = >shost_gendev,
and in "scsi_alloc_target", "*parent" is assigned to
"starget->dev.parent",then "sas_target_alloc" will get
"struct sas_rphy" according "starget->dev.parent", and in
"sas_find_dev_by_rphy" , we will get "struct Scsi_Host *shost"
according "rphy->dev.parent",we will find that
rphy->dev.parent = shost->shost_gendev.parent, and shost_gendev.parent
is "ap->tdev",there is no parent any more,so "dev_to_shost"
return a NULL point.

when the panic will happen?
When libata is handling error,and add hotplug_task to workqueue,
if a new sata disk pluged at this moment,the libata hotplug task
will run and panic will happen.

In fact,we don't need libata to deal with hotplug in sas environment.
So we can't run ata hotplug task when ata port is sas host.

Signed-off-by: Ding Xiang 
---
 drivers/ata/libata-eh.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 61dc7a9..4428a7c 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -816,7 +816,8 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, 
struct ata_port *ap)
 
if (ap->pflags & ATA_PFLAG_LOADING)
ap->pflags &= ~ATA_PFLAG_LOADING;
-   else if (ap->pflags & ATA_PFLAG_SCSI_HOTPLUG)
+   else if ((ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) &&
+!(ap->pflags & ATA_FLAG_SAS_HOST))
schedule_delayed_work(>hotplug_task, 0);
 
if (ap->pflags & ATA_PFLAG_RECOVERED)
-- 
2.5.0