Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-09-07 Thread Christoph Hellwig
Hi Sergio,

sorry for the delay.  I'm fighting deadlines at the moment, so even if
this is my highest spare time priority right now I can't find enough
quiet time to dig into the problem.  I'd suggest you resend a report
to linux-ide (with a Cc to linux-scsi) so that the people who know the
libata code can take look.  I strongly suspect the root cause for it is
in libata anyway, which is a code base I need to get up to speed with
first.  
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-09-07 Thread Sergio Callegari

Hi Christoph (and those reading!),

I wonder if there might be any update or, most important, anything else for me 
to test in order to provide info to address this issue...


I really would like to find out whether it is a bug in my hardware (which would 
be OK, as I know already how to work around it) or something that may happen on 
any hardware as soon as an unfortunate combination of storage equipment is adopted.


Thanks for the help so far,

Best regards,

Sergio

On 30/08/2015 12:54, Sergio Callegari wrote:

Hi Christoph,

just checked.

Unfortunately, the patch below, applied on top of Linus' v3.17 (which I am 
using as a test kernel) *does not fix the issue*.


Best regards,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
  return 1;
  }
  +static void ata_qc_done(struct ata_queued_cmd *qc)
+{
+struct scsi_cmnd *cmd = qc->scsicmd;
+void (*done)(struct scsi_cmnd *) = qc->scsidone;
+
+ata_qc_free(qc);
+done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
  struct ata_port *ap = qc->ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)

  if (need_sense && !ap->ops->error_handler)
  ata_dump_status(ap->print_id, >result_tf);
  -qc->scsidone(cmd);
-
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
/**
@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)

  ata_gen_passthru_sense(qc);
  }
  -qc->scsidone(qc->scsicmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
/* is it pointless to prefer PIO for "safety reasons"? */
@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
  qc->dev->sdev->locked = 0;
qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;
-qc->scsidone(cmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  return;
  }
  @@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
  cmd->result = SAM_STAT_GOOD;
  }
  -qc->scsidone(cmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-09-07 Thread Sergio Callegari

Hi Christoph (and those reading!),

I wonder if there might be any update or, most important, anything else for me 
to test in order to provide info to address this issue...


I really would like to find out whether it is a bug in my hardware (which would 
be OK, as I know already how to work around it) or something that may happen on 
any hardware as soon as an unfortunate combination of storage equipment is adopted.


Thanks for the help so far,

Best regards,

Sergio

On 30/08/2015 12:54, Sergio Callegari wrote:

Hi Christoph,

just checked.

Unfortunately, the patch below, applied on top of Linus' v3.17 (which I am 
using as a test kernel) *does not fix the issue*.


Best regards,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
  return 1;
  }
  +static void ata_qc_done(struct ata_queued_cmd *qc)
+{
+struct scsi_cmnd *cmd = qc->scsicmd;
+void (*done)(struct scsi_cmnd *) = qc->scsidone;
+
+ata_qc_free(qc);
+done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
  struct ata_port *ap = qc->ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)

  if (need_sense && !ap->ops->error_handler)
  ata_dump_status(ap->print_id, >result_tf);
  -qc->scsidone(cmd);
-
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
/**
@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)

  ata_gen_passthru_sense(qc);
  }
  -qc->scsidone(qc->scsicmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
/* is it pointless to prefer PIO for "safety reasons"? */
@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
  qc->dev->sdev->locked = 0;
qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;
-qc->scsidone(cmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  return;
  }
  @@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
  cmd->result = SAM_STAT_GOOD;
  }
  -qc->scsidone(cmd);
-ata_qc_free(qc);
+ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-09-07 Thread Christoph Hellwig
Hi Sergio,

sorry for the delay.  I'm fighting deadlines at the moment, so even if
this is my highest spare time priority right now I can't find enough
quiet time to dig into the problem.  I'd suggest you resend a report
to linux-ide (with a Cc to linux-scsi) so that the people who know the
libata code can take look.  I strongly suspect the root cause for it is
in libata anyway, which is a code base I need to get up to speed with
first.  
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-08-30 Thread Sergio Callegari

Hi Christoph,

just checked.

Unfortunately, the patch below, applied on top of Linus' v3.17 (which I 
am using as a test kernel) *does not fix the issue*.


Best regards,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
  }
  
+static void ata_qc_done(struct ata_queued_cmd *qc)

+{
+   struct scsi_cmnd *cmd = qc->scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc->scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
struct ata_port *ap = qc->ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense && !ap->ops->error_handler)
ata_dump_status(ap->print_id, >result_tf);
  
-	qc->scsidone(cmd);

-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /**

@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
  
-	qc->scsidone(qc->scsicmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /* is it pointless to prefer PIO for "safety reasons"? */

@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc->dev->sdev->locked = 0;
  
  		qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;

-   qc->scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
  
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)

cmd->result = SAM_STAT_GOOD;
}
  
-	qc->scsidone(cmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations

2015-08-30 Thread Sergio Callegari

Hi Christoph,

just checked.

Unfortunately, the patch below, applied on top of Linus' v3.17 (which I 
am using as a test kernel) *does not fix the issue*.


Best regards,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
  }
  
+static void ata_qc_done(struct ata_queued_cmd *qc)

+{
+   struct scsi_cmnd *cmd = qc-scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc-scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
struct ata_port *ap = qc-ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense  !ap-ops-error_handler)
ata_dump_status(ap-print_id, qc-result_tf);
  
-	qc-scsidone(cmd);

-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /**

@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
  
-	qc-scsidone(qc-scsicmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /* is it pointless to prefer PIO for safety reasons? */

@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc-dev-sdev-locked = 0;
  
  		qc-scsicmd-result = SAM_STAT_CHECK_CONDITION;

-   qc-scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
  
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)

cmd-result = SAM_STAT_GOOD;
}
  
-	qc-scsidone(cmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-08-26 Thread Sergio Callegari

Sure, thanks!

I'll test this weekend.

Best,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
  }
  
+static void ata_qc_done(struct ata_queued_cmd *qc)

+{
+   struct scsi_cmnd *cmd = qc->scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc->scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
struct ata_port *ap = qc->ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense && !ap->ops->error_handler)
ata_dump_status(ap->print_id, >result_tf);
  
-	qc->scsidone(cmd);

-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /**

@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
  
-	qc->scsidone(qc->scsicmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /* is it pointless to prefer PIO for "safety reasons"? */

@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc->dev->sdev->locked = 0;
  
  		qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;

-   qc->scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
  
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)

cmd->result = SAM_STAT_GOOD;
}
  
-	qc->scsidone(cmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations

2015-08-26 Thread Sergio Callegari

Sure, thanks!

I'll test this weekend.

Best,

Sergio

On 25/08/2015 13:00, Christoph Hellwig wrote:

Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
  }
  
+static void ata_qc_done(struct ata_queued_cmd *qc)

+{
+   struct scsi_cmnd *cmd = qc-scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc-scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
  static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
  {
struct ata_port *ap = qc-ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense  !ap-ops-error_handler)
ata_dump_status(ap-print_id, qc-result_tf);
  
-	qc-scsidone(cmd);

-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /**

@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
  
-	qc-scsidone(qc-scsicmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  
  /* is it pointless to prefer PIO for safety reasons? */

@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc-dev-sdev-locked = 0;
  
  		qc-scsicmd-result = SAM_STAT_CHECK_CONDITION;

-   qc-scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
  
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)

cmd-result = SAM_STAT_GOOD;
}
  
-	qc-scsidone(cmd);

-   ata_qc_free(qc);
+   ata_qc_done(qc);
  }
  /**
   *atapi_xlat - Initialize PACKET taskfile


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: "scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-08-25 Thread Christoph Hellwig
Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
 }
 
+static void ata_qc_done(struct ata_queued_cmd *qc)
+{
+   struct scsi_cmnd *cmd = qc->scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc->scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
 static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
 {
struct ata_port *ap = qc->ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense && !ap->ops->error_handler)
ata_dump_status(ap->print_id, >result_tf);
 
-   qc->scsidone(cmd);
-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 
 /**
@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
 
-   qc->scsidone(qc->scsicmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 
 /* is it pointless to prefer PIO for "safety reasons"? */
@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc->dev->sdev->locked = 0;
 
qc->scsicmd->result = SAM_STAT_CHECK_CONDITION;
-   qc->scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
 
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
cmd->result = SAM_STAT_GOOD;
}
 
-   qc->scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 /**
  * atapi_xlat - Initialize PACKET taskfile
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations

2015-08-25 Thread Christoph Hellwig
Hi Sergio,

can you give the patch below a try?

libata currently completes the SCSI command before freeing the internal
command structure, which could lead to various races that mess with
the ATA command state, which might cause issues like the one you see.

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 641a61a..92cc156 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1772,6 +1772,15 @@ nothing_to_do:
return 1;
 }
 
+static void ata_qc_done(struct ata_queued_cmd *qc)
+{
+   struct scsi_cmnd *cmd = qc-scsicmd;
+   void (*done)(struct scsi_cmnd *) = qc-scsidone;
+
+   ata_qc_free(qc);
+   done(cmd);
+}
+
 static void ata_scsi_qc_complete(struct ata_queued_cmd *qc)
 {
struct ata_port *ap = qc-ap;
@@ -1810,9 +1819,7 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
if (need_sense  !ap-ops-error_handler)
ata_dump_status(ap-print_id, qc-result_tf);
 
-   qc-scsidone(cmd);
-
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 
 /**
@@ -2611,8 +2618,7 @@ static void atapi_sense_complete(struct ata_queued_cmd 
*qc)
ata_gen_passthru_sense(qc);
}
 
-   qc-scsidone(qc-scsicmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 
 /* is it pointless to prefer PIO for safety reasons? */
@@ -2707,8 +2713,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
qc-dev-sdev-locked = 0;
 
qc-scsicmd-result = SAM_STAT_CHECK_CONDITION;
-   qc-scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
return;
}
 
@@ -2752,8 +2757,7 @@ static void atapi_qc_complete(struct ata_queued_cmd *qc)
cmd-result = SAM_STAT_GOOD;
}
 
-   qc-scsidone(cmd);
-   ata_qc_free(qc);
+   ata_qc_done(qc);
 }
 /**
  * atapi_xlat - Initialize PACKET taskfile
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


"scsi: convert host_busy to atomic_t" series causes regressions for some hardware configurations

2015-08-24 Thread Sergio Callegari

Thanks Christoph for the answer!

Apparently I missed a piece of the thread where the test patch was originally 
proposed . Now, I have gone through it and I see how the patch was not meant to 
be a final correction.


My (possibly naive) understanding is that:

- Even if this might be due to hardware that not fully conforms to the standard 
(but we do not know right now), commit 74665016086615bbaa3fa6f83af410a0a4e029ee 
( scsi: convert host_busy to atomic_t ) certainly breaks the kernel for some 
hardware configurations causing a regression.


- If the regression was immediately spotted, the patch would probably have been 
revised right after proposal. Unfortunately, another bug - that got fixed only 
much later with 045065d8a300a37218c - hid the original issue for a long time.


- Now that a lot of time has passed with the "scsi: convert host_busy to 
atomic_t" series in the kernel, going back to look into it is much more 
difficult. Libata people might not be very interested as they moved to other 
topics and might need a lot of time to go through it (it has been known since 
November 2014 - 9 months ago), possibly due to the race like nature of the issue 
and the fact that the bug might not be reproducible on their hardware...


Is this correct?

Aren't commits that cause regressions confirmed by multiple users expected (at 
least in principle) to be reverted?


If reverting is too costy, wouldn't your "papering over" or making the scsi 
delay configurable be an acceptable solution?


Even better: can in some way the libata-people be helped find the real culprit, 
given that there are at least two hardware setups that are known to trigger the 
regression (mine and Barto's)?


I have tried the linux-ide mailing list, but got silence.

Best,

Sergio



On 20/08/2015 10:08, Christoph Hellwig wrote:

Hi Sergio,

On Tue, Aug 18, 2015 at 09:44:28AM +0200, Sergio Callegari wrote:

Hi,

I have bisected the issue down to

[045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem

Bisecting has been a painful job due to the fact that the bug may show only
many hours after the system boot.

The commit above in fact is not the culprit, but a fix to an issue that was
hiding the real bug on my system.  See

http://marc.info/?l=linux-kernel=143973820612978=2

The real issue is with sata host lock and seems to be biting a few other
people as well

https://bbs.archlinux.org/viewtopic.php?id=189324

A patch fixing the issue was sent to the LKML back in Nov 2014 by Christoph
Hellwig (who is reading in CC)

https://lkml.org/lkml/2014/11/20/581

I have tested the patch and it works for me.

What is expected to happen now?

As mentioned in that thread we need more input from the libata people
on what kind of race this is papering over.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


scsi: convert host_busy to atomic_t series causes regressions for some hardware configurations

2015-08-24 Thread Sergio Callegari

Thanks Christoph for the answer!

Apparently I missed a piece of the thread where the test patch was originally 
proposed . Now, I have gone through it and I see how the patch was not meant to 
be a final correction.


My (possibly naive) understanding is that:

- Even if this might be due to hardware that not fully conforms to the standard 
(but we do not know right now), commit 74665016086615bbaa3fa6f83af410a0a4e029ee 
( scsi: convert host_busy to atomic_t ) certainly breaks the kernel for some 
hardware configurations causing a regression.


- If the regression was immediately spotted, the patch would probably have been 
revised right after proposal. Unfortunately, another bug - that got fixed only 
much later with 045065d8a300a37218c - hid the original issue for a long time.


- Now that a lot of time has passed with the scsi: convert host_busy to 
atomic_t series in the kernel, going back to look into it is much more 
difficult. Libata people might not be very interested as they moved to other 
topics and might need a lot of time to go through it (it has been known since 
November 2014 - 9 months ago), possibly due to the race like nature of the issue 
and the fact that the bug might not be reproducible on their hardware...


Is this correct?

Aren't commits that cause regressions confirmed by multiple users expected (at 
least in principle) to be reverted?


If reverting is too costy, wouldn't your papering over or making the scsi 
delay configurable be an acceptable solution?


Even better: can in some way the libata-people be helped find the real culprit, 
given that there are at least two hardware setups that are known to trigger the 
regression (mine and Barto's)?


I have tried the linux-ide mailing list, but got silence.

Best,

Sergio



On 20/08/2015 10:08, Christoph Hellwig wrote:

Hi Sergio,

On Tue, Aug 18, 2015 at 09:44:28AM +0200, Sergio Callegari wrote:

Hi,

I have bisected the issue down to

[045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem

Bisecting has been a painful job due to the fact that the bug may show only
many hours after the system boot.

The commit above in fact is not the culprit, but a fix to an issue that was
hiding the real bug on my system.  See

http://marc.info/?l=linux-kernelm=143973820612978w=2

The real issue is with sata host lock and seems to be biting a few other
people as well

https://bbs.archlinux.org/viewtopic.php?id=189324

A patch fixing the issue was sent to the LKML back in Nov 2014 by Christoph
Hellwig (who is reading in CC)

https://lkml.org/lkml/2014/11/20/581

I have tested the patch and it works for me.

What is expected to happen now?

As mentioned in that thread we need more input from the libata people
on what kind of race this is papering over.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/