Re: [Open-FCoE] [PATCH v2 02/16] scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
On Fri, Oct 28, 2016 at 11:53:46AM +0200, Steffen Maier wrote: [...] > > > > > > > @@ -3937,6 +3944,7 @@ fc_bsg_request_handler(struct request_queue *q, > > > > struct Scsi_Host *shost, > > > > struct request *req; > > > > struct fc_bsg_job *job; > > > > enum fc_dispatch_result ret; > > > > + struct fc_bsg_reply *bsg_reply; > > > > > > > > if (!get_device(dev)) > > > > return; > > > > @@ -3973,8 +3981,9 @@ fc_bsg_request_handler(struct request_queue *q, > > > > struct Scsi_Host *shost, > > > > /* check if we have the msgcode value at least */ > > > > if (job->request_len < sizeof(uint32_t)) { > > > > BUG_ON(job->reply_len < sizeof(uint32_t)); > > > > - job->reply->reply_payload_rcv_len = 0; > > > > - job->reply->result = -ENOMSG; > > > > + bsg_reply = job->reply; > > > > + bsg_reply->reply_payload_rcv_len = 0; > > > > + bsg_reply->result = -ENOMSG; > > Compiler optimization re-ordered above two lines and the first pointer > derefence is bsg_reply->result [field offset 0] where bsg_reply is NULL. > The assignment tries to write to memory at address NULL causing the kernel > page fault. > > Does your suggested change for [PATCH v3 02/16], shuffling the > job->request_len checks, address above kernel page fault? This is what I hope at least. I'm sorry but I don't have any experience with s390 and zfcp at all. I still need to get a test environment set up, but all the people knowing how to do are rather busy at the moment. All my tests on x86_64 with FCoE and lpfc haven't had a problem so far. Thanks, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ___ fcoe-devel mailing list fcoe-devel@open-fcoe.org http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel
Re: [Open-FCoE] [PATCH v2 02/16] scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
On 10/28/2016 11:53 AM, Steffen Maier wrote: > > > On 10/13/2016 06:24 PM, Johannes Thumshirn wrote: >> On Thu, Oct 13, 2016 at 05:15:25PM +0200, Steffen Maier wrote: >>> I'm puzzled. >>> >>> $ git bisect start fc_bsg master > 3087864ce3d7282f59021245d8a5f83ef1caef18 is the first bad commit commit 3087864ce3d7282f59021245d8a5f83ef1caef18 Author: Johannes Thumshirn Date: Wed Oct 12 15:06:28 2016 +0200 scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly Don't use fc_bsg_job::request and fc_bsg_job::reply directly, but use helper variables bsg_request and bsg_reply. This will be helpfull when transitioning to bsg-lib. Signed-off-by: Johannes Thumshirn :04 04 140c4b6829d5cfaec4079716e0795f63f8bc3bd2 0d9fe225615679550be91fbd9f84c09ab1e280fc Mdrivers >>> >>> From there (on the reverse bisect path) I get the following Oops, >>> except for the full patch set having another stack trace as in my >>> previous >>> mail (dying in zfcp code). >>> >> >> [...] >> >>> @@ -3937,6 +3944,7 @@ fc_bsg_request_handler(struct request_queue *q, struct Scsi_Host *shost, struct request *req; struct fc_bsg_job *job; enum fc_dispatch_result ret; +struct fc_bsg_reply *bsg_reply; if (!get_device(dev)) return; @@ -3973,8 +3981,9 @@ fc_bsg_request_handler(struct request_queue *q, struct Scsi_Host *shost, /* check if we have the msgcode value at least */ if (job->request_len < sizeof(uint32_t)) { BUG_ON(job->reply_len < sizeof(uint32_t)); -job->reply->reply_payload_rcv_len = 0; -job->reply->result = -ENOMSG; +bsg_reply = job->reply; +bsg_reply->reply_payload_rcv_len = 0; +bsg_reply->result = -ENOMSG; > > Compiler optimization re-ordered above two lines and the first pointer > derefence is bsg_reply->result [field offset 0] where bsg_reply is NULL. > The assignment tries to write to memory at address NULL causing the > kernel page fault. > I spoke to our compiler people, and they strongly believed this not to be the case. Or, put it the other way round, if such a thing would happen it would be a compiler issue. Have you checked the compiler output? Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ___ fcoe-devel mailing list fcoe-devel@open-fcoe.org http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel
Re: [Open-FCoE] [PATCH v2 02/16] scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
On Thu, Oct 13, 2016 at 05:15:25PM +0200, Steffen Maier wrote: > I'm puzzled. > > $ git bisect start fc_bsg master > Bisecting: 8 revisions left to test after this (roughly 3 steps) > [005d51510eee6102636d5dbb06310531c5d46151] scsi: fc: implement kref backed > reference counting > $ git bisect bad > Bisecting: 3 revisions left to test after this (roughly 2 steps) > [bef6da201de1bb81bb4d9511f9a155862efc251f] scsi: Unify interfaces of > fc_bsg_jobdone and bsg_job_done > $ git bisect bad > Bisecting: 1 revision left to test after this (roughly 1 step) > [3087864ce3d7282f59021245d8a5f83ef1caef18] scsi: don't use > fc_bsg_job::request and fc_bsg_job::reply directly > $ git bisect bad > Bisecting: 0 revisions left to test after this (roughly 0 steps) > [81aea44720d22d2e0c4a2613ae8b1c256ef6b0cb] scsi: Get rid of struct > fc_bsg_buffer > > $ git bisect good > > 3087864ce3d7282f59021245d8a5f83ef1caef18 is the first bad commit > > commit 3087864ce3d7282f59021245d8a5f83ef1caef18 > > Author: Johannes Thumshirn > > Date: Wed Oct 12 15:06:28 2016 +0200 > > > > scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly > > > > Don't use fc_bsg_job::request and fc_bsg_job::reply directly, but use > > helper variables bsg_request and bsg_reply. This will be helpfull when > > transitioning to bsg-lib. > > > > Signed-off-by: Johannes Thumshirn > > > > :04 04 140c4b6829d5cfaec4079716e0795f63f8bc3bd2 > > 0d9fe225615679550be91fbd9f84c09ab1e280fc M drivers > > From there (on the reverse bisect path) I get the following Oops, > except for the full patch set having another stack trace as in my previous > mail (dying in zfcp code). > [...] > > > @@ -3937,6 +3944,7 @@ fc_bsg_request_handler(struct request_queue *q, > > struct Scsi_Host *shost, > > struct request *req; > > struct fc_bsg_job *job; > > enum fc_dispatch_result ret; > > + struct fc_bsg_reply *bsg_reply; > > > > if (!get_device(dev)) > > return; > > @@ -3973,8 +3981,9 @@ fc_bsg_request_handler(struct request_queue *q, > > struct Scsi_Host *shost, > > /* check if we have the msgcode value at least */ > > if (job->request_len < sizeof(uint32_t)) { > > BUG_ON(job->reply_len < sizeof(uint32_t)); > > - job->reply->reply_payload_rcv_len = 0; > > - job->reply->result = -ENOMSG; > > + bsg_reply = job->reply; > > + bsg_reply->reply_payload_rcv_len = 0; > > + bsg_reply->result = -ENOMSG; > > job->reply_len = sizeof(uint32_t); > > fc_bsg_jobdone(job); > > spin_lock_irq(q->queue_lock); > > Ahm and what exactly can break here? It's just assigning variables. Now I'm puzzled too. I'll have to look into it tomorrow. Byte, Johannes -- Johannes Thumshirn Storage jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ___ fcoe-devel mailing list fcoe-devel@open-fcoe.org http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel
Re: [Open-FCoE] [PATCH v2 02/16] scsi: don't use fc_bsg_job::request and fc_bsg_job::reply directly
On 10/12/2016 03:06 PM, Johannes Thumshirn wrote: > Don't use fc_bsg_job::request and fc_bsg_job::reply directly, but use > helper variables bsg_request and bsg_reply. This will be helpfull when > transitioning to bsg-lib. > > Signed-off-by: Johannes Thumshirn > --- > drivers/s390/scsi/zfcp_fc.c | 9 +- > drivers/scsi/bfa/bfad_bsg.c | 40 +++--- > drivers/scsi/ibmvscsi/ibmvfc.c | 22 ++-- > drivers/scsi/libfc/fc_lport.c| 23 ++-- > drivers/scsi/lpfc/lpfc_bsg.c | 194 +--- > drivers/scsi/qla2xxx/qla_bsg.c | 264 > ++- > drivers/scsi/qla2xxx/qla_iocb.c | 5 +- > drivers/scsi/qla2xxx/qla_isr.c | 46 --- > drivers/scsi/qla2xxx/qla_mr.c| 10 +- > drivers/scsi/scsi_transport_fc.c | 37 +++--- > 10 files changed, 387 insertions(+), 263 deletions(-) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg) ___ fcoe-devel mailing list fcoe-devel@open-fcoe.org http://lists.open-fcoe.org/mailman/listinfo/fcoe-devel