Re: [PATCH V3 7/8] scsi: hpsa: improve scsi_mq performance via .host_tagset
> + /* 256 tags should be high enough to saturate device */ > + int max_queues = DIV_ROUND_UP(h->scsi_host->can_queue, 256); > + > + /* per NUMA node hw queue */ > + h->scsi_host->nr_hw_queues = min_t(int, nr_node_ids, max_queues); I don't think this magic should be in a driver. The per-node hw_queue selection seems like something we'd better do in the core code. Also the whole idea to use nr_hw_queues for just partitioning tag space on hardware that doesn't really support multiple hardware queues seems more than odd.
Re: [PATCH V3 4/8] blk-mq: introduce BLK_MQ_F_HOST_TAGS
On Tue, Feb 27, 2018 at 06:07:46PM +0800, Ming Lei wrote: > This patch can support to partition host-wide tags to multiple hw queues, > so each hw queue related data structures(tags, hctx) can be accessed in > NUMA locality way, for example, the hw queue can be per NUMA node. > > It is observed IOPS can be improved much in this way on null_blk test. null_blk isn't too interesting, so some real hardware number would be very useful here. Also the documentation should be a lot less sparse. When are we going to set this flag? What help are we going to give driver authors to guide chosing the option?
Re: [PATCH V3 3/8] blk-mq: introduce 'start_tag' field to 'struct blk_mq_tags'
Looks fine, Reviewed-by: Christoph Hellwig
Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
> +static void hpsa_setup_reply_map(struct ctlr_info *h) > +{ > + const struct cpumask *mask; > + unsigned int queue, cpu; > + > + for (queue = 0; queue < h->msix_vectors; queue++) { > + mask = pci_irq_get_affinity(h->pdev, queue); > + if (!mask) > + goto fallback; > + > + for_each_cpu(cpu, mask) > + h->reply_map[cpu] = queue; > + } > + return; > + > +fallback: > + for_each_possible_cpu(cpu) > + h->reply_map[cpu] = 0; > +} It seems a little annoying that we have to duplicate this in the driver. Wouldn't this be solved by your force_blk_mq flag and relying on the hw_ctx id?
,Your urgent confirmation
Attn: Beneficiary, We have contacted the Federal Ministry of Finance on your Behalf and they have brought a solution to your problem by coordinating your payment in total (10,000,000.00) Ten Million Dollars in an atm card which you can use to withdraw money from any ATM MACHINE CENTER anywhere in the world with a maximum of 1 Dollars daily. You now have the lawful right to claim your fund in an atm card. Since the Federal Bureau of Investigation is involved in this transaction, you have to be rest assured for this is 100% risk free it is our duty to protect the American Citizens, European Citizens, Asian Citizen. All I want you to do is to contact the atm card CENTER Via email or call the office telephone number one of the Consultant will assist you for their requirements to proceed and procure your Approval Slip on your behalf. CONTACT INFORMATION NAME: James Williams EMAIL: paymasterofficed...@gmail.com Do contact us with your details: Full name// Address// Age// Telephone Numbers// Occupation// Your Country// Bank Details// So your files would be updated after which the Delivery of your atm card will be affected to your designated home Address without any further delay and the bank will transfer your funds in total (10,000,000.00) Ten Million Dollars to your Bank account. We will reply you with the secret code (1600 atm card). We advice you get back to the payment office after you have contacted the ATM SWIFT CARD CENTER and we do await your response so we can move on with our Investigation and make sure your ATM SWIFT CARD gets to you. Best Regards James Williams Paymaster General Federal Republic Of Nigeri
Re: [PATCH 2/2] m68k/amiga - Zorro ESP: new zorro_esp.c
Tuomas, Am 06.03.2018 um 04:31 schrieb Tuomas Vainikka: >>> I think you are talking about esp->regs? For esp->dma_regs, the >>> ioremap is >>> conditional on ent->id, but the unmap is not. >> The details of the ioremap are conditional on the ID, but the fact >> that the ioremap happens (and hence esp->dma_regs is an ioremapped >> address) is not. All Zorro-3 boards have to have both their regs and >> dma_regs remapped. >> >> What's confusing is that there is only a single Zorro-3 board >> currently supported by the driver. Others will be added and I"ll use a >> switch statement to pick the appropriate address based on the ID. That >> might make it clearer. > > Fastlane might be the only Z3 SCSI board that has the chip. Good to know - I've rewritten the probe code to check for the type of board (Z2 or Z3) based on the ROM data, and make the ioremap/iounmap conditional on that. Cheers, Michael > > -Tuomas > >> >> +} >> + >> +static void zorro_esp_remove_one(struct zorro_dev *z) >> +{ >> + struct Scsi_Host *host = zorro_get_drvdata(z); >> + struct esp *esp = shost_priv(host); >> + >> + scsi_esp_unregister(esp); >> + >> + /* Disable interrupts. Perhaps use disable_irq instead ... */ >> + >> + free_irq(host->irq, esp); >> + dma_free_coherent(esp->dev, 16, >> + esp->command_block, >> + esp->command_block_dma); >> + >> + if (host->base > 0xff) { >> + iounmap(esp->dma_regs); > Do you need to test for ZORRO_PROD_PHASE5_BLIZZARD_1230_IV_1260 first? I can't - ent->id is not available here... >>> Maybe store ent->id in the private struct to get around that? >> Yes, that could be done. I still think it's not needed. >> >> Cheers, >> >>Michael >> >> >>> -- >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-m68k" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > >
[Bug 198923] Linux 4.15.4+: Write on Ext4 causes system block
https://bugzilla.kernel.org/show_bug.cgi?id=198923 Theodore Tso (ty...@mit.edu) changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |DUPLICATE --- Comment #10 from Theodore Tso (ty...@mit.edu) --- Thanks for pointing out the dup bugzilla entry. *** This bug has been marked as a duplicate of bug 198861 *** -- You are receiving this mail because: You are watching the assignee of the bug.
[Bug 198861] Regression causes kernel OOPS and hang in SCSI error report
https://bugzilla.kernel.org/show_bug.cgi?id=198861 Theodore Tso (ty...@mit.edu) changed: What|Removed |Added CC||m...@elbmurf.de --- Comment #4 from Theodore Tso (ty...@mit.edu) --- *** Bug 198923 has been marked as a duplicate of this bug. *** -- You are receiving this mail because: You are the assignee for the bug.
Re: [PATCH v3] scsi: libsas: defer ata device eh commands to libata
On Wed, Mar 7, 2018 at 6:34 PM, Jason Yanwrote: > When ata device doing EH, some commands still attached with tasks are not > passed to libata when abort failed or recover failed, so libata did not > handle these commands. After these commands done, sas task is freed, but > ata qc is not freed. This will cause ata qc leak and trigger a warning > like below: > > WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 > ata_eh_finish+0xb4/0xcc > CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 > .. > Call trace: > [] ata_eh_finish+0xb4/0xcc > [] ata_do_eh+0xc4/0xd8 > [] ata_std_error_handler+0x44/0x8c > [] ata_scsi_port_error_handler+0x480/0x694 > [] async_sas_ata_eh+0x4c/0x80 > [] async_run_entry_fn+0x4c/0x170 > [] process_one_work+0x144/0x390 > [] worker_thread+0x144/0x418 > [] kthread+0x10c/0x138 > [] ret_from_fork+0x10/0x18 > > If ata qc leaked too many, ata tag allocation will fail and io blocked > for ever. > > As suggested by Dan Williams, defer ata device commands to libata and > merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle > ata qcs correctly after this. > > Signed-off-by: Jason Yan > CC: Xiaofei Tan > CC: John Garry > CC: Dan Williams Looks good, Reviewed-by: Dan Williams
Re: [PATCH v4 01/10] dt-bindings: scsi: hisi_sas: add an property of signal attenuation
On Wed, Mar 07, 2018 at 08:25:05PM +0800, John Garry wrote: > From: Xiaofei Tan> > For some new boards with hip07 chipset we are required to > set PHY config registers differently. The hw property which > determines how to set these registers is in the PHY signal > attenuation readings. > > This patch add an devicetree property, "hisilicon,signal-attenuation", > which is used to describe the signal attenuation of an board. > > Cc: Rob Herring > Cc: Mark Rutland > Signed-off-by: Xiaofei Tan > Signed-off-by: John Garry > --- > Documentation/devicetree/bindings/scsi/hisilicon-sas.txt | 7 +++ > 1 file changed, 7 insertions(+) Reviewed-by: Rob Herring
[PATCH v3] scsi: libsas: defer ata device eh commands to libata
When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle these commands. After these commands done, sas task is freed, but ata qc is not freed. This will cause ata qc leak and trigger a warning like below: WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 ata_eh_finish+0xb4/0xcc CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 .. Call trace: [] ata_eh_finish+0xb4/0xcc [] ata_do_eh+0xc4/0xd8 [] ata_std_error_handler+0x44/0x8c [] ata_scsi_port_error_handler+0x480/0x694 [] async_sas_ata_eh+0x4c/0x80 [] async_run_entry_fn+0x4c/0x170 [] process_one_work+0x144/0x390 [] worker_thread+0x144/0x418 [] kthread+0x10c/0x138 [] ret_from_fork+0x10/0x18 If ata qc leaked too many, ata tag allocation will fail and io blocked for ever. As suggested by Dan Williams, defer ata device commands to libata and merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle ata qcs correctly after this. Signed-off-by: Jason YanCC: Xiaofei Tan CC: John Garry CC: Dan Williams --- drivers/scsi/libsas/sas_scsi_host.c | 33 + 1 file changed, 13 insertions(+), 20 deletions(-) diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 6de9681ace82..ceab5e5c41c2 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -223,6 +223,7 @@ int sas_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) { struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(cmd->device->host); + struct domain_device *dev = cmd_to_domain_dev(cmd); struct sas_task *task = TO_SAS_TASK(cmd); /* At this point, we only get called following an actual abort @@ -231,6 +232,14 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) */ sas_end_task(cmd, task); + if (dev_is_sata(dev)) { + /* defer commands to libata so that libata EH can +* handle ata qcs correctly +*/ + list_move_tail(>eh_entry, _ha->eh_ata_q); + return; + } + /* now finish the command and move it on to the error * handler done list, this also takes it off the * error handler pending list. @@ -238,22 +247,6 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) scsi_eh_finish_cmd(cmd, _ha->eh_done_q); } -static void sas_eh_defer_cmd(struct scsi_cmnd *cmd) -{ - struct domain_device *dev = cmd_to_domain_dev(cmd); - struct sas_ha_struct *ha = dev->port->ha; - struct sas_task *task = TO_SAS_TASK(cmd); - - if (!dev_is_sata(dev)) { - sas_eh_finish_cmd(cmd); - return; - } - - /* report the timeout to libata */ - sas_end_task(cmd, task); - list_move_tail(>eh_entry, >eh_ata_q); -} - static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd *my_cmd) { struct scsi_cmnd *cmd, *n; @@ -261,7 +254,7 @@ static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd list_for_each_entry_safe(cmd, n, error_q, eh_entry) { if (cmd->device->sdev_target == my_cmd->device->sdev_target && cmd->device->lun == my_cmd->device->lun) - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); } } @@ -631,12 +624,12 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * case TASK_IS_DONE: SAS_DPRINTK("%s: task 0x%p is done\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_ABORTED: SAS_DPRINTK("%s: task 0x%p is aborted\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_AT_LU: SAS_DPRINTK("task 0x%p is at LU: lu recover\n", task); @@ -647,7 +640,7 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * "recovered\n", SAS_ADDR(task->dev), cmd->device->lun); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); sas_scsi_clear_queue_lu(work_q, cmd); goto Again; } -- 2.13.6
Re: [PATCH 2/2] m68k/amiga - Zorro ESP: new zorro_esp.c
Hi Geert, fine, I'll rely on the base address and the z->rom.er_Type value to pick the correct config data. Cheers, Michael On Wed, Mar 7, 2018 at 9:06 PM, Geert Uytterhoevenwrote: > Hi Michael, > > On Wed, Mar 7, 2018 at 8:55 AM, Michael Schmitz wrote: >> OK, in that case I'll need to work out something similar to the test for >> optional SCSI function on the Blizzard 1230/1260 to find out what board >> I have when dealing with the duplicate Fastlane/Blizzard1230II ID. >> >> Is the board base address as returned by zorro_resource_start() reliable >> to distinguish between Zorro II and Zorro III boards? > > The board base address is assigned by AmigaOS based on the values in the > Expansion ROM (mainly ExpansionRom.er_Type) on the board. > More specifically, AmigaOS creates struct ConfigDev from struct ExpansionRom. > So yes, it must be reliable. > >> Am 06.03.2018 um 20:43 schrieb Geert Uytterhoeven: >>> On Tue, Mar 6, 2018 at 2:11 AM, Michael Schmitz >>> wrote: Index 1 should have been ZORRO_PROD_PHASE5_CYBERSTORM_MK_II, I've corrected that in the meantime. Fastlane / Blizzard 1230_II distinction is something I an not quite sure about - does the probe function get called twice if the device table contains the same ID twice but with different driver_data contents? >>> >>> No, the probe function gets called on the first match only. >>> Cfr. drivers/zorro/zorro-driver.c:zorro_device_probe(). > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds
Re: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset
On Wed, Mar 07, 2018 at 10:58:34PM +0530, Kashyap Desai wrote: > > > > > > Also one observation using V3 series patch. I am seeing below Affinity > > > mapping whereas I have only 72 logical CPUs. It means we are really > > > not going to use all reply queues. > > > e.a If I bind fio jobs on CPU 18-20, I am seeing only one reply queue > > > is used and that may lead to performance drop as well. > > > > If the mapping is in such shape, I guess it should be quite difficult to > figure out > > one perfect way to solve this situation because one reply queue has to > handle > > IOs submitted from 4~5 CPUs at average. > > 4.15.0-rc1 kernel has below mapping - I am not sure which commit id in " > linux_4.16-rc-host-tags-v3.2" is changing the mapping of IRQ to CPU. It I guess the mapping you posted is read from /proc/irq/126/smp_affinity. If yes, no any patch in linux_4.16-rc-host-tags-v3.2 should change IRQ affinity code, which is done in irq_create_affinity_masks(), as you saw, no any patch in linux_4.16-rc-host-tags-v3.2 touches that code. Could you simply apply the patches in linux_4.16-rc-host-tags-v3.2 against 4.15-rc1 kernel and see any difference? > will be really good if we can fall back to below mapping once again. > Current repo linux_4.16-rc-host-tags-v3.2 is giving lots of random mapping > of CPU - MSIx. And that will be problematic in performance run. > > As I posted earlier, latest repo will only allow us to use *18* reply Looks not see this report before, could you share us how you conclude that? The only patch changing reply queue is the following one: https://marc.info/?l=linux-block=151972611911593=2 But not see any issue in this patch yet, can you recover to 72 reply queues after reverting the patch in above link? > queue instead of *72*. Lots of performance related issue can be pop up on > different setup due to inconsistency in CPU - MSIx mapping. BTW, changes > in this area is intentional @" linux_4.16-rc-host-tags-v3.2". ? As you mentioned in the following link, you didn't see big performance drop with linux_4.16-rc-host-tags-v3.2, right? https://marc.info/?l=linux-block=151982993810092=2 Thanks, Ming
[PATCH] qla2xxx: Remove FC_NO_LOOP_ID for FCP and FC-NVMe Discovery
Commit 7d64c39e64310 fixed regression of FCP discovery when Nport Handle is in-use and relogin is triggered. However, during FCP and FC-NVMe discovery this resulted into only discovering NVMe LUNs. This patch fixes issue where FCP and FC-NVMe protocol is used on same port where assigning FC_NO_LOOP_ID will result into discovery failure for FCP LUNs. Fixes: a084fd68e1d26 ("scsi: qla2xxx: Fix re-login for Nport Handle in use") Signed-off-by: Himanshu Madhani--- Hi Martin, This patch fixes the FCP discovery on same port with FC-NVMe enabled. Please apply to 4.16/scsi-fixes at your earliest convenience. Thanks, Himanshu --- drivers/scsi/qla2xxx/qla_init.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index c4b04e807e12..cb182b102bfd 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1719,7 +1719,6 @@ qla24xx_handle_plogi_done_event(struct scsi_qla_host *vha, struct event_arg *ea) set_bit(ea->fcport->loop_id, vha->hw->loop_id_map); spin_lock_irqsave(>hw->tgt.sess_lock, flags); - ea->fcport->loop_id = FC_NO_LOOP_ID; ea->fcport->chip_reset = vha->hw->base_qpair->chip_reset; ea->fcport->logout_on_delete = 1; ea->fcport->send_els_logo = 0; -- 2.12.0
Re: [PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
On Wed, 2018-03-07 at 12:47 +, Tvrtko Ursulin wrote: > From: Tvrtko UrsulinFirstly, I don't see any justifiable benefit to churning this API, so why bother? but secondly this: > We can derive the order from sg->length and so do not need to pass it > in explicitly. Is wrong. I can have a length 2 scatterlist that crosses a page boundary, but I can also have one within a single page, so the order cannot be deduced from the length. James
Re: [PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
On Wed, 2018-03-07 at 17:23 +, Tvrtko Ursulin wrote: > Ok I guess my main questions are the ones from the cover letter - where > is this API going and why did it get in a bit of a funky state? Because > it doesn't look fully thought through and tested to me. Funky state? Not fully tested? Except for the error paths and upper length limits the sgl allocation and freeing functions have been tested thoroughly. > My motivation is that I would like to extend it to add > sgl_alloc_order_min_max, which takes min order and max order, and > allocates as large chunks as it can given those constraints. This is > something we have in i915 and could then drop our implementation and use > the library function. That sounds useful to me. > But I also wanted to refactor sgl_alloc_order to benefit from the > existing chained struct scatterlist allocator. But SGL API does not > embed into struct sg_table, neither it carries explicitly the number of > nents allocated, making it impossible to correctly free with existing > sg_free_table. It is on purpose that sgl_alloc_order() returns a struct scatterlist instead of any structure built on top of struct scatterlist. If you have a look at the sgl_alloc*() callers then you will see that nontrivial changes in these callers are required to make them use something else than a struct scatterlist pointer. But if you would like to rework those callers that's fine with me. I can help with reviewing the code I'm familiar with. > Also I am not sure if a single gfp argument to sgl_alloc_order is the > right thing to do. I have a feeling you either need to ignore it for > kmalloc_array, or pass in two gfp_t arguments to be used for metadata > and backing storage respectively. If there is a caller that needs this feel free to make this change. But please don't make this change before there is a caller that needs it. Thanks, Bart.
RE: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset
> > > > Also one observation using V3 series patch. I am seeing below Affinity > > mapping whereas I have only 72 logical CPUs. It means we are really > > not going to use all reply queues. > > e.a If I bind fio jobs on CPU 18-20, I am seeing only one reply queue > > is used and that may lead to performance drop as well. > > If the mapping is in such shape, I guess it should be quite difficult to figure out > one perfect way to solve this situation because one reply queue has to handle > IOs submitted from 4~5 CPUs at average. 4.15.0-rc1 kernel has below mapping - I am not sure which commit id in " linux_4.16-rc-host-tags-v3.2" is changing the mapping of IRQ to CPU. It will be really good if we can fall back to below mapping once again. Current repo linux_4.16-rc-host-tags-v3.2 is giving lots of random mapping of CPU - MSIx. And that will be problematic in performance run. As I posted earlier, latest repo will only allow us to use *18* reply queue instead of *72*. Lots of performance related issue can be pop up on different setup due to inconsistency in CPU - MSIx mapping. BTW, changes in this area is intentional @" linux_4.16-rc-host-tags-v3.2". ? irq 218, cpu list 0 irq 219, cpu list 1 irq 220, cpu list 2 irq 221, cpu list 3 irq 222, cpu list 4 irq 223, cpu list 5 irq 224, cpu list 6 irq 225, cpu list 7 irq 226, cpu list 8 irq 227, cpu list 9 irq 228, cpu list 10 irq 229, cpu list 11 irq 230, cpu list 12 irq 231, cpu list 13 irq 232, cpu list 14 irq 233, cpu list 15 irq 234, cpu list 16 irq 235, cpu list 17 irq 236, cpu list 36 irq 237, cpu list 37 irq 238, cpu list 38 irq 239, cpu list 39 irq 240, cpu list 40 irq 241, cpu list 41 irq 242, cpu list 42 irq 243, cpu list 43 irq 244, cpu list 44 irq 245, cpu list 45 irq 246, cpu list 46 irq 247, cpu list 47 irq 248, cpu list 48 irq 249, cpu list 49 irq 250, cpu list 50 irq 251, cpu list 51 irq 252, cpu list 52 irq 253, cpu list 53 irq 254, cpu list 18 irq 255, cpu list 19 irq 256, cpu list 20 irq 257, cpu list 21 irq 258, cpu list 22 irq 259, cpu list 23 irq 260, cpu list 24 irq 261, cpu list 25 irq 262, cpu list 26 irq 263, cpu list 27 irq 264, cpu list 28 irq 265, cpu list 29 irq 266, cpu list 30 irq 267, cpu list 31 irq 268, cpu list 32 irq 269, cpu list 33 irq 270, cpu list 34 irq 271, cpu list 35 irq 272, cpu list 54 irq 273, cpu list 55 irq 274, cpu list 56 irq 275, cpu list 57 irq 276, cpu list 58 irq 277, cpu list 59 irq 278, cpu list 60 irq 279, cpu list 61 irq 280, cpu list 62 irq 281, cpu list 63 irq 282, cpu list 64 irq 283, cpu list 65 irq 284, cpu list 66 irq 285, cpu list 67 irq 286, cpu list 68 irq 287, cpu list 69 irq 288, cpu list 70 irq 289, cpu list 71 > > The application should have the knowledge to avoid this kind of usage. > > > Thanks, > Ming
Re: [PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
On 07/03/18 16:23, Bart Van Assche wrote: On Wed, 2018-03-07 at 12:47 +, Tvrtko Ursulin wrote: We can derive the order from sg->length and so do not need to pass it in explicitly. Rename the function to sgl_free_n. Using get_order() to compute the order looks fine to me but this patch will have to rebased in order to address the comments on the previous patches. Ok I guess my main questions are the ones from the cover letter - where is this API going and why did it get in a bit of a funky state? Because it doesn't look fully thought through and tested to me. My motivation is that I would like to extend it to add sgl_alloc_order_min_max, which takes min order and max order, and allocates as large chunks as it can given those constraints. This is something we have in i915 and could then drop our implementation and use the library function. But I also wanted to refactor sgl_alloc_order to benefit from the existing chained struct scatterlist allocator. But SGL API does not embed into struct sg_table, neither it carries explicitly the number of nents allocated, making it impossible to correctly free with existing sg_free_table. Another benefit of using the existing sg allocator would be that for large allocation you don't depend on the availability of contiguous chunks like you do with kmalloc_array. For instance if in another reply you mentioned 4GiB allocations are a possibility. If you use order 0 that means you need 1M nents, which can be something like 32 bytes each and you need a 32MiB kmalloc for the nents array and thats quite big. If you would be able to reuse the existing sg_alloc_table infrastructure (I have patches which extract it if you don't want to deal with struct sg_table), you would benefit from PAGE_SIZE allocations. Also I am not sure if a single gfp argument to sgl_alloc_order is the right thing to do. I have a feeling you either need to ignore it for kmalloc_array, or pass in two gfp_t arguments to be used for metadata and backing storage respectively. So I have many questions regarding the current state and future direction, but essentially would like to make it usable for other drivers, like i915, as well. Regards, Tvrtko
Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"
On 2018-03-07 09:02 AM, Menion wrote: 2018-03-07 14:51 GMT+01:00 Steffen Maier <ma...@linux.vnet.ibm.com>: On 03/07/2018 09:24 AM, Menion wrote: ... but from then on, you only get it roughly once every 300 seconds, i.e. 5 minutes that's where I suspect user space as trigger, unless there is a kernel feature I'm not aware of doing such sdev rescans preventing this would be a workaround Is it possible that it is smartd? It is the only daemon that could do some low level access to the device (bypassing the filesystem) If you wait about 5 hours from the time of this post then go to the smartmontools mirror at: https://github.com/mirror/smartmontools To check it is the revision (svn rev >= 4718) you need for this fix, look at the top of the ChangeLog file and look for today's date (20180307). Assuming it is there, clone it then try to build smartmontools ( './autogen.sh ; ./configure ; make install') and try the new smartd ***. You should get one warning per 8 TB device (for each run of smartd) and no more. Currently smartmontools only has a quirks database (and it is large) for ATA devices, not real or pseudo SCSI device, nor NVMe devices (yet). Hopefully this fix will be sufficient. If it does not work, please send me the details. Doug Gilbert *** without a --prefix=/usr/sbin or similar option to .configure I think that smartd will be placed in /usr/local/sbin which may be different from where your distro places it. Your PATH will determine which one is used. /* * Many devices do not respond properly to READ_CAPACITY_16. * Tell the SCSI layer to try READ_CAPACITY_10 first. * However some USB 3.0 drive enclosures return capacity * modulo 2TB. Those must use READ_CAPACITY_16 */ if (!(us->fflags & US_FL_NEEDS_CAP16)) sdev->try_rc_10_first = 1; if that's the cause, maybe an entry in drivers/usb/storage/unusual_devs.h would help, but that's really just guessing as I'm not familiar with USB It seems that the bridge does have an entry in unusual_devs.h: /* Reported by Michael Büsch <m...@bues.ch> */ UNUSUAL_DEV( 0x152d, 0x0567, 0x0114, 0x0116, "JMicron", "USB to ATA/ATAPI Bridge", USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_BROKEN_FUA ), VID:PID is 0x152d 0x0567, not sure what are the other two numbers, so I went back and used another enclosure with same USB to SATA bridge. The strange thing is that this other enclosure goes in UAS mode while the one for which I am reporting the issue goes in usb-storage mode because it gets somehow the quirks 0x5000 Unfortunately I cannot move these 5 HDDs in the other enclosure. So do you think that it shall be reported to linux-usb maybe?
[Bug 198861] Regression causes kernel OOPS and hang in SCSI error report
https://bugzilla.kernel.org/show_bug.cgi?id=198861 Bart Van Assche (bvanass...@acm.org) changed: What|Removed |Added CC||bvanass...@acm.org --- Comment #3 from Bart Van Assche (bvanass...@acm.org) --- This has been fixed by the following commit, which is in James' tree but not yet in Linus' tree nor any of the stable trees: 3be8828fc507 ("scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops"). See also https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?h=fixes=3be8828fc507cdafe7040a3dcf361a2bcd8e305b. A pull request that includes that commit has been sent to Linus. See also https://lkml.org/lkml/2018/3/6/767. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 198923] Linux 4.15.4+: Write on Ext4 causes system block
https://bugzilla.kernel.org/show_bug.cgi?id=198923 Bart Van Assche (bvanass...@acm.org) changed: What|Removed |Added CC||bvanass...@acm.org --- Comment #9 from Bart Van Assche (bvanass...@acm.org) --- Please close this bugzilla entry as a duplicate of #198861. -- You are receiving this mail because: You are watching the assignee of the bug.
[Bug 198923] Linux 4.15.4+: Write on Ext4 causes system block
https://bugzilla.kernel.org/show_bug.cgi?id=198923 Theodore Tso (ty...@mit.edu) changed: What|Removed |Added Status|REOPENED|ASSIGNED Component|ext4|Other Assignee|fs_e...@kernel-bugs.osdl.or |scsi_drivers-other@kernel-b |g |ugs.osdl.org Product|File System |SCSI Drivers --- Comment #8 from Theodore Tso (ty...@mit.edu) --- The people you need to contact are on the linux-scsi@vger.kernel.org mailing list. They are extremely unlikely to be following this bugzilla issue. You'll need to ping the mailing list thread on the linux-scsi mailing list. Or contact the two SCSI maintainers: SCSI SUBSYSTEM M: "James E.J. Bottomley"T: git git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git M: "Martin K. Petersen" T: git git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git Or try contacting the patch author in question, Bart Van Assche . There's not much we (the ext4 developers) can do here to help. -- You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v2] scsi: libsas: defer ata device eh commands to libata
On 07/03/2018 09:18, Jack Wang wrote: 2018-03-07 8:47 GMT+01:00 Jason Yan: When ata device doing EH, some commands still attached with tasks are not passed to libata when abort failed or recover failed, so libata did not handle these commands. After these commands done, sas task is freed, but ata qc is not freed. This will cause ata qc leak and trigger a warning like below: WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 ata_eh_finish+0xb4/0xcc CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 .. Call trace: [] ata_eh_finish+0xb4/0xcc [] ata_do_eh+0xc4/0xd8 [] ata_std_error_handler+0x44/0x8c [] ata_scsi_port_error_handler+0x480/0x694 [] async_sas_ata_eh+0x4c/0x80 [] async_run_entry_fn+0x4c/0x170 [] process_one_work+0x144/0x390 [] worker_thread+0x144/0x418 [] kthread+0x10c/0x138 [] ret_from_fork+0x10/0x18 If ata qc leaked too many, ata tag allocation will fail and io blocked for ever. As suggested by Dan Williams, defer ata device commands to libata and merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle ata qcs correctly after this. Signed-off-by: Jason Yan CC: Xiaofei Tan CC: John Garry CC: Dan Williams Hi Jason, Would be good if you add the Fixes tag, so distribution could pick it up easily. And wasn't the idea to delete sas_eh_finish_cmd instead of sas_eh_defer_cmd? The problem was that we need to defer any ATA command. So the solution was to just delete sas_eh_defer_cmd() and make sas_eh_finish_cmd() do what needs to be done for either ATA or non-ATA task. Thanks, John Thanks, Jack --- drivers/scsi/libsas/sas_scsi_host.c | 30 ++ 1 file changed, 10 insertions(+), 20 deletions(-) diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 6de9681ace82..fd76436b289c 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -223,6 +223,7 @@ int sas_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) { struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(cmd->device->host); + struct domain_device *dev = cmd_to_domain_dev(cmd); struct sas_task *task = TO_SAS_TASK(cmd); /* At this point, we only get called following an actual abort @@ -231,6 +232,11 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) */ sas_end_task(cmd, task); + if (dev_is_sata(dev)) { + list_move_tail(>eh_entry, _ha->eh_ata_q); + return; + } + /* now finish the command and move it on to the error * handler done list, this also takes it off the * error handler pending list. @@ -238,22 +244,6 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) scsi_eh_finish_cmd(cmd, _ha->eh_done_q); } -static void sas_eh_defer_cmd(struct scsi_cmnd *cmd) -{ - struct domain_device *dev = cmd_to_domain_dev(cmd); - struct sas_ha_struct *ha = dev->port->ha; - struct sas_task *task = TO_SAS_TASK(cmd); - - if (!dev_is_sata(dev)) { - sas_eh_finish_cmd(cmd); - return; - } - - /* report the timeout to libata */ - sas_end_task(cmd, task); - list_move_tail(>eh_entry, >eh_ata_q); -} - static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd *my_cmd) { struct scsi_cmnd *cmd, *n; @@ -261,7 +251,7 @@ static void sas_scsi_clear_queue_lu(struct list_head *error_q, struct scsi_cmnd list_for_each_entry_safe(cmd, n, error_q, eh_entry) { if (cmd->device->sdev_target == my_cmd->device->sdev_target && cmd->device->lun == my_cmd->device->lun) - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); } } @@ -631,12 +621,12 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * case TASK_IS_DONE: SAS_DPRINTK("%s: task 0x%p is done\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_ABORTED: SAS_DPRINTK("%s: task 0x%p is aborted\n", __func__, task); - sas_eh_defer_cmd(cmd); + sas_eh_finish_cmd(cmd); continue; case TASK_IS_AT_LU: SAS_DPRINTK("task 0x%p is at LU: lu recover\n", task); @@ -647,7 +637,7 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head * "recovered\n", SAS_ADDR(task->dev),
Re: [PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
On Wed, 2018-03-07 at 12:47 +, Tvrtko Ursulin wrote: > We can derive the order from sg->length and so do not need to pass it in > explicitly. Rename the function to sgl_free_n. Using get_order() to compute the order looks fine to me but this patch will have to rebased in order to address the comments on the previous patches. Thanks, Bart.
Re: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset
On Wed, Mar 07, 2018 at 08:31:31PM +0530, Kashyap Desai wrote: > > -Original Message- > > From: Ming Lei [mailto:ming@redhat.com] > > Sent: Wednesday, March 7, 2018 10:58 AM > > To: Kashyap Desai > > Cc: Jens Axboe; linux-bl...@vger.kernel.org; Christoph Hellwig; Mike > Snitzer; > > linux-scsi@vger.kernel.org; Hannes Reinecke; Arun Easi; Omar Sandoval; > > Martin K . Petersen; James Bottomley; Christoph Hellwig; Don Brace; > Peter > > Rivera; Laurence Oberman > > Subject: Re: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance > via > > .host_tagset > > > > On Wed, Feb 28, 2018 at 08:28:48PM +0530, Kashyap Desai wrote: > > > Ming - > > > > > > Quick testing on my setup - Performance slightly degraded (4-5% > > > drop)for megaraid_sas driver with this patch. (From 1610K IOPS it goes > > > to 1544K) I confirm that after applying this patch, we have #queue = > #numa > > node. > > > > > > ls -l > > > > > > /sys/devices/pci:80/:80:02.0/:83:00.0/host10/target10:2:23/10: > > > 2:23:0/block/sdy/mq > > > total 0 > > > drwxr-xr-x. 18 root root 0 Feb 28 09:53 0 drwxr-xr-x. 18 root root 0 > > > Feb 28 09:53 1 > > > > > > > > > I would suggest to skip megaraid_sas driver changes using > > > shared_tagset until and unless there is obvious gain. If overall > > > interface of using shared_tagset is commit in kernel tree, we will > > > investigate (megaraid_sas > > > driver) in future about real benefit of using it. > > > > Hi Kashyap, > > > > Now I have put patches for removing operating on scsi_host->host_busy in > > V4[1], especially which are done in the following 3 patches: > > > > 9221638b9bc9 scsi: avoid to hold host_busy for scsi_mq > > 1ffc8c0ffbe4 scsi: read host_busy via scsi_host_busy() > > e453d3983243 scsi: introduce scsi_host_busy() > > > > > > Could you run your test on V4 and see if IOPS can be improved on > > megaraid_sas? > > > > > > [1] https://github.com/ming1/linux/commits/v4.16-rc-host-tags-v4 > > I will be doing testing soon. Today I revisit your previous perf trace too, seems the following samples take a bit more CPU: 4.64% [megaraid_sas] [k] complete_cmd_fusion ... 2.22% [megaraid_sas] [k] megasas_build_io_fusion ... 1.33% [megaraid_sas] [k] megasas_build_and_issue_cmd_fusion But V4 should get a bit improvement in theory. And if some host-wide resource of megaraid_sas can be partitioned to per-node hw queue, I guess some of improvement can be got too. > > BTW - Performance impact is due below patch only - > "[PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via > .host_tagset" > > Below patch is really needed - > "[PATCH V3 2/8] scsi: megaraid_sas: fix selection of reply queue" > > I am currently doing review on my setup. I think above patch is fixing > real issue of performance (for megaraid_sas) as driver may not be sending > IO to optimal reply queue. The ideal way is to map reply queue to blk-mq's hw queue, but seems SCSI/driver's IO path is too slow so that high enough hw queue depth(from device internal view, for example 256) still can't reach good performance, as you observed. > Having CPU to MSIx mapping will solve that. Megaraid_sas driver always > create max MSIx as min (online CPU, # MSIx HW support). > I will do more review and testing for that particular patch as well. OK, thanks! > > Also one observation using V3 series patch. I am seeing below Affinity > mapping whereas I have only 72 logical CPUs. It means we are really not > going to use all reply queues. > e.a If I bind fio jobs on CPU 18-20, I am seeing only one reply queue is > used and that may lead to performance drop as well. If the mapping is in such shape, I guess it should be quite difficult to figure out one perfect way to solve this situation because one reply queue has to handle IOs submitted from 4~5 CPUs at average. The application should have the knowledge to avoid this kind of usage. Thanks, Ming
Re: [PATCH v2] scsi: libsas: defer ata device eh commands to libata
On Tue, Mar 6, 2018 at 11:47 PM, Jason Yanwrote: > When ata device doing EH, some commands still attached with tasks are not > passed to libata when abort failed or recover failed, so libata did not > handle these commands. After these commands done, sas task is freed, but > ata qc is not freed. This will cause ata qc leak and trigger a warning > like below: > > WARNING: CPU: 0 PID: 28512 at drivers/ata/libata-eh.c:4037 > ata_eh_finish+0xb4/0xcc > CPU: 0 PID: 28512 Comm: kworker/u32:2 Tainted: G W OE 4.14.0#1 > .. > Call trace: > [] ata_eh_finish+0xb4/0xcc > [] ata_do_eh+0xc4/0xd8 > [] ata_std_error_handler+0x44/0x8c > [] ata_scsi_port_error_handler+0x480/0x694 > [] async_sas_ata_eh+0x4c/0x80 > [] async_run_entry_fn+0x4c/0x170 > [] process_one_work+0x144/0x390 > [] worker_thread+0x144/0x418 > [] kthread+0x10c/0x138 > [] ret_from_fork+0x10/0x18 > > If ata qc leaked too many, ata tag allocation will fail and io blocked > for ever. > > As suggested by Dan Williams, defer ata device commands to libata and > merge sas_eh_finish_cmd() with sas_eh_defer_cmd(). libata will handle > ata qcs correctly after this. > > Signed-off-by: Jason Yan > CC: Xiaofei Tan > CC: John Garry > CC: Dan Williams > --- > drivers/scsi/libsas/sas_scsi_host.c | 30 ++ > 1 file changed, 10 insertions(+), 20 deletions(-) > > diff --git a/drivers/scsi/libsas/sas_scsi_host.c > b/drivers/scsi/libsas/sas_scsi_host.c > index 6de9681ace82..fd76436b289c 100644 > --- a/drivers/scsi/libsas/sas_scsi_host.c > +++ b/drivers/scsi/libsas/sas_scsi_host.c > @@ -223,6 +223,7 @@ int sas_queuecommand(struct Scsi_Host *host, struct > scsi_cmnd *cmd) > static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) > { > struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(cmd->device->host); > + struct domain_device *dev = cmd_to_domain_dev(cmd); > struct sas_task *task = TO_SAS_TASK(cmd); > > /* At this point, we only get called following an actual abort > @@ -231,6 +232,11 @@ static void sas_eh_finish_cmd(struct scsi_cmnd *cmd) > */ > sas_end_task(cmd, task); This patch looks good, just need a comment here about why it is ok to short circuit this routine in the dev_is_sata() case. > > + if (dev_is_sata(dev)) { > + list_move_tail(>eh_entry, _ha->eh_ata_q); > + return; > + } > +
RE: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset
> -Original Message- > From: Ming Lei [mailto:ming@redhat.com] > Sent: Wednesday, March 7, 2018 10:58 AM > To: Kashyap Desai > Cc: Jens Axboe; linux-bl...@vger.kernel.org; Christoph Hellwig; Mike Snitzer; > linux-scsi@vger.kernel.org; Hannes Reinecke; Arun Easi; Omar Sandoval; > Martin K . Petersen; James Bottomley; Christoph Hellwig; Don Brace; Peter > Rivera; Laurence Oberman > Subject: Re: [PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via > .host_tagset > > On Wed, Feb 28, 2018 at 08:28:48PM +0530, Kashyap Desai wrote: > > Ming - > > > > Quick testing on my setup - Performance slightly degraded (4-5% > > drop)for megaraid_sas driver with this patch. (From 1610K IOPS it goes > > to 1544K) I confirm that after applying this patch, we have #queue = #numa > node. > > > > ls -l > > > /sys/devices/pci:80/:80:02.0/:83:00.0/host10/target10:2:23/10: > > 2:23:0/block/sdy/mq > > total 0 > > drwxr-xr-x. 18 root root 0 Feb 28 09:53 0 drwxr-xr-x. 18 root root 0 > > Feb 28 09:53 1 > > > > > > I would suggest to skip megaraid_sas driver changes using > > shared_tagset until and unless there is obvious gain. If overall > > interface of using shared_tagset is commit in kernel tree, we will > > investigate (megaraid_sas > > driver) in future about real benefit of using it. > > Hi Kashyap, > > Now I have put patches for removing operating on scsi_host->host_busy in > V4[1], especially which are done in the following 3 patches: > > 9221638b9bc9 scsi: avoid to hold host_busy for scsi_mq > 1ffc8c0ffbe4 scsi: read host_busy via scsi_host_busy() > e453d3983243 scsi: introduce scsi_host_busy() > > > Could you run your test on V4 and see if IOPS can be improved on > megaraid_sas? > > > [1] https://github.com/ming1/linux/commits/v4.16-rc-host-tags-v4 I will be doing testing soon. BTW - Performance impact is due below patch only - "[PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset" Below patch is really needed - "[PATCH V3 2/8] scsi: megaraid_sas: fix selection of reply queue" I am currently doing review on my setup. I think above patch is fixing real issue of performance (for megaraid_sas) as driver may not be sending IO to optimal reply queue. Having CPU to MSIx mapping will solve that. Megaraid_sas driver always create max MSIx as min (online CPU, # MSIx HW support). I will do more review and testing for that particular patch as well. Also one observation using V3 series patch. I am seeing below Affinity mapping whereas I have only 72 logical CPUs. It means we are really not going to use all reply queues. e.a If I bind fio jobs on CPU 18-20, I am seeing only one reply queue is used and that may lead to performance drop as well. PCI name is 86:00.0, dump its irq affinity: irq 218, cpu list 0-2,36-37 irq 219, cpu list 3-5,39-40 irq 220, cpu list 6-8,42-43 irq 221, cpu list 9-11,45-46 irq 222, cpu list 12-13,48-49 irq 223, cpu list 14-15,50-51 irq 224, cpu list 16-17,52-53 irq 225, cpu list 38,41,44,47 irq 226, cpu list 72,74,76,78 irq 227, cpu list 80,82,84,86 irq 228, cpu list 88,90,92,94 irq 229, cpu list 96,98,100,102 irq 230, cpu list 104,106,108,110 irq 231, cpu list 112,114,116,118 irq 232, cpu list 120,122,124,126 irq 233, cpu list 128,130,132,134 irq 234, cpu list 136,138,140,142 irq 235, cpu list 144,146,148,150 irq 236, cpu list 152,154,156,158 irq 237, cpu list 160,162,164,166 irq 238, cpu list 168,170,172,174 irq 239, cpu list 176,178,180,182 irq 240, cpu list 184,186,188,190 irq 241, cpu list 192,194,196,198 irq 242, cpu list 200,202,204,206 irq 243, cpu list 208,210,212,214 irq 244, cpu list 216,218,220,222 irq 245, cpu list 224,226,228,230 irq 246, cpu list 232,234,236,238 irq 247, cpu list 240,242,244,246 irq 248, cpu list 248,250,252,254 irq 249, cpu list 256,258,260,262 irq 250, cpu list 264,266,268,270 irq 251, cpu list 272,274,276,278 irq 252, cpu list 280,282,284,286 irq 253, cpu list 288,290,292,294 irq 254, cpu list 18-20,54-55 irq 255, cpu list 21-23,57-58 irq 256, cpu list 24-26,60-61 irq 257, cpu list 27-29,63-64 irq 258, cpu list 30-31,66-67 irq 259, cpu list 32-33,68-69 irq 260, cpu list 34-35,70-71 irq 261, cpu list 56,59,62,65 irq 262, cpu list 73,75,77,79 irq 263, cpu list 81,83,85,87 irq 264, cpu list 89,91,93,95 irq 265, cpu list 97,99,101,103 irq 266, cpu list 105,107,109,111 irq 267, cpu list 113,115,117,119 irq 268, cpu list 121,123,125,127 irq 269, cpu list 129,131,133,135 irq 270, cpu list 137,139,141,143 irq 271, cpu list 145,147,149,151 irq 272, cpu list 153,155,157,159 irq 273, cpu list 161,163,165,167 irq 274, cpu list 169,171,173,175 irq 275, cpu list 177,179,181,183 irq 276, cpu list 185,187,189,191 irq 277, cpu list 193,195,197,199 irq 278, cpu list 201,203,205,207 irq 279, cpu list 209,211,213,215 irq 280, cpu list 217,219,221,223 irq 281, cpu list 225,227,229,231 irq 282, cpu list 233,235,237,239 irq 283, cpu list 241,243,245,247 irq 284, cpu list 249,251,253,255 irq 285, cpu list
Re: [PATCH V3 1/8] scsi: hpsa: fix selection of reply queue
On Tue, 2018-03-06 at 14:24 -0500, Martin K. Petersen wrote: > Ming, > > > Given both Don and Laurence have verified that patch 1 and patch 2 > > does fix IO hang, could you consider to merge the two first? > > Oh, and I would still need a formal Acked-by: from Don and Tested-by: > from Laurence. > > Also, for 4.16/scsi-fixes I would prefer verification to be done with > just patch 1/8 and none of the subsequent changes in place. Just to > make > sure we're testing the right thing. > > Thanks! > Hello Martin I tested just Patch 1/8 from the V3 series. No issues running workload and no issues booting on the DL380G7. Don can you ack this so we can at least get this one in. Against: 4.16.0-rc4.v31of8+ on an x86_64 Tested-by: Laurence ObermanThanks Laurence
Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"
2018-03-07 14:51 GMT+01:00 Steffen Maier: > > On 03/07/2018 09:24 AM, Menion wrote: >> > ... > > but from then on, you only get it roughly once every 300 seconds, i.e. 5 > minutes > > that's where I suspect user space as trigger, unless there is a kernel > feature I'm not aware of doing such sdev rescans > > preventing this would be a workaround > Is it possible that it is smartd? It is the only daemon that could do some low level access to the device (bypassing the filesystem) > >> /* >> * Many devices do not respond properly to >> READ_CAPACITY_16. >> * Tell the SCSI layer to try READ_CAPACITY_10 first. >> * However some USB 3.0 drive enclosures return capacity >> * modulo 2TB. Those must use READ_CAPACITY_16 >> */ >> if (!(us->fflags & US_FL_NEEDS_CAP16)) >> sdev->try_rc_10_first = 1; > > > if that's the cause, maybe an entry in drivers/usb/storage/unusual_devs.h > would help, but that's really just guessing as I'm not familiar with USB > It seems that the bridge does have an entry in unusual_devs.h: /* Reported by Michael Büsch */ UNUSUAL_DEV( 0x152d, 0x0567, 0x0114, 0x0116, "JMicron", "USB to ATA/ATAPI Bridge", USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_BROKEN_FUA ), VID:PID is 0x152d 0x0567, not sure what are the other two numbers, so I went back and used another enclosure with same USB to SATA bridge. The strange thing is that this other enclosure goes in UAS mode while the one for which I am reporting the issue goes in usb-storage mode because it gets somehow the quirks 0x5000 Unfortunately I cannot move these 5 HDDs in the other enclosure. So do you think that it shall be reported to linux-usb maybe?
Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"
On 03/07/2018 09:24 AM, Menion wrote: By flooded I mean that it continously fill the dmesg log with no interruption, check attached a log that I have just taken from my server Some more details on my setup. I have these 5 HDD, WD RED 8TB in an Orico 5 bay enclosure, running JMS567 USBtoSATA bridge and an internal SATA multiplexer This is connected to the USB 3.0 host port of my server, it is an Intel Atom 2018-03-07 3:45 GMT+01:00 Martin K. Petersen: Also, what kind of controller are these disks attached to? The reason you see these messages is that to the kernel it looks like a legacy disk device that predates capacities in the TB range. The warnings are logged because we're surprised to be going down this path based on what the device has previously told us. Of course Martin's statement regarding the occurrence holds true. It does not look like continuously flooding, but rather like a repetition at some not even high frequency. Do you have some user space periodically performing SCSI target or SCSI device rescans? Each repetition is per drive, i.e. a junk of 5 messages in your case. [4.929517] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). first occurrence after initial probing [4.933893] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [4.946474] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). looks like we go through the code path more than once during initial probing [ 99.057592] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [ 409.335119] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [ 719.760106] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [ 1018.089562] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). [ 1328.086120] sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16). ... but from then on, you only get it roughly once every 300 seconds, i.e. 5 minutes that's where I suspect user space as trigger, unless there is a kernel feature I'm not aware of doing such sdev rescans preventing this would be a workaround assuming the Linux check is correct, the proper fix might be that the device should present itself according to standards such that Linux silently uses READ CAPACITY(16) in the first place static int sd_try_rc16_first(struct scsi_device *sdp) { if (sdp->host->max_cmd_len < 16) return 0; option if (sdp->try_rc_10_first) return 0; option if (sdp->scsi_level > SCSI_SPC_2) return 1; if (scsi_device_protection(sdp)) return 1; return 0; option } just picking one arbitrary option and not being entirely sure that's the code path but you mentioned USB to SATA bridge, it might be related to: *** drivers/usb/storage/scsiglue.c: slave_configure[239] sdev->try_rc_10_first = 1; /* * Many devices do not respond properly to READ_CAPACITY_16. * Tell the SCSI layer to try READ_CAPACITY_10 first. * However some USB 3.0 drive enclosures return capacity * modulo 2TB. Those must use READ_CAPACITY_16 */ if (!(us->fflags & US_FL_NEEDS_CAP16)) sdev->try_rc_10_first = 1; if that's the cause, maybe an entry in drivers/usb/storage/unusual_devs.h would help, but that's really just guessing as I'm not familiar with USB -- Mit freundlichen Grüßen / Kind regards Steffen Maier Linux on z Systems Development IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschaeftsfuehrung: Dirk Wittkopp Sitz der Gesellschaft: Boeblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
[bug report] scsi: lpfc: Add WQ Full Logic for NVME Target
Hello James Smart, The patch 6e8e1c14c61e: "scsi: lpfc: Add WQ Full Logic for NVME Target" from Jan 30, 2018, leads to the following static checker warning: drivers/scsi/lpfc/lpfc_nvmet.c:921 lpfc_nvmet_xmt_fcp_abort() warn: inconsistent returns 'ctxp->ctxlock'. drivers/scsi/lpfc/lpfc_nvmet.c 889 atomic_inc(_nvmep->xmt_fcp_abort); 890 891 spin_lock_irqsave(>ctxlock, flags); 892 ctxp->state = LPFC_NVMET_STE_ABORT; 893 894 /* Since iaab/iaar are NOT set, we need to check 895 * if the firmware is in process of aborting IO 896 */ 897 if (ctxp->flag & LPFC_NVMET_XBUSY) { 898 spin_unlock_irqrestore(>ctxlock, flags); 899 return; 900 } 901 ctxp->flag |= LPFC_NVMET_ABORT_OP; 902 903 if (ctxp->flag & LPFC_NVMET_DEFER_WQFULL) { 904 lpfc_nvmet_unsol_fcp_issue_abort(phba, ctxp, ctxp->sid, 905 ctxp->oxid); 906 wq = phba->sli4_hba.nvme_wq[ctxp->wqeq->hba_wqidx]; 907 lpfc_nvmet_wqfull_flush(phba, wq, ctxp); 908 return; ^^^ Missing unlock before this return. 909 } 910 911 /* An state of LPFC_NVMET_STE_RCV means we have just received 912 * the NVME command and have not started processing it. 913 * (by issuing any IO WQEs on this exchange yet) 914 */ 915 if (ctxp->state == LPFC_NVMET_STE_RCV) 916 lpfc_nvmet_unsol_fcp_issue_abort(phba, ctxp, ctxp->sid, 917 ctxp->oxid); 918 else 919 lpfc_nvmet_sol_fcp_issue_abort(phba, ctxp, ctxp->sid, 920 ctxp->oxid); 921 spin_unlock_irqrestore(>ctxlock, flags); 922 } regards, dan carpenter
[PATCH 6/6] lib/scatterlist: Drop order argument from sgl_free_n_order
From: Tvrtko UrsulinWe can derive the order from sg->length and so do not need to pass it in explicitly. Rename the function to sgl_free_n. Signed-off-by: Tvrtko Ursulin Cc: Bart Van Assche Cc: Hannes Reinecke Cc: Johannes Thumshirn Cc: Jens Axboe Cc: "Nicholas A. Bellinger" Cc: linux-scsi@vger.kernel.org Cc: target-de...@vger.kernel.org --- drivers/target/target_core_transport.c | 2 +- include/linux/scatterlist.h| 5 ++--- lib/scatterlist.c | 16 ++-- 3 files changed, 9 insertions(+), 14 deletions(-) diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 4558f2e1fe1b..91e8f4047492 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -2303,7 +2303,7 @@ static void target_complete_ok_work(struct work_struct *work) void target_free_sgl(struct scatterlist *sgl, int nents) { - sgl_free_n_order(sgl, nents, 0); + sgl_free_n(sgl, nents); } EXPORT_SYMBOL(target_free_sgl); diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 3ffc5f3bf181..3779d1fdd5c4 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -280,8 +280,7 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, struct scatterlist *sgl_alloc_order(unsigned long length, unsigned int order, bool chainable, gfp_t gfp, unsigned int *nent_p); -void sgl_free_n_order(struct scatterlist *sgl, unsigned int nents, - unsigned int order); +void sgl_free_n(struct scatterlist *sgl, unsigned int nents); /** * sgl_alloc - allocate a scatterlist and its pages @@ -303,7 +302,7 @@ sgl_alloc(unsigned long length, gfp_t gfp, unsigned int *nent_p) */ static inline void sgl_free(struct scatterlist *sgl) { - sgl_free_n_order(sgl, UINT_MAX, 0); + sgl_free_n(sgl, UINT_MAX); } #endif /* CONFIG_SGL_ALLOC */ diff --git a/lib/scatterlist.c b/lib/scatterlist.c index c637849482d3..76111e91a038 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -493,7 +493,7 @@ struct scatterlist *sgl_alloc_order(unsigned long length, unsigned int order, { unsigned int chunk_len = PAGE_SIZE << order; struct scatterlist *sgl, *sg; - unsigned int nent, i; + unsigned int nent; nent = round_up(length, chunk_len) >> (PAGE_SHIFT + order); @@ -517,12 +517,11 @@ struct scatterlist *sgl_alloc_order(unsigned long length, unsigned int order, sg_init_table(sgl, nent); sg = sgl; - i = 0; while (length) { struct page *page = alloc_pages(gfp, order); if (!page) { - sgl_free_n_order(sgl, i, order); + sgl_free(sgl); return NULL; } @@ -530,7 +529,6 @@ struct scatterlist *sgl_alloc_order(unsigned long length, unsigned int order, sg_set_page(sg, page, chunk_len, 0); length -= chunk_len; sg = sg_next(sg); - i++; } WARN_ONCE(length, "length = %ld\n", length); return sgl; @@ -538,10 +536,9 @@ struct scatterlist *sgl_alloc_order(unsigned long length, unsigned int order, EXPORT_SYMBOL(sgl_alloc_order); /** - * sgl_free_n_order - free a scatterlist and its pages + * sgl_free_n - free a scatterlist and its pages * @sgl: Scatterlist with one or more elements * @nents: Maximum number of elements to free - * @order: Second argument for __free_pages() * * Notes: * - If several scatterlists have been chained and each chain element is @@ -550,8 +547,7 @@ EXPORT_SYMBOL(sgl_alloc_order); * - All pages in a chained scatterlist can be freed at once by setting @nents * to a high number. */ -void sgl_free_n_order(struct scatterlist *sgl, unsigned int nents, - unsigned int order) +void sgl_free_n(struct scatterlist *sgl, unsigned int nents) { struct scatterlist *sg; struct page *page; @@ -562,11 +558,11 @@ void sgl_free_n_order(struct scatterlist *sgl, unsigned int nents, break; page = sg_page(sg); if (page) - __free_pages(page, order); + __free_pages(page, get_order(sg->length)); } kfree(sgl); } -EXPORT_SYMBOL(sgl_free_n_order); +EXPORT_SYMBOL(sgl_free_n); #endif /* CONFIG_SGL_ALLOC */ -- 2.14.1
[PATCH] scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled
iscsi tcp will first send out data, then calculate and send data digest. If we don't have BDI_CAP_STABLE_WRITES, the page cache will be written in spite of the on going writeback. Consequently, wrong digest will be got and sent to target. To fix this, set BDI_CAP_STABLE_WRITES when data digest is enabled in iscsi_tcp .slave_configure callback. Signed-off-by: Jianchao Wang--- drivers/scsi/iscsi_tcp.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 6198559..261c686 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -954,6 +955,12 @@ static int iscsi_sw_tcp_slave_alloc(struct scsi_device *sdev) static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev) { + struct iscsi_sw_tcp_host *tcp_sw_host = iscsi_host_priv(sdev->host); + struct iscsi_session *session = tcp_sw_host->session; + struct iscsi_conn *conn = session->leadconn; + + if (conn->datadgst_en) + sdev->request_queue->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES; blk_queue_bounce_limit(sdev->request_queue, BLK_BOUNCE_ANY); blk_queue_dma_alignment(sdev->request_queue, 0); return 0; -- 2.7.4
[PATCH v4 0/10] hisi_sas: support x6000 board and some misc changes
This patchset primarily adds support for the Huawei x6000 board, which includes hip07 chipset. Unfortunately, due to some board layout differences with our development board, we need to set a PHY-related register differently for optimal signal quality. As such, a signal attenuation property is added to describe the differences in the boards and allow the PHY register to be set appropriately. In addition to this above features, some misc changes are added for: - PHY linkrate sysfs interface - linkrate set function - internal abort timer timeout increase - add module device id tabe for v3 hw - register init setting changes Differences to v3: - address comments from Hannes on patch #8: https://marc.info/?l=linux-scsi=152033924615703=2 - add patch for module Id table and some register setting for v3 HW. Differences to v2: - rename dt binding property name to "hisilicon,signal-attenuation" Differences to v1: - rename dt binding property name to include "hisi-" prefix Xiang Chen (3): scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req scsi: hisi_sas: Code cleanup and minor bug fixes scsi: hisi_sas: add v3 hw MODULE_DEVICE_TABLE() Xiaofei Tan (7): dt-bindings: scsi: hisi_sas: add an property of signal attenuation scsi: hisi_sas: support the property of signal attenuation for v2 hw scsi: hisi_sas: fix the issue of link rate inconsistency scsi: hisi_sas: fix the issue of setting linkrate register scsi: hisi_sas: increase timer expire of internal abort task scsi: hisi_sas: fix return value of hisi_sas_task_prep() scsi: hisi_sas: modify some register config for hip08 .../devicetree/bindings/scsi/hisilicon-sas.txt | 7 +++ drivers/scsi/hisi_sas/hisi_sas.h | 1 - drivers/scsi/hisi_sas/hisi_sas_main.c | 34 +- drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 13 ++-- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 62 ++- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 72 ++ 6 files changed, 109 insertions(+), 80 deletions(-) -- 1.9.1
[PATCH v4 02/10] scsi: hisi_sas: support the property of signal attenuation for v2 hw
From: Xiaofei TanThe register SAS_PHY_CTRL is configured according to signal quality. The signal quality is calculated by signal attenuation of hardware physical link. It may be different for different PCB layout. So, in order to give better support to new board, this patch add support to reading the devicetree property, "hisilicon,signal-attenuation". Of course, we still keep an default value in driver to adapt old board. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 39 +- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index 4ccb61e..42b3fd6 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -406,6 +406,17 @@ struct hisi_sas_err_record_v2 { __le32 dma_rx_err_type; }; +struct signal_attenuation_s { + u32 de_emphasis; + u32 preshoot; + u32 boost; +}; + +struct sig_atten_lu_s { + const struct signal_attenuation_s *att; + u32 sas_phy_ctrl; +}; + static const struct hisi_sas_hw_error one_bit_ecc_errors[] = { { .irq_msk = BIT(SAS_ECC_INTR_DQE_ECC_1B_OFF), @@ -1130,9 +1141,16 @@ static void phys_try_accept_stp_links_v2_hw(struct hisi_hba *hisi_hba) } } +static const struct signal_attenuation_s x6000 = {9200, 0, 10476}; +static const struct sig_atten_lu_s sig_atten_lu[] = { + { , 0x3016a68 }, +}; + static void init_reg_v2_hw(struct hisi_hba *hisi_hba) { struct device *dev = hisi_hba->dev; + u32 sas_phy_ctrl = 0x30b9908; + u32 signal[3]; int i; /* Global registers init */ @@ -1176,9 +1194,28 @@ static void init_reg_v2_hw(struct hisi_hba *hisi_hba) hisi_sas_write32(hisi_hba, AXI_AHB_CLK_CFG, 1); hisi_sas_write32(hisi_hba, HYPER_STREAM_ID_EN_CFG, 1); + /* Get sas_phy_ctrl value to deal with TX FFE issue. */ + if (!device_property_read_u32_array(dev, "hisilicon,signal-attenuation", + signal, ARRAY_SIZE(signal))) { + for (i = 0; i < ARRAY_SIZE(sig_atten_lu); i++) { + const struct sig_atten_lu_s *lookup = _atten_lu[i]; + const struct signal_attenuation_s *att = lookup->att; + + if ((signal[0] == att->de_emphasis) && + (signal[1] == att->preshoot) && + (signal[2] == att->boost)) { + sas_phy_ctrl = lookup->sas_phy_ctrl; + break; + } + } + + if (i == ARRAY_SIZE(sig_atten_lu)) + dev_warn(dev, "unknown signal attenuation values, using default PHY ctrl config\n"); + } + for (i = 0; i < hisi_hba->n_phy; i++) { hisi_sas_phy_write32(hisi_hba, i, PROG_PHY_LINK_RATE, 0x855); - hisi_sas_phy_write32(hisi_hba, i, SAS_PHY_CTRL, 0x30b9908); + hisi_sas_phy_write32(hisi_hba, i, SAS_PHY_CTRL, sas_phy_ctrl); hisi_sas_phy_write32(hisi_hba, i, SL_TOUT_CFG, 0x7d7d7d7d); hisi_sas_phy_write32(hisi_hba, i, SL_CONTROL, 0x0); hisi_sas_phy_write32(hisi_hba, i, TXID_AUTO, 0x2); -- 1.9.1
[PATCH v4 06/10] scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
From: Xiang ChenThe structure element hisi_sas_devices.running_req to count how many commands are active is in effect only ever written in the code, so remove it. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas.h | 1 - drivers/scsi/hisi_sas/hisi_sas_main.c | 9 - drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 3 --- 3 files changed, 13 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index e7fd287..d1153e8 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -175,7 +175,6 @@ struct hisi_sas_device { struct hisi_sas_dq *dq; struct list_headlist; u64 attached_phy; - atomic64_t running_req; enum sas_device_typedev_type; int device_id; int sata_idx; diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 9ff8790..88ad8d4 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -200,8 +200,6 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task, if (task) { struct device *dev = hisi_hba->dev; - struct domain_device *device = task->dev; - struct hisi_sas_device *sas_dev = device->lldd_dev; if (!task->lldd_task) return; @@ -213,9 +211,6 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task, dma_unmap_sg(dev, task->scatter, task->num_scatter, task->data_dir); - - if (sas_dev) - atomic64_dec(_dev->running_req); } if (slot->buf) @@ -431,8 +426,6 @@ static int hisi_sas_task_prep(struct sas_task *task, struct hisi_sas_dq spin_unlock_irqrestore(>task_state_lock, flags); dq->slot_prep = slot; - - atomic64_inc(_dev->running_req); ++(*pass); return 0; @@ -1517,8 +1510,6 @@ static int hisi_sas_query_task(struct sas_task *task) dq->slot_prep = slot; - atomic64_inc(_dev->running_req); - /* send abort command to the chip */ hisi_hba->hw->start_delivery(dq); spin_unlock_irqrestore(>lock, flags_dq); diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c index 2eb8980..8dd0e6a6 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c @@ -1407,9 +1407,6 @@ static int slot_complete_v1_hw(struct hisi_hba *hisi_hba, } out: - if (sas_dev) - atomic64_dec(_dev->running_req); - hisi_sas_slot_task_free(hisi_hba, task, slot); sts = ts->stat; -- 1.9.1
[PATCH v4 03/10] scsi: hisi_sas: fix the issue of link rate inconsistency
From: Xiaofei TanIn sysfs, there are two files about minimum linkrate, and also two files for maximum linkrate. Take maximum linkrate example, maximum_linkrate_hw is read-only and indicated by the register HARD_PHY_LINKRATE, and maximum_linkrate is read-write and corresponding to the register PROG_PHY_LINK_RATE. But in the function phy_up_v*_hw(), we get *_linkrate value from HARD_PHY_LINKRATE. It is not right. This patch is to fix this issue. Unreferenced PHY-interrupt enum is also removed for v3 hw. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 2 ++ drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 1 - drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 8 +--- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 + 4 files changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 2d4dbed..9d16372 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -683,6 +683,8 @@ static void hisi_sas_phy_init(struct hisi_hba *hisi_hba, int phy_no) phy->hisi_hba = hisi_hba; phy->port = NULL; + phy->minimum_linkrate = SAS_LINK_RATE_1_5_GBPS; + phy->maximum_linkrate = hisi_hba->hw->phy_get_max_linkrate(); sas_phy->enabled = (phy_no < hisi_hba->n_phy) ? 1 : 0; sas_phy->class = SAS; sas_phy->iproto = SAS_PROTOCOL_ALL; diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c index 679e76f..38bbda9 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c @@ -873,7 +873,6 @@ static void phy_set_linkrate_v1_hw(struct hisi_hba *hisi_hba, int phy_no, sas_phy->phy->maximum_linkrate = max; sas_phy->phy->minimum_linkrate = min; - min -= SAS_LINK_RATE_1_5_GBPS; max -= SAS_LINK_RATE_1_5_GBPS; for (i = 0; i <= max; i++) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index 42b3fd6..67be346 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -1603,7 +1603,6 @@ static void phy_set_linkrate_v2_hw(struct hisi_hba *hisi_hba, int phy_no, sas_phy->phy->maximum_linkrate = max; sas_phy->phy->minimum_linkrate = min; - min -= SAS_LINK_RATE_1_5_GBPS; max -= SAS_LINK_RATE_1_5_GBPS; for (i = 0; i <= max; i++) @@ -2684,7 +2683,7 @@ static int prep_abort_v2_hw(struct hisi_hba *hisi_hba, static int phy_up_v2_hw(int phy_no, struct hisi_hba *hisi_hba) { int i, res = IRQ_HANDLED; - u32 port_id, link_rate, hard_phy_linkrate; + u32 port_id, link_rate; struct hisi_sas_phy *phy = _hba->phy[phy_no]; struct asd_sas_phy *sas_phy = >sas_phy; struct device *dev = hisi_hba->dev; @@ -2723,11 +2722,6 @@ static int phy_up_v2_hw(int phy_no, struct hisi_hba *hisi_hba) } sas_phy->linkrate = link_rate; - hard_phy_linkrate = hisi_sas_phy_read32(hisi_hba, phy_no, - HARD_PHY_LINKRATE); - phy->maximum_linkrate = hard_phy_linkrate & 0xf; - phy->minimum_linkrate = (hard_phy_linkrate >> 4) & 0xf; - sas_phy->oob_mode = SAS_OOB_MODE; memcpy(sas_phy->attached_sas_addr, >sas_addr, SAS_ADDR_SIZE); dev_info(dev, "phyup: phy%d link_rate=%d\n", phy_no, link_rate); diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index a1f1868..1ee95ab 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -340,12 +340,6 @@ struct hisi_sas_err_record_v3 { #define HISI_SAS_COMMAND_ENTRIES_V3_HW 4096 #define HISI_SAS_MSI_COUNT_V3_HW 32 -enum { - HISI_SAS_PHY_PHY_UPDOWN, - HISI_SAS_PHY_CHNL_INT, - HISI_SAS_PHY_INT_NR -}; - #define DIR_NO_DATA 0 #define DIR_TO_INI 1 #define DIR_TO_DEVICE 2 @@ -1121,7 +1115,7 @@ static int prep_abort_v3_hw(struct hisi_hba *hisi_hba, static int phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba) { int i, res = 0; - u32 context, port_id, link_rate, hard_phy_linkrate; + u32 context, port_id, link_rate; struct hisi_sas_phy *phy = _hba->phy[phy_no]; struct asd_sas_phy *sas_phy = >sas_phy; struct device *dev = hisi_hba->dev; @@ -1139,10 +1133,6 @@ static int phy_up_v3_hw(int phy_no, struct hisi_hba *hisi_hba) goto end; } sas_phy->linkrate = link_rate; - hard_phy_linkrate = hisi_sas_phy_read32(hisi_hba, phy_no, - HARD_PHY_LINKRATE); - phy->maximum_linkrate = hard_phy_linkrate & 0xf; - phy->minimum_linkrate = (hard_phy_linkrate >> 4) & 0xf; phy->phy_type &= ~(PORT_TYPE_SAS | PORT_TYPE_SATA); /* Check for SATA
[PATCH v4 09/10] scsi: hisi_sas: modify some register config for hip08
From: Xiaofei TanDo some modifications for register configuring for hip08. In future, to reduce kernel churn with patches to modify registers, any registers which may change between board models (mostly PHY/SERDES related) should be set in ACPI reset handler. Signed-off-by: Xiaofei Tan Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 4023fcb..5ce5ef2c 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -172,6 +172,7 @@ #define CHL_INT1_MSK (PORT_BASE + 0x1c4) #define CHL_INT2_MSK (PORT_BASE + 0x1c8) #define CHL_INT_COAL_EN(PORT_BASE + 0x1d0) +#define SAS_RX_TRAIN_TIMER (PORT_BASE + 0x2a4) #define PHY_CTRL_RDY_MSK (PORT_BASE + 0x2b0) #define PHYCTRL_NOT_RDY_MSK(PORT_BASE + 0x2b4) #define PHYCTRL_DWS_RESET_MSK (PORT_BASE + 0x2b8) @@ -184,6 +185,8 @@ #define DMA_RX_STATUS (PORT_BASE + 0x2e8) #define DMA_RX_STATUS_BUSY_OFF 0 #define DMA_RX_STATUS_BUSY_MSK (0x1 << DMA_RX_STATUS_BUSY_OFF) + +#define COARSETUNE_TIME(PORT_BASE + 0x304) #define ERR_CNT_DWS_LOST (PORT_BASE + 0x380) #define ERR_CNT_RESET_PROB (PORT_BASE + 0x384) #define ERR_CNT_INVLD_DW (PORT_BASE + 0x390) @@ -417,10 +420,10 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba) hisi_sas_write32(hisi_hba, OQ0_INT_SRC_MSK+0x4*i, 0); hisi_sas_write32(hisi_hba, HYPER_STREAM_ID_EN_CFG, 1); - hisi_sas_write32(hisi_hba, AXI_MASTER_CFG_BASE, 0x3); for (i = 0; i < hisi_hba->n_phy; i++) { - hisi_sas_phy_write32(hisi_hba, i, PROG_PHY_LINK_RATE, 0x801); + hisi_sas_phy_write32(hisi_hba, i, PROG_PHY_LINK_RATE, 0x855); + hisi_sas_phy_write32(hisi_hba, i, SAS_RX_TRAIN_TIMER, 0x13e80); hisi_sas_phy_write32(hisi_hba, i, CHL_INT0, 0x); hisi_sas_phy_write32(hisi_hba, i, CHL_INT1, 0x); hisi_sas_phy_write32(hisi_hba, i, CHL_INT2, 0x); @@ -432,17 +435,13 @@ static void init_reg_v3_hw(struct hisi_hba *hisi_hba) hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_DWS_RESET_MSK, 0x0); hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_PHY_ENA_MSK, 0x0); hisi_sas_phy_write32(hisi_hba, i, SL_RX_BCAST_CHK_MSK, 0x0); - hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x0); - hisi_sas_phy_write32(hisi_hba, i, PHY_CTRL, 0x199b4fa); - hisi_sas_phy_write32(hisi_hba, i, SAS_SSP_CON_TIMER_CFG, -0xa03e8); - hisi_sas_phy_write32(hisi_hba, i, SAS_STP_CON_TIMER_CFG, -0xa03e8); - hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, -0x7f7a120); - hisi_sas_phy_write32(hisi_hba, i, CON_CFG_DRIVER, -0x2a0a80); + hisi_sas_phy_write32(hisi_hba, i, PHYCTRL_OOB_RESTART_MSK, 0x1); + hisi_sas_phy_write32(hisi_hba, i, STP_LINK_TIMER, 0x7f7a120); + + /* used for 12G negotiate */ + hisi_sas_phy_write32(hisi_hba, i, COARSETUNE_TIME, 0x1e); } + for (i = 0; i < hisi_hba->queue_count; i++) { /* Delivery queue */ hisi_sas_write32(hisi_hba, -- 1.9.1
[PATCH v4 05/10] scsi: hisi_sas: increase timer expire of internal abort task
From: Xiaofei TanThe current 110ms expiry time is not long enough for the internal abort task. The reason is that the internal abort task could be blocked in HW if the HW is retrying to set up link. The internal abort task will be executed only when the retry process finished. The maximum time is 5s for the retry of setting up link. So, the timer expire should be more than 5s. This patch increases it from 110ms to 6s. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 9d16372..9ff8790 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -871,6 +871,7 @@ static void hisi_sas_tmf_timedout(struct timer_list *t) #define TASK_TIMEOUT 20 #define TASK_RETRY 3 +#define INTERNAL_ABORT_TIMEOUT 6 static int hisi_sas_exec_internal_tmf_task(struct domain_device *device, void *parameter, u32 para_len, struct hisi_sas_tmf_task *tmf) @@ -1574,7 +1575,7 @@ static int hisi_sas_query_task(struct sas_task *task) task->task_proto = device->tproto; task->task_done = hisi_sas_task_done; task->slow_task->timer.function = hisi_sas_tmf_timedout; - task->slow_task->timer.expires = jiffies + msecs_to_jiffies(110); + task->slow_task->timer.expires = jiffies + INTERNAL_ABORT_TIMEOUT*HZ; add_timer(>slow_task->timer); res = hisi_sas_internal_abort_task_exec(hisi_hba, sas_dev->device_id, -- 1.9.1
[PATCH v4 10/10] scsi: hisi_sas: add v3 hw MODULE_DEVICE_TABLE()
From: Xiang ChenExport device table of v3 hw to userspace, or auto probe will fail for v3 hw. Also change the module alias to include "pci", instead of "platform". Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 5ce5ef2c..6f3e5ba 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -2394,6 +2394,7 @@ static int hisi_sas_v3_resume(struct pci_dev *pdev) { PCI_VDEVICE(HUAWEI, 0xa230), hip08 }, {} }; +MODULE_DEVICE_TABLE(pci, sas_v3_pci_table); static const struct pci_error_handlers hisi_sas_err_handler = { .error_detected = hisi_sas_error_detected_v3_hw, @@ -2416,4 +2417,4 @@ static int hisi_sas_v3_resume(struct pci_dev *pdev) MODULE_LICENSE("GPL"); MODULE_AUTHOR("John Garry "); MODULE_DESCRIPTION("HISILICON SAS controller v3 hw driver based on pci device"); -MODULE_ALIAS("platform:" DRV_NAME); +MODULE_ALIAS("pci:" DRV_NAME); -- 1.9.1
[PATCH v4 08/10] scsi: hisi_sas: Code cleanup and minor bug fixes
From: Xiang ChenThe patch does some code cleanup and fixes some small bugs: - Correct return status of phy_up_v3_hw() and phy_bcast_v3_hw() - Add static for function phy_get_max_linkrate_v3_hw() - Change exception return status when no reset method - Change magic value to ts->stat in slot_complete_vx_hw() - Remove unnecessary check for dev_is_sata() - Fix some issues of alignment and indents (Authored by Xiaofei Tan in another patch, but added here to be practical) Signed-off-by: Xiaofei Tan Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 14 +++--- drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 4 +++- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 10 ++ drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 28 +--- 4 files changed, 33 insertions(+), 23 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index dff9723..49c1fa6 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -33,7 +33,7 @@ u8 hisi_sas_get_ata_protocol(struct host_to_dev_fis *fis, int direction) case ATA_CMD_FPDMA_RECV: case ATA_CMD_FPDMA_SEND: case ATA_CMD_NCQ_NON_DATA: - return HISI_SAS_SATA_PROTOCOL_FPDMA; + return HISI_SAS_SATA_PROTOCOL_FPDMA; case ATA_CMD_DOWNLOAD_MICRO: case ATA_CMD_ID_ATA: @@ -45,7 +45,7 @@ u8 hisi_sas_get_ata_protocol(struct host_to_dev_fis *fis, int direction) case ATA_CMD_WRITE_LOG_EXT: case ATA_CMD_PIO_WRITE: case ATA_CMD_PIO_WRITE_EXT: - return HISI_SAS_SATA_PROTOCOL_PIO; + return HISI_SAS_SATA_PROTOCOL_PIO; case ATA_CMD_DSM: case ATA_CMD_DOWNLOAD_MICRO_DMA: @@ -64,7 +64,7 @@ u8 hisi_sas_get_ata_protocol(struct host_to_dev_fis *fis, int direction) case ATA_CMD_WRITE_LOG_DMA_EXT: case ATA_CMD_WRITE_STREAM_DMA_EXT: case ATA_CMD_ZAC_MGMT_IN: - return HISI_SAS_SATA_PROTOCOL_DMA; + return HISI_SAS_SATA_PROTOCOL_DMA; case ATA_CMD_CHK_POWER: case ATA_CMD_DEV_RESET: @@ -77,21 +77,21 @@ u8 hisi_sas_get_ata_protocol(struct host_to_dev_fis *fis, int direction) case ATA_CMD_STANDBY: case ATA_CMD_STANDBYNOW1: case ATA_CMD_ZAC_MGMT_OUT: - return HISI_SAS_SATA_PROTOCOL_NONDATA; + return HISI_SAS_SATA_PROTOCOL_NONDATA; default: { if (fis->command == ATA_CMD_SET_MAX) { switch (fis->features) { case ATA_SET_MAX_PASSWD: case ATA_SET_MAX_LOCK: - return HISI_SAS_SATA_PROTOCOL_PIO; + return HISI_SAS_SATA_PROTOCOL_PIO; case ATA_SET_MAX_PASSWD_DMA: case ATA_SET_MAX_UNLOCK_DMA: - return HISI_SAS_SATA_PROTOCOL_DMA; + return HISI_SAS_SATA_PROTOCOL_DMA; default: - return HISI_SAS_SATA_PROTOCOL_NONDATA; + return HISI_SAS_SATA_PROTOCOL_NONDATA; } } if (direction == DMA_NONE) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c index 8dd0e6a6..84a0ccc 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c @@ -651,8 +651,10 @@ static int reset_hw_v1_hw(struct hisi_hba *hisi_hba) dev_err(dev, "De-reset failed\n"); return -EIO; } - } else + } else { dev_warn(dev, "no reset method\n"); + return -EINVAL; + } return 0; } diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index bd1a48a..f89fb9a 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -1095,8 +1095,10 @@ static int reset_hw_v2_hw(struct hisi_hba *hisi_hba) dev_err(dev, "SAS de-reset fail.\n"); return -EIO; } - } else - dev_warn(dev, "no reset method\n"); + } else { + dev_err(dev, "no reset method\n"); + return -EINVAL; + } return 0; } @@ -2408,7 +2410,7 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba, spin_lock_irqsave(_hba->lock, flags); hisi_sas_slot_task_free(hisi_hba, task, slot); spin_unlock_irqrestore(_hba->lock, flags); - return -1; + return ts->stat; } if (unlikely(!sas_dev)) { @@ -2667,7 +2669,7 @@ static int prep_abort_v2_hw(struct hisi_hba *hisi_hba, /*
[PATCH v4 01/10] dt-bindings: scsi: hisi_sas: add an property of signal attenuation
From: Xiaofei TanFor some new boards with hip07 chipset we are required to set PHY config registers differently. The hw property which determines how to set these registers is in the PHY signal attenuation readings. This patch add an devicetree property, "hisilicon,signal-attenuation", which is used to describe the signal attenuation of an board. Cc: Rob Herring Cc: Mark Rutland Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- Documentation/devicetree/bindings/scsi/hisilicon-sas.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt index df3bef7..8c6659e 100644 --- a/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt +++ b/Documentation/devicetree/bindings/scsi/hisilicon-sas.txt @@ -53,6 +53,13 @@ Main node required properties: Optional main node properties: - hip06-sas-v2-quirk-amt : when set, indicates that the v2 controller has the "am-max-transmissions" limitation. + - hisilicon,signal-attenuation : array of 3 32-bit values, containing de-emphasis, + preshoot, and boost attenuation readings for the board. They + are used to describe the signal attenuation of the board. These + values' range is 7600 to 12400, and used to represent -24dB to + 24dB. + The formula is "y = (x-1)/1". For example, 10478 + means 4.78dB. Example: sas0: sas@c100 { -- 1.9.1
[PATCH v4 04/10] scsi: hisi_sas: fix the issue of setting linkrate register
From: Xiaofei TanIt is not right to set the register PROG_PHY_LINK_RATE while PHY is still enabled. So if we want to change PHY linkrate, we need to disable PHY before setting the register PROG_PHY_LINK_RATE, and then start-up PHY. This patch is to fix this issue. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 5 +++-- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 5 +++-- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c index 38bbda9..2eb8980 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c @@ -881,10 +881,11 @@ static void phy_set_linkrate_v1_hw(struct hisi_hba *hisi_hba, int phy_no, prog_phy_link_rate &= ~0xff; prog_phy_link_rate |= rate_mask; + disable_phy_v1_hw(hisi_hba, phy_no); + msleep(100); hisi_sas_phy_write32(hisi_hba, phy_no, PROG_PHY_LINK_RATE, prog_phy_link_rate); - - phy_hard_reset_v1_hw(hisi_hba, phy_no); + start_phy_v1_hw(hisi_hba, phy_no); } static int get_wideport_bitmap_v1_hw(struct hisi_hba *hisi_hba, int port_id) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index 67be346..bd1a48a 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -1611,10 +1611,11 @@ static void phy_set_linkrate_v2_hw(struct hisi_hba *hisi_hba, int phy_no, prog_phy_link_rate &= ~0xff; prog_phy_link_rate |= rate_mask; + disable_phy_v2_hw(hisi_hba, phy_no); + msleep(100); hisi_sas_phy_write32(hisi_hba, phy_no, PROG_PHY_LINK_RATE, prog_phy_link_rate); - - phy_hard_reset_v2_hw(hisi_hba, phy_no); + start_phy_v2_hw(hisi_hba, phy_no); } static int get_wideport_bitmap_v2_hw(struct hisi_hba *hisi_hba, int port_id) diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index 1ee95ab..8da9de7 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -1862,10 +1862,11 @@ static void phy_set_linkrate_v3_hw(struct hisi_hba *hisi_hba, int phy_no, prog_phy_link_rate &= ~0xff; prog_phy_link_rate |= rate_mask; + disable_phy_v3_hw(hisi_hba, phy_no); + msleep(100); hisi_sas_phy_write32(hisi_hba, phy_no, PROG_PHY_LINK_RATE, prog_phy_link_rate); - - phy_hard_reset_v3_hw(hisi_hba, phy_no); + start_phy_v3_hw(hisi_hba, phy_no); } static void interrupt_disable_v3_hw(struct hisi_hba *hisi_hba) -- 1.9.1
[PATCH v4 07/10] scsi: hisi_sas: fix return value of hisi_sas_task_prep()
From: Xiaofei TanIt is an implicit regulation that error code that function returned should be negative. But hisi_sas_task_prep() doesn't follow this. This may cause problems in the upper layer code. For example, in sas_expander.c of libsas, smp_execute_task_sg() may return the number of bytes of underrun. It will be conflicted with the scenaio lldd_execute_task() return an positive error code. This patch change the return value from SAS_PHY_DOWN to -ECOMM in hisi_sas_task_prep(). Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 88ad8d4..dff9723 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -316,7 +316,7 @@ static int hisi_sas_task_prep(struct sas_task *task, struct hisi_sas_dq */ if (device->dev_type != SAS_SATA_DEV) task->task_done(task); - return SAS_PHY_DOWN; + return -ECOMM; } if (DEV_IS_GONE(sas_dev)) { @@ -327,7 +327,7 @@ static int hisi_sas_task_prep(struct sas_task *task, struct hisi_sas_dq dev_info(dev, "task prep: device %016llx not ready\n", SAS_ADDR(device->sas_addr)); - return SAS_PHY_DOWN; + return -ECOMM; } port = to_hisi_sas_port(sas_port); @@ -337,7 +337,7 @@ static int hisi_sas_task_prep(struct sas_task *task, struct hisi_sas_dq "SATA/STP" : "SAS", device->port->id); - return SAS_PHY_DOWN; + return -ECOMM; } if (!sas_protocol_ata(task->task_proto)) { -- 1.9.1
[PATCH] scsi: scsi_transport_iscsi: use put_device() instead of kfree()
Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav--- drivers/scsi/scsi_transport_iscsi.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index f4b52b4..aacb7ab 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -221,8 +221,10 @@ struct iscsi_endpoint * ep->dev.class = _endpoint_class; dev_set_name(>dev, "ep-%llu", (unsigned long long) id); err = device_register(>dev); -if (err) -goto free_ep; + if (err) { + put_device(>dev); + return NULL; + } err = sysfs_create_group(>dev.kobj, _endpoint_group); if (err) @@ -235,10 +237,6 @@ struct iscsi_endpoint * unregister_dev: device_unregister(>dev); return NULL; - -free_ep: - kfree(ep); - return NULL; } EXPORT_SYMBOL_GPL(iscsi_create_endpoint); @@ -783,7 +781,7 @@ struct iscsi_iface * free_iface: put_device(iface->dev.parent); - kfree(iface); + put_device(>dev); return NULL; } EXPORT_SYMBOL_GPL(iscsi_create_iface); @@ -1260,7 +1258,7 @@ struct iscsi_bus_flash_session * return fnode_sess; free_fnode_sess: - kfree(fnode_sess); + put_device(_sess->dev); return NULL; } EXPORT_SYMBOL_GPL(iscsi_create_flashnode_sess); @@ -1308,7 +1306,7 @@ struct iscsi_bus_flash_conn * return fnode_conn; free_fnode_conn: - kfree(fnode_conn); + put_device(_conn->dev); return NULL; } EXPORT_SYMBOL_GPL(iscsi_create_flashnode_conn); @@ -2268,6 +2266,8 @@ struct iscsi_cls_conn * release_parent_ref: put_device(>dev); + put_device(>dev); + conn = NULL; free_conn: kfree(conn); return NULL; @@ -4420,8 +4420,10 @@ struct scsi_transport_template * priv->dev.class = _transport_class; dev_set_name(>dev, "%s", tt->name); err = device_register(>dev); - if (err) - goto free_priv; + if (err) { + put_device(>dev); + return NULL; + } err = sysfs_create_group(>dev.kobj, _transport_group); if (err) @@ -4456,9 +4458,6 @@ struct scsi_transport_template * unregister_dev: device_unregister(>dev); return NULL; -free_priv: - kfree(priv); - return NULL; } EXPORT_SYMBOL_GPL(iscsi_register_transport); -- 1.9.1
[PATCH] target: tcm_loop: use put_device() if device_register fail
if device_register() returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav--- drivers/target/loopback/tcm_loop.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/target/loopback/tcm_loop.c b/drivers/target/loopback/tcm_loop.c index 9cd4ffe..5dffc55 100644 --- a/drivers/target/loopback/tcm_loop.c +++ b/drivers/target/loopback/tcm_loop.c @@ -412,6 +412,7 @@ static int tcm_loop_setup_hba_bus(struct tcm_loop_hba *tl_hba, int tcm_loop_host ret = device_register(_hba->dev); if (ret) { pr_err("device_register() failed for tl_hba->dev: %d\n", ret); + put_device(_hba->dev); return -ENODEV; } -- 1.9.1
Re: dmesg flooded with "Very big device. Trying to use READ CAPACITY(16)"
Hello Martin Thanks for your answer. By flooded I mean that it continously fill the dmesg log with no interruption, check attached a log that I have just taken from my server Some more details on my setup. I have these 5 HDD, WD RED 8TB in an Orico 5 bay enclosure, running JMS567 USBtoSATA bridge and an internal SATA multiplexer This is connected to the USB 3.0 host port of my server, it is an Intel Atom So in total the array is 5x8TB and it is configured in BTRFS RAID5 mode. I have already reported this issue to the linux-btrfs mailing list, I got a feedback that the filesystem itself has nothing to do with this capacity check, and I should have reported this here. Bye 2018-03-07 3:45 GMT+01:00 Martin K. Petersen: > > Menion, > >> Operating big capacity HDD such 8TB with complex filesystems like >> BTRFS in RAID mode endup in dmesg get flooded by this log, due too >> many capacity checks (opaque to the filesystem itself) > > What's your definition of flooded? How many do you see? > > Also, what kind of controller are these disks attached to? The reason > you see these messages is that to the kernel it looks like a legacy disk > device that predates capacities in the TB range. The warnings are logged > because we're surprised to be going down this path based on what the > device has previously told us. > > -- > Martin K. Petersen Oracle Linux Engineering * Documentation: https://help.ubuntu.com * Management: https://landscape.canonical.com * Support:https://ubuntu.com/advantage 8 packages can be updated. 0 updates are security updates. Last login: Mon Mar 5 15:32:57 2018 from 10.8.0.10 menion@Menionubuntu:~$ dmesg [0.00] Linux version 4.15.5-041505-generic (kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3.2)) #201802221031 SMP Thu Feb 22 15:32:28 UTC 2018 [0.00] Command line: BOOT_IMAGE=/@/boot/vmlinuz-4.15.5-041505-generic root=UUID=6db4baf7-fda8-41ac-a6ad-1ca7b083430f ro rootflags=subvol=@ [0.00] KERNEL supported cpus: [0.00] Intel GenuineIntel [0.00] AMD AuthenticAMD [0.00] Centaur CentaurHauls [0.00] x86/fpu: x87 FPU will use FXSAVE [0.00] e820: BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0008efff] usable [0.00] BIOS-e820: [mem 0x0008f000-0x0008] ACPI NVS [0.00] BIOS-e820: [mem 0x0009-0x0009dfff] usable [0.00] BIOS-e820: [mem 0x0009e000-0x0009] reserved [0.00] BIOS-e820: [mem 0x0010-0x1fff] usable [0.00] BIOS-e820: [mem 0x2000-0x201f] reserved [0.00] BIOS-e820: [mem 0x2020-0x7b631fff] usable [0.00] BIOS-e820: [mem 0x7b632000-0x7b661fff] reserved [0.00] BIOS-e820: [mem 0x7b662000-0x7b685fff] usable [0.00] BIOS-e820: [mem 0x7b686000-0x7b76bfff] ACPI NVS [0.00] BIOS-e820: [mem 0x7b76c000-0x7ba20fff] reserved [0.00] BIOS-e820: [mem 0x7ba21000-0x7ba71fff] type 20 [0.00] BIOS-e820: [mem 0x7ba72000-0x7ba76fff] usable [0.00] BIOS-e820: [mem 0x7ba77000-0x7ba77fff] reserved [0.00] BIOS-e820: [mem 0x7ba78000-0x7ba7afff] usable [0.00] BIOS-e820: [mem 0x7ba7b000-0x7ba7bfff] reserved [0.00] BIOS-e820: [mem 0x7ba7c000-0x7bff] usable [0.00] BIOS-e820: [mem 0xe000-0xe3ff] reserved [0.00] BIOS-e820: [mem 0xfea0-0xfeaf] reserved [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xfed01000-0xfed01fff] reserved [0.00] BIOS-e820: [mem 0xfed03000-0xfed03fff] reserved [0.00] BIOS-e820: [mem 0xfed06000-0xfed06fff] reserved [0.00] BIOS-e820: [mem 0xfed08000-0xfed09fff] reserved [0.00] BIOS-e820: [mem 0xfed1c000-0xfed1cfff] reserved [0.00] BIOS-e820: [mem 0xfed8-0xfedb] reserved [0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved [0.00] BIOS-e820: [mem 0xffc0-0x] reserved [0.00] NX (Execute Disable) protection: active [0.00] efi: EFI v2.40 by American Megatrends [0.00] efi: ESRT=0x7b66 ACPI=0x7b6ca000 ACPI 2.0=0x7b6ca000 SMBIOS=0x7ba1f198 [0.00] random: fast init done [0.00] SMBIOS 2.8 present. [0.00] DMI: AZW Z83 II/Cherry Trail CR, BIOS YB1007 08/17/2017 [0.00] e820: update [mem 0x-0x0fff] usable ==> reserved [0.00] e820: remove [mem 0x000a-0x000f] usable [0.00] e820: last_pfn = 0x7c000 max_arch_pfn =
Re: [PATCH] scsi: ipr: Use dma_pool_zalloc()
On Wed, Mar 7, 2018 at 8:45 AM, Martin K. Petersenwrote: > > Brian, > >> Thanks! >> >> Acked-by: Brian King > > Not sure where this patch was sent but it's not in my mailbox, nor the > list archives. > > Souptick: Please resubmit to linux-scsi with Brian's Acked-by. Ok, I will resubmit. > > -- > Martin K. Petersen Oracle Linux Engineering
Re: [PATCH 2/2] m68k/amiga - Zorro ESP: new zorro_esp.c
Hi Michael, On Wed, Mar 7, 2018 at 8:55 AM, Michael Schmitzwrote: > OK, in that case I'll need to work out something similar to the test for > optional SCSI function on the Blizzard 1230/1260 to find out what board > I have when dealing with the duplicate Fastlane/Blizzard1230II ID. > > Is the board base address as returned by zorro_resource_start() reliable > to distinguish between Zorro II and Zorro III boards? The board base address is assigned by AmigaOS based on the values in the Expansion ROM (mainly ExpansionRom.er_Type) on the board. More specifically, AmigaOS creates struct ConfigDev from struct ExpansionRom. So yes, it must be reliable. > Am 06.03.2018 um 20:43 schrieb Geert Uytterhoeven: >> On Tue, Mar 6, 2018 at 2:11 AM, Michael Schmitz wrote: >>> Index 1 should have been ZORRO_PROD_PHASE5_CYBERSTORM_MK_II, I've >>> corrected that in the meantime. >>> >>> Fastlane / Blizzard 1230_II distinction is something I an not quite >>> sure about - does the probe function get called twice if the device >>> table contains the same ID twice but with different driver_data >>> contents? >> >> No, the probe function gets called on the first match only. >> Cfr. drivers/zorro/zorro-driver.c:zorro_device_probe(). Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds