Re: [PATCH 09/11] target_core_alua: Use workqueue for ALUA transitioning
On 10/17/2013 12:06 AM, Nicholas A. Bellinger wrote: On Wed, 2013-10-16 at 09:20 +0200, Hannes Reinecke wrote: Use a workqueue for processing ALUA state transitions; this allows us to process implicit delay properly. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/target/target_core_alua.c | 174 +++--- include/target/target_core_base.h | 4 + 2 files changed, 128 insertions(+), 50 deletions(-) SNIP +static int core_alua_do_transition_tg_pt( +struct t10_alua_tg_pt_gp *tg_pt_gp, +int new_state, +int explicit) +{ +struct se_device *dev = tg_pt_gp-tg_pt_gp_dev; +DECLARE_COMPLETION_ONSTACK(wait); + +/* Nothing to be done here */ +if (atomic_read(tg_pt_gp-tg_pt_gp_alua_access_state) == new_state) +return 0; + +if (new_state == ALUA_ACCESS_STATE_TRANSITION) +return -EAGAIN; + +/* + * Flush any pending transitions + */ +if (!explicit tg_pt_gp-tg_pt_gp_implicit_trans_secs +atomic_read(tg_pt_gp-tg_pt_gp_alua_access_state) == +ALUA_ACCESS_STATE_TRANSITION) { +/* Just in case */ +tg_pt_gp-tg_pt_gp_alua_pending_state = new_state; +tg_pt_gp-tg_pt_gp_transition_complete = wait; +flush_delayed_work(tg_pt_gp-tg_pt_gp_transition_work); +wait_for_completion(wait); +tg_pt_gp-tg_pt_gp_transition_complete = NULL; +return 0; +} + +/* + * Save the old primary ALUA access state, and set the current state + * to ALUA_ACCESS_STATE_TRANSITION. + */ +tg_pt_gp-tg_pt_gp_alua_previous_state = +atomic_read(tg_pt_gp-tg_pt_gp_alua_access_state); +tg_pt_gp-tg_pt_gp_alua_pending_state = new_state; + +atomic_set(tg_pt_gp-tg_pt_gp_alua_access_state, +ALUA_ACCESS_STATE_TRANSITION); +tg_pt_gp-tg_pt_gp_alua_access_status = (explicit) ? +ALUA_STATUS_ALTERED_BY_EXPLICIT_STPG : +ALUA_STATUS_ALTERED_BY_IMPLICIT_ALUA; + +/* + * Check for the optional ALUA primary state transition delay + */ +if (tg_pt_gp-tg_pt_gp_trans_delay_msecs != 0) +msleep_interruptible(tg_pt_gp-tg_pt_gp_trans_delay_msecs); + +/* + * Take a reference for workqueue item + */ +spin_lock(dev-t10_alua.tg_pt_gps_lock); +atomic_inc(tg_pt_gp-tg_pt_gp_ref_cnt); +smp_mb__after_atomic_inc(); +spin_unlock(dev-t10_alua.tg_pt_gps_lock); + +if (!explicit tg_pt_gp-tg_pt_gp_implicit_trans_secs) { +unsigned long transition_tmo; + +transition_tmo = tg_pt_gp-tg_pt_gp_implicit_trans_secs * HZ; +queue_delayed_work(tg_pt_gp-tg_pt_gp_dev-tmr_wq, + tg_pt_gp-tg_pt_gp_transition_work, + transition_tmo); +} else { +tg_pt_gp-tg_pt_gp_transition_complete = wait; +queue_delayed_work(tg_pt_gp-tg_pt_gp_dev-tmr_wq, + tg_pt_gp-tg_pt_gp_transition_work, 0); +wait_for_completion(wait); +tg_pt_gp-tg_pt_gp_transition_complete = NULL; +} Mmm, the trans_delay_msecs delay seems a bit out of place now.. How about dropping it's usage with msleep_interruptible() above, and instead combining it with the delay passed into queue_delayed_work()..? Yeah, we could. Actually I was thinking of opening up the implicit transition time to be a general transitioning time, to be used for both implicit and explicit. Reasoning here is that one could kick off a userland tool once 'transitioning' mode has been requested. That tool would then go ahead and do whatever is required to switch paths around etc, and could write the final ALUA state into configfs. That would then terminate the workqueue and the system would continue with the new state. If the userland tool fails to execute in time the system would revert to the requested state as it does now. The only problem with that approach is that it's currently impossible to have an atomic implicit transition for several tpgs. You can to an atomic _explicit_ transition, as SET TARGET PORT GROUPS _can_ carry information about all target port groups. (Mind you, scsi_dh_alua doesn't use it that way, it only sends information for the active/optimized target port group). But you cannot achieve a similar operation via configfs; there you can only set one tpg at a time, and each of this will potentially trigger a move to transitioning. While this is not a violation of the spec, it is certainly confusing. I'd rather have a way to set the new state for _all_ tpgs at once and only then kick off the transitioning mechanism. Any ideas how this could be achieved? Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr.
Es ist Privat
-- Es ist Privat Ich bin George Daniels, ein Banker und Credit-System-Programmierer (HSBC Bank). Ich sah Ihre E-Mail-Adresse während des Surfens durch die Bank DTC Bildschirm in mein Büro gestern so beschloss ich, diese Chance zu nutzen sehr, Sie kennenzulernen. I glaube, wir sollten jede Gelegenheit nutzen, um einander besser kennen. Allerdings bin ich mit Ihnen Kontakt aufnehmen für offensichtliche Grund, die Sie verstehen. Ich schicke diese Mail nur wissen, ob diese E-Mail-Adresse ist OK, antworten Sie mir zurück, damit ich mehr Details zu euch senden. Ich habe eine sehr wichtige Sache, um mit Ihnen zu diskutieren, ich freue mich auf Empfangen Sie Ihre Antwort an georgedaniel...@yahoo.com.hk. Haben Sie einen schönen Tag. George Daniels -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] target_core_alua: Referrals infrastructure
On 10/17/2013 12:28 AM, Nicholas A. Bellinger wrote: On Wed, 2013-10-16 at 09:25 +0200, Hannes Reinecke wrote: Add infrastructure for referrals. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/target/target_core_alua.c | 151 ++ drivers/target/target_core_alua.h | 4 +- drivers/target/target_core_configfs.c | 12 ++- drivers/target/target_core_device.c | 2 + drivers/target/target_core_sbc.c | 5 +- drivers/target/target_core_spc.c | 20 + include/scsi/scsi.h | 1 + include/target/target_core_base.h | 18 8 files changed, 209 insertions(+), 4 deletions(-) diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c index 166bee6..8f66146 100644 --- a/drivers/target/target_core_alua.c +++ b/drivers/target/target_core_alua.c @@ -56,6 +56,75 @@ static LIST_HEAD(lu_gps_list); struct t10_alua_lu_gp *default_lu_gp; /* + * REPORT REFERRALS + * + * See sbc3r35 section 5.23 + */ +sense_reason_t +target_emulate_report_referrals(struct se_cmd *cmd) +{ +struct se_device *dev = cmd-se_dev; +struct t10_alua_lba_map *map; +struct t10_alua_lba_map_member *map_mem; +unsigned char *buf; +u32 rd_len = 0, off; + +if (cmd-data_length 4) { +pr_warn(REPORT REFERRALS allocation length %u too + small\n, cmd-data_length); +return TCM_INVALID_CDB_FIELD; +} + +buf = transport_kmap_data_sg(cmd); +if (!buf) +return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; + +off = 4; +spin_lock(dev-t10_alua.lba_map_lock); +if (list_empty(dev-t10_alua.lba_map_list)) { +spin_unlock(dev-t10_alua.lba_map_lock); +transport_kunmap_data_sg(cmd); + +return TCM_UNSUPPORTED_SCSI_OPCODE; +} + +list_for_each_entry(map, dev-t10_alua.lba_map_list, +lba_map_list) { +int desc_num = off + 3; +int pg_num; + +off += 4; +put_unaligned_be64(map-lba_map_first_lba, buf[off]); +off += 8; +put_unaligned_be64(map-lba_map_last_lba, buf[off]); +off += 8; +rd_len += 20; +pg_num = 0; +list_for_each_entry(map_mem, map-lba_map_mem_list, +lba_map_mem_list) { +buf[off++] = map_mem-lba_map_mem_alua_state 0x0f; +off++; +buf[off++] = (map_mem-lba_map_mem_alua_pg_id 8) 0xff; +buf[off++] = (map_mem-lba_map_mem_alua_pg_id 0xff); +rd_len += 4; +pg_num++; +} +buf[desc_num] = pg_num; +} +spin_unlock(dev-t10_alua.lba_map_lock); + For both of these list walks, there needs to be a check against offset vs. -data_length to know when the available payload length has been exhausted.. Right. Will be fixing it up. [ .. ] diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c index e39d442..282b5bb 100644 --- a/drivers/target/target_core_spc.c +++ b/drivers/target/target_core_spc.c @@ -476,6 +476,11 @@ spc_emulate_evpd_86(struct se_cmd *cmd, unsigned char *buf) /* If WriteCache emulation is enabled, set V_SUP */ if (spc_check_dev_wce(dev)) buf[6] = 0x01; +/* If an LBA map is present set R_SUP */ +spin_lock(cmd-se_dev-t10_alua.lba_map_lock); +if (!list_empty(dev-t10_alua.lba_map_list)) +buf[8] = 0x10; +spin_unlock(cmd-se_dev-t10_alua.lba_map_lock); return 0; } Is there ever a case where R_SUP should be reported, but lba_map_list is empty..? Not that I can see. If R_SUP is set it means the 'REPORT REFERRALS' is supported. And 'REPORT REFERRALS' without a map is pretty much pointless. How about a se_device attribute called 'emulate_referrals' to determine when to report R_SUP..? Otherwise, perhaps using the se_lun - se_port - sep_alua_tg_pt_gp_mem - tg_pt_gp provided bit for tg_pt_gp_alua_supported_states instead..? I was thinking about the very same thing, but then figured it was easier to equal R_SUP with !list_empty(lba_map_list) instead of having a separate flag. Or crawling indirections just to find the very same information ... @@ -627,6 +632,20 @@ spc_emulate_evpd_b2(struct se_cmd *cmd, unsigned char *buf) return 0; } +/* Referrals VPD page */ +static sense_reason_t +spc_emulate_evpd_b3(struct se_cmd *cmd, unsigned char *buf) +{ +struct se_device *dev = cmd-se_dev; + +buf[0] = dev-transport-get_device_type(dev); +buf[3] = 0x0c; +put_unaligned_be32(dev-t10_alua.lba_map_segment_size, buf[8]); +put_unaligned_be32(dev-t10_alua.lba_map_segment_size, buf[12]); + Typo.. Offset for byte 12 should be the lba_map_segment_multiplier.. Oops ... Will be
Re: [PATCH 2/2] target_core_alua: Referrals configfs integration
On 10/17/2013 02:36 AM, Nicholas A. Bellinger wrote: On Wed, 2013-10-16 at 09:25 +0200, Hannes Reinecke wrote: Referrals need an LBA map, which needs to be kept consistent across all target port groups. So instead of tying the map to the target port groups I've implemented a single attribute containing the entire map. Signed-off-by: Hannes Reinecke h...@suse.de --- drivers/target/target_core_alua.c | 101 +++ drivers/target/target_core_alua.h | 8 ++ drivers/target/target_core_configfs.c | 171 + drivers/target/target_core_device.c| 1 + drivers/target/target_core_transport.c | 28 +- 5 files changed, 308 insertions(+), 1 deletion(-) diff --git a/drivers/target/target_core_alua.c b/drivers/target/target_core_alua.c index 8f66146..9dd01ff 100644 --- a/drivers/target/target_core_alua.c +++ b/drivers/target/target_core_alua.c @@ -1340,6 +1340,107 @@ static int core_alua_set_tg_pt_secondary_state( return 0; } +struct t10_alua_lba_map * +core_alua_allocate_lba_map(struct list_head *list, + u64 first_lba, u64 last_lba) +{ +struct t10_alua_lba_map *lba_map; + +lba_map = kmem_cache_zalloc(t10_alua_lba_map_cache, GFP_KERNEL); +if (!lba_map) { +pr_err(Unable to allocate struct t10_alua_lba_map\n); +return ERR_PTR(-ENOMEM); +} +INIT_LIST_HEAD(lba_map-lba_map_mem_list); +lba_map-lba_map_first_lba = first_lba; +lba_map-lba_map_last_lba = last_lba; + +list_add_tail(lba_map-lba_map_list, list); +return lba_map; +} This list_add_tail needs to be protected, no..? No. The current usage is that we first construct the mapping in memory, and then switching maps in set_lba_map. This way we only need to protect the switch itself, not the map construction. + +int +core_alua_allocate_lba_map_mem(struct t10_alua_lba_map *lba_map, + int pg_id, int state) +{ +struct t10_alua_lba_map_member *lba_map_mem; + +list_for_each_entry(lba_map_mem, lba_map-lba_map_mem_list, +lba_map_mem_list) { +if (lba_map_mem-lba_map_mem_alua_pg_id == pg_id) { +pr_err(Duplicate pg_id %d in lba_map\n, pg_id); +return -EINVAL; +} +} + +lba_map_mem = kmem_cache_zalloc(t10_alua_lba_map_mem_cache, GFP_KERNEL); +if (!lba_map_mem) { +pr_err(Unable to allocate struct t10_alua_lba_map_mem\n); +return -ENOMEM; +} +lba_map_mem-lba_map_mem_alua_state = state; +lba_map_mem-lba_map_mem_alua_pg_id = pg_id; + +list_add_tail(lba_map_mem-lba_map_mem_list, + lba_map-lba_map_mem_list); +return 0; +} Ditto here.. See above. + +void +core_alua_free_lba_map(struct list_head *lba_list) +{ +struct t10_alua_lba_map *lba_map, *lba_map_tmp; +struct t10_alua_lba_map_member *lba_map_mem, *lba_map_mem_tmp; + +list_for_each_entry_safe(lba_map, lba_map_tmp, lba_list, + lba_map_list) { +list_for_each_entry_safe(lba_map_mem, lba_map_mem_tmp, + lba_map-lba_map_mem_list, + lba_map_mem_list) { +list_del(lba_map_mem-lba_map_mem_list); +kmem_cache_free(t10_alua_lba_map_mem_cache, +lba_map_mem); +} +list_del(lba_map-lba_map_list); +kmem_cache_free(t10_alua_lba_map_cache, lba_map); +} +} And here.. + +void +core_alua_set_lba_map(struct se_device *dev, struct list_head *lba_map_list, + int segment_size, int segment_mult) +{ +struct list_head old_lba_map_list; +struct t10_alua_tg_pt_gp *tg_pt_gp; +int activate = 0, supported; + +INIT_LIST_HEAD(old_lba_map_list); +spin_lock(dev-t10_alua.lba_map_lock); +dev-t10_alua.lba_map_segment_size = segment_size; +dev-t10_alua.lba_map_segment_multiplier = segment_mult; +list_splice_init(dev-t10_alua.lba_map_list, old_lba_map_list); +if (lba_map_list) { +list_splice_init(lba_map_list, dev-t10_alua.lba_map_list); +activate = 1; +} +spin_unlock(dev-t10_alua.lba_map_lock); +spin_lock(dev-t10_alua.tg_pt_gps_lock); +list_for_each_entry(tg_pt_gp, dev-t10_alua.tg_pt_gps_list, +tg_pt_gp_list) { + +if (!tg_pt_gp-tg_pt_gp_valid_id) +continue; +supported = tg_pt_gp-tg_pt_gp_alua_supported_states; +if (activate) +supported |= ALUA_LBD_SUP; +else +supported = ~ALUA_LBD_SUP; +tg_pt_gp-tg_pt_gp_alua_supported_states = supported; +} +spin_unlock(dev-t10_alua.tg_pt_gps_lock); +
[Bug 62971] Kernel 3.11.1 and higher does not boot on VMware VM
https://bugzilla.kernel.org/show_bug.cgi?id=62971 --- Comment #2 from Bojan Smojver bo...@rexursive.com --- BTW, as tested in the downstream bug, the patch from the link is the fix. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 62971] Kernel 3.11.1 and higher does not boot on VMware VM
https://bugzilla.kernel.org/show_bug.cgi?id=62971 Bojan Smojver bo...@rexursive.com changed: What|Removed |Added Regression|No |Yes -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: BusLogic: Fix an oops when intializing multimaster adapter
On Wed, Sep 25, 2013 at 10:45 AM, Khalid Aziz khalid.a...@oracle.com wrote: This fixes an oops caused by buslogic driver when initializing a BusLogic MultiMaster adapter. Initialization code used scope of a variable incorrectly which created a NULL pointer. Oops message is below: BUG: unable to handle kernel NULL pointer dereference at 000c IP: [c150c137] blogic_init_mm_probeinfo.isra.17+0x20a/0x583 *pde = Oops: 002 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.11.1.puz1 #1 Hardware name:/Canterwood, BIOS 6.00 PG 05/16/2003 task: f705 ti: f7054000 task.ti: f7054000 EIP: 0060:[c150c137] EFLAGS: 00010246 CPU:1 EIP is at blogic_init_mm_probeinfo.isra.17+0x20a/0x583 EAX: 0013 EBX: ECX: EDX: f8001000 ESI: f71cb800 EDI: f7388000 EBP: 7800 ESP: f7055c84 DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 CR0: 8005003b CR2: 000c CR3: 0154f000 CR4: 07d0 Stack: 001c c11a59f6 f7055c98 8130 0003 0013 f8001000 0001 03d0 c14e3f84 f78803c8 f738c000 00e9 Call Trace: [c11a59f6] ? pci_get_subsys+0x33/0x38 [c150c4fb] ? blogic_init_probeinfo_list+0x4b/0x19e [c108d593] ? __alloc_pages_nodemask+0xe3/0x623 [c108d593] ? __alloc_pages_nodemask+0xe3/0x623 [c10fb99e] ? sysfs_link_sibling+0x61/0x8d [c10b0519] ? kmem_cache_alloc+0x8b/0xb5 [c150cce5] ? blogic_init+0xa1/0x10e8 [c10fc0a8] ? sysfs_add_one+0x10/0x9d [c10fc18a] ? sysfs_addrm_finish+0x12/0x85 [c10fca37] ? sysfs_do_create_link_sd+0x9d/0x1b4 [c117c272] ? blk_register_queue+0x69/0xb3 [c10fcb68] ? sysfs_create_link+0x1a/0x2c [c1181a07] ? add_disk+0x1a1/0x3c7 [c138737b] ? klist_next+0x60/0xc3 [c122cc3a] ? scsi_dh_detach+0x68/0x68 [c1213e36] ? bus_for_each_dev+0x51/0x61 [c1000356] ? do_one_initcall+0x22/0x12c [c10f3688] ? __proc_create+0x8c/0xba [c150cc44] ? blogic_setup+0x5f6/0x5f6 [c14e94aa] ? repair_env_string+0xf/0x4d [c14e949b] ? do_early_param+0x71/0x71 [c103efaa] ? parse_args+0x21f/0x33d [c14e9a54] ? kernel_init_freeable+0xdf/0x17d [c14e949b] ? do_early_param+0x71/0x71 [c1388b64] ? kernel_init+0x8/0xc0 [c139] ? ret_from_kernel_thread+0x6/0x28 [c1392227] ? ret_from_kernel_thread+0x1b/0x28 [c1388b5c] ? rest_init+0x6c/0x6c Code: 89 44 24 10 0f b6 44 24 3d 89 44 24 0c c7 44 24 08 00 00 00 00 c7 44 24 04 38 62 46 c1 c7 04 24 02 00 00 00 e8 78 13 d2 ff 31 db 89 6b 0c b0 20 89 ea ee c7 44 24 08 04 00 00 00 8d 44 24 4c 89 EIP: [c150c137] blogic_init_mm_probeinfo.isra.17+0x20a/0x583 SS:ESP 0068:f7055c84 CR2: 000c ---[ end trace 17f45f5196d40487 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009 Signed-off-by: Khalid Aziz khalid.a...@oracle.com Cc: sta...@vger.kernel.org # 3.11.x Cc: Khalid Aziz kha...@gonehiking.org Reported-by: Pierre Uszynski pie...@rahul.net Tested-by: Pierre Uszynski pie...@rahul.net We had a user report an issue starting VMWare guests using BusLogic with the 3.11 kernel in Fedora. They tested a kernel build based on 3.11.5 plus this patch and it fixes their issue. Details here: https://bugzilla.redhat.com/show_bug.cgi?id=1015558 You can add a Tested-by: Bojan Smojver bo...@rexursive.com if you'd like. josh --- drivers/scsi/BusLogic.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/scsi/BusLogic.c b/drivers/scsi/BusLogic.c index feab3a5..757eb07 100644 --- a/drivers/scsi/BusLogic.c +++ b/drivers/scsi/BusLogic.c @@ -696,7 +696,7 @@ static int __init blogic_init_mm_probeinfo(struct blogic_adapter *adapter) while ((pci_device = pci_get_device(PCI_VENDOR_ID_BUSLOGIC, PCI_DEVICE_ID_BUSLOGIC_MULTIMASTER, pci_device)) != NULL) { - struct blogic_adapter *adapter = adapter; + struct blogic_adapter *host_adapter = adapter; struct blogic_adapter_info adapter_info; enum blogic_isa_ioport mod_ioaddr_req; unsigned char bus; @@ -744,9 +744,9 @@ static int __init blogic_init_mm_probeinfo(struct blogic_adapter *adapter) known and enabled, note that the particular Standard ISA I/O Address should not be probed. */ - adapter-io_addr = io_addr; - blogic_intreset(adapter); - if (blogic_cmd(adapter, BLOGIC_INQ_PCI_INFO, NULL, 0, + host_adapter-io_addr = io_addr; + blogic_intreset(host_adapter); + if (blogic_cmd(host_adapter, BLOGIC_INQ_PCI_INFO, NULL, 0, adapter_info, sizeof(adapter_info)) == sizeof(adapter_info)) { if (adapter_info.isa_port 6) @@ -762,7 +762,7 @@
Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
On 10/16/2013 01:34 PM, sumit.sax...@lsi.com wrote: There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance-pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs- Hi Sumit, - I'm a bit confused here- there are two threads which might access the same array, but the problem is still there when the second thread accesses the array during the final memcpy, I have expected that you will add some locking, but maybe I'm missing something. - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Thanks, Tomas MR_EVT_PD_INSERTED MR_EVT_PD_REMOVED MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED MR_EVT_FOREIGN_CFG_IMPORTED At same time running sysPD IO will be accessing the same array instance-pd_list[], which is getting updated in AEN path, because of this IO may not get correct PD info from instance-pd_list[] array. Signed-off-by: Adam Radford adam.radf...@lsi.com Signed-off-by: Sumit Saxena sumit.sax...@lsi.com --- diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 0c73ba4..e9e543c 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -1531,6 +1531,7 @@ struct megasas_instance { struct megasas_register_set __iomem *reg_set; u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY]; struct megasas_pd_list pd_list[MEGASAS_MAX_PD]; + struct megasas_pd_list local_pd_list[MEGASAS_MAX_PD]; u8 ld_ids[MEGASAS_MAX_LD_IDS]; s8 init_id; diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index e62ff02..83ebc75 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance) (le32_to_cpu(ci-count) (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) { - memset(instance-pd_list, 0, + memset(instance-local_pd_list, 0, MEGASAS_MAX_PD * sizeof(struct megasas_pd_list)); for (pd_index = 0; pd_index le32_to_cpu(ci-count); pd_index++) { - instance-pd_list[le16_to_cpu(pd_addr-deviceId)].tid = + instance-local_pd_list[le16_to_cpu(pd_addr-deviceId)].tid = le16_to_cpu(pd_addr-deviceId); - instance-pd_list[le16_to_cpu(pd_addr-deviceId)].driveType = + instance-local_pd_list[le16_to_cpu(pd_addr-deviceId)].driveType = pd_addr-scsiDevType; - instance-pd_list[le16_to_cpu(pd_addr-deviceId)].driveState= + instance-local_pd_list[le16_to_cpu(pd_addr-deviceId)].driveState = MR_PD_STATE_SYSTEM; pd_addr++; } } + memcpy(instance-pd_list, instance-local_pd_list, + sizeof(instance-pd_list)); pci_free_consistent(instance-pdev, MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST), ci, ci_h); -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] scsi: Add 'eh_deadline' to limit SCSI EH runtime
On Wed, 2013-10-16 at 19:22 +, James Bottomley wrote: What about instead: static int scsi_host_eh_past_deadline(struct Scsi_Host *shost, int percent) { if (!shost-last_reset || !shost-eh_deadline) return 0; if (time_before(jiffies, shost-last_reset + shost-eh_deadline * percent/100)) return 0; return 1; } which allows us to have if (scsi_host_eh_past_deadline(shost, 50)) { in scsi_eh_abort_cmds() if (scsi_host_eh_past_deadline(shost, 66) { in scsi_eh_bus_device_reset() say 83 in target reset, and 100 in bus reset. Thus ensuring we at least get a crack at the reset chain? James Well, conceptually that seems like a good idea, since if there is limited time available it is probably wiser to spend it on higher-level recovery instead of timing out trying to deal with individual devices, but we didn't do any testing on how long the bus device reset/target reset/bus reset take. The host reset was about 10 seconds for lpfc, and the maximum time was (command timeout) + (eh deadline) + (host reset time). However... With this enhancement, the maximum time could be much longer if we attempt to e.g. perform a bus reset right before the eh_deadline expires, because drivers like lpfc iterate over the targets and send a target reset to each one (with a timeout). The original problem that prompted this change was that a target became inaccessible, and nothing the EH did was ever going to do anything except timeout, until the host reset was performed, at which point the FC login would fail and the HBA would start failing commands immediately instead of them timing out. I guess the main thing is that there should be some way to explain to people what value to use for eh_deadline in order for I/O to complete within a specified amount of time (e.g. before some other node in a cluster shoots us because we are unresponsive). -Ewan -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
-Original Message- From: Tomas Henzl [mailto:the...@redhat.com] Sent: Thursday, October 17, 2013 7:35 PM To: Saxena, Sumit; linux-scsi@vger.kernel.org Cc: jbottom...@parallels.com; Desai, Kashyap; aradf...@gmail.com Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path On 10/16/2013 01:34 PM, sumit.sax...@lsi.com wrote: There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance-pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs- Hi Sumit, - I'm a bit confused here- there are two threads which might access the same array, but the problem is still there when the second thread accesses the array during the final memcpy, I have expected that you will add some locking, but maybe I'm missing something. - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Tomas, Having lock to synchronize this will be a good choice, but will need changes in multiple places. Without this patch: driver memsets instance-pd_list[] array to zero, same array will be accessed in sysPD IO path, that creates problem. To resolve this issue, we introduced a new array instance-local_pd_list[] array, which we will be filling from Firmware DMAed data and then finally memcpy that array to the instance-pd_list[]. Since instance-pd_list is accessed in IO path, then no problem in memset zero here(memset is on instance-local_pd_list). Final Memcpy operation is not saved with locking, reason is: instance-pd_list array is of type struct megasas_pd_list, which is of 32-bit, so single entry in instance-local_pd_list array will be copied in one CPU cycle, and with current MR FW design, it will not be a problem even if IO path (or any other thread) is accessing old instance-pd_list[]. so we are safe here in memcpy() here. Adding lock will add overhead in IO path, which could be avoided is main reason to resolve this issue with this fix. Thanks, Sumit Thanks, Tomas MR_EVT_PD_INSERTED MR_EVT_PD_REMOVED MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED MR_EVT_FOREIGN_CFG_IMPORTED At same time running sysPD IO will be accessing the same array instance-pd_list[], which is getting updated in AEN path, because of this IO may not get correct PD info from instance-pd_list[] array. Signed-off-by: Adam Radford adam.radf...@lsi.com Signed-off-by: Sumit Saxena sumit.sax...@lsi.com --- diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 0c73ba4..e9e543c 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -1531,6 +1531,7 @@ struct megasas_instance { struct megasas_register_set __iomem *reg_set; u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY]; struct megasas_pd_list pd_list[MEGASAS_MAX_PD]; +struct megasas_pd_list local_pd_list[MEGASAS_MAX_PD]; u8 ld_ids[MEGASAS_MAX_LD_IDS]; s8 init_id; diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index e62ff02..83ebc75 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance) (le32_to_cpu(ci-count) (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) { -memset(instance-pd_list, 0, +memset(instance-local_pd_list, 0, MEGASAS_MAX_PD * sizeof(struct megasas_pd_list)); for (pd_index = 0; pd_index le32_to_cpu(ci-count); pd_index++) { -instance-pd_list[le16_to_cpu(pd_addr-deviceId)].tid = +instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].tid= le16_to_cpu(pd_addr-deviceId); -instance-pd_list[le16_to_cpu(pd_addr- deviceId)].driveType = +instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].driveType = pd_addr-scsiDevType; -instance-pd_list[le16_to_cpu(pd_addr- deviceId)].driveState = +instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].driveState = MR_PD_STATE_SYSTEM; pd_addr++; } } +memcpy(instance-pd_list, instance-local_pd_list, +sizeof(instance-pd_list)); pci_free_consistent(instance-pdev, MEGASAS_MAX_PD * sizeof(struct MR_PD_LIST), ci, ci_h); -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org
Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
On 10/17/2013 05:10 PM, Saxena, Sumit wrote: -Original Message- From: Tomas Henzl [mailto:the...@redhat.com] Sent: Thursday, October 17, 2013 7:35 PM To: Saxena, Sumit; linux-scsi@vger.kernel.org Cc: jbottom...@parallels.com; Desai, Kashyap; aradf...@gmail.com Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path On 10/16/2013 01:34 PM, sumit.sax...@lsi.com wrote: There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance-pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs- Hi Sumit, - I'm a bit confused here- there are two threads which might access the same array, but the problem is still there when the second thread accesses the array during the final memcpy, I have expected that you will add some locking, but maybe I'm missing something. - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Tomas, Having lock to synchronize this will be a good choice, but will need changes in multiple places. Without this patch: driver memsets instance-pd_list[] array to zero, same array will be accessed in sysPD IO path, that creates problem. To resolve this issue, we introduced a new array instance-local_pd_list[] array, which we will be filling from Firmware DMAed data and then finally memcpy that array to the instance-pd_list[]. Since instance-pd_list is accessed in IO path, then no problem in memset zero here(memset is on instance-local_pd_list). Final Memcpy operation is not saved with locking, reason is: instance-pd_list array is of type struct megasas_pd_list, which is of 32-bit, so single entry in instance-local_pd_list array will be copied in one CPU cycle, and with current MR FW design, it will not be a problem even if IO path (or any other thread) is accessing old instance-pd_list[]. so we are safe here in memcpy() here. Adding lock will add overhead in IO path, which could be avoided is main reason to resolve this issue with this fix. Thanks, what remains is my second question - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Thanks, Sumit Thanks, Tomas MR_EVT_PD_INSERTED MR_EVT_PD_REMOVED MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED MR_EVT_FOREIGN_CFG_IMPORTED At same time running sysPD IO will be accessing the same array instance-pd_list[], which is getting updated in AEN path, because of this IO may not get correct PD info from instance-pd_list[] array. Signed-off-by: Adam Radford adam.radf...@lsi.com Signed-off-by: Sumit Saxena sumit.sax...@lsi.com --- diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 0c73ba4..e9e543c 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -1531,6 +1531,7 @@ struct megasas_instance { struct megasas_register_set __iomem *reg_set; u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY]; struct megasas_pd_list pd_list[MEGASAS_MAX_PD]; + struct megasas_pd_list local_pd_list[MEGASAS_MAX_PD]; u8 ld_ids[MEGASAS_MAX_LD_IDS]; s8 init_id; diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index e62ff02..83ebc75 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance) (le32_to_cpu(ci-count) (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) { - memset(instance-pd_list, 0, + memset(instance-local_pd_list, 0, MEGASAS_MAX_PD * sizeof(struct megasas_pd_list)); for (pd_index = 0; pd_index le32_to_cpu(ci-count); pd_index++) { - instance-pd_list[le16_to_cpu(pd_addr-deviceId)].tid = + instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].tid = le16_to_cpu(pd_addr-deviceId); - instance-pd_list[le16_to_cpu(pd_addr- deviceId)].driveType= + instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].driveType= pd_addr-scsiDevType; - instance-pd_list[le16_to_cpu(pd_addr- deviceId)].driveState = + instance-local_pd_list[le16_to_cpu(pd_addr- deviceId)].driveState = MR_PD_STATE_SYSTEM; pd_addr++; } } + memcpy(instance-pd_list, instance-local_pd_list, +
Re: kernel cant access SATA adapter device
What is the version of mptsas driver? It seems that the mptsas's driver can't recognize the the type of SD card and treat it as normal hard disk drive. On Wed, Oct 16, 2013 at 03:46:51PM -0400, Robert Story wrote: Hi, I've got a SATA adapter for a SD card in a Dell Poweredge R610. The BIOS can see and boot the sdcard, (to Windows, or using syslinux/extlinux), but when it hands over control to the Linux kernel, the kernel cannot access it. Here's an excerpt from dmesg: scsi 0:0:2:0: Direct-Access ATA FC-1307 SD to CF 1.1 PQ: 0 ANSI: 5 sd 0:0:2:0: Attached scsi generic sg3 type 0 mptsas: ioc0: mptsas_free_fw_event: kfree (fw_event=0x8801ab1ad8c0) sd 0:0:2:0: [sdc] Spinning up disk .not responding... sd 0:0:2:0: [sdc] READ CAPACITY(16) failed sd 0:0:2:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE sd 0:0:2:0: [sdc] Sense Key : Not Ready [current] sd 0:0:2:0: [sdc] Add. Sense: Logical unit not ready, initializing command required sd 0:0:2:0: [sdc] READ CAPACITY failed sd 0:0:2:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE sd 0:0:2:0: [sdc] Sense Key : Not Ready [current] sd 0:0:2:0: [sdc] Add. Sense: Logical unit not ready, initializing command required sd 0:0:2:0: [sdc] Test WP failed, assume Write Enabled sd 0:0:2:0: [sdc] Asking for cache data failed sd 0:0:2:0: [sdc] Assuming drive cache: write through After booting and running 'sdparm --command=start /dev/sdc' I can get a read capacity to work, but cannot access the drive (e.g. fdisk -l /dev/sdc). If I put the SD card in a USB adapter, it works, and I was able to access the SD card via the SD adapter in another machine. But our production machines are dell's, so I really want to get it working there. I've attached more complete debug info from booting and existing CentOS 6.4 install with the card installed, and a rdsosreport from an attempt to boot Fedora 20 Alpha. Any help or suggestions greatly appreciated. Robert -- Senior Software Engineer Parsons Government Services , National Security Defense Division -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path
-Original Message- From: Tomas Henzl [mailto:the...@redhat.com] Sent: Thursday, October 17, 2013 9:18 PM To: Saxena, Sumit; linux-scsi@vger.kernel.org Cc: jbottom...@parallels.com; Desai, Kashyap; aradf...@gmail.com Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path On 10/17/2013 05:10 PM, Saxena, Sumit wrote: -Original Message- From: Tomas Henzl [mailto:the...@redhat.com] Sent: Thursday, October 17, 2013 7:35 PM To: Saxena, Sumit; linux-scsi@vger.kernel.org Cc: jbottom...@parallels.com; Desai, Kashyap; aradf...@gmail.com Subject: Re: [PATCH][SCSI] megaraid_sas: Fix synchronization problem between sysPD IO path and AEN path On 10/16/2013 01:34 PM, sumit.sax...@lsi.com wrote: There is syncronization problem between sysPD IO path and AEN path. Driver maintains instance-pd_list[] array, which will get updated(by calling function megasas_get_pd_list[]), whenever any of below events occurs- Hi Sumit, - I'm a bit confused here- there are two threads which might access the same array, but the problem is still there when the second thread accesses the array during the final memcpy, I have expected that you will add some locking, but maybe I'm missing something. - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Tomas, Having lock to synchronize this will be a good choice, but will need changes in multiple places. Without this patch: driver memsets instance-pd_list[] array to zero, same array will be accessed in sysPD IO path, that creates problem. To resolve this issue, we introduced a new array instance- local_pd_list[] array, which we will be filling from Firmware DMAed data and then finally memcpy that array to the instance-pd_list[]. Since instance-pd_list is accessed in IO path, then no problem in memset zero here(memset is on instance-local_pd_list). Final Memcpy operation is not saved with locking, reason is: instance-pd_list array is of type struct megasas_pd_list, which is of 32-bit, so single entry in instance-local_pd_list array will be copied in one CPU cycle, and with current MR FW design, it will not be a problem even if IO path (or any other thread) is accessing old instance- pd_list[]. so we are safe here in memcpy() here. Adding lock will add overhead in IO path, which could be avoided is main reason to resolve this issue with this fix. Thanks, what remains is my second question - now the code zeroes the pd_list even when the (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true. This is I think new - is that intentional? Thanks for pointing this out, it's unintentional and memcpy() should be done only when (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is true, it did not cause problem because if (ret == 0 (ci-count (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) is not true, still driver will memcpy instance-local_pd_list to instance-pd_list, inspite of both arrays being same(extra overhead of memcpy, which is not needed). I will send out the updated patch. Thanks, Sumit Thanks, Tomas MR_EVT_PD_INSERTED MR_EVT_PD_REMOVED MR_EVT_CTRL_HOST_BUS_SCAN_REQUESTED MR_EVT_FOREIGN_CFG_IMPORTED At same time running sysPD IO will be accessing the same array instance-pd_list[], which is getting updated in AEN path, because of this IO may not get correct PD info from instance-pd_list[] array. Signed-off-by: Adam Radford adam.radf...@lsi.com Signed-off-by: Sumit Saxena sumit.sax...@lsi.com --- diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 0c73ba4..e9e543c 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -1531,6 +1531,7 @@ struct megasas_instance { struct megasas_register_set __iomem *reg_set; u32 *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY]; struct megasas_pd_list pd_list[MEGASAS_MAX_PD]; + struct megasas_pd_list local_pd_list[MEGASAS_MAX_PD]; u8 ld_ids[MEGASAS_MAX_LD_IDS]; s8 init_id; diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index e62ff02..83ebc75 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3194,21 +3194,23 @@ megasas_get_pd_list(struct megasas_instance *instance) (le32_to_cpu(ci-count) (MEGASAS_MAX_PD_CHANNELS * MEGASAS_MAX_DEV_PER_CHANNEL))) { - memset(instance-pd_list, 0, + memset(instance-local_pd_list, 0, MEGASAS_MAX_PD * sizeof(struct megasas_pd_list)); for (pd_index = 0; pd_index le32_to_cpu(ci-count); pd_index++) { - instance-pd_list[le16_to_cpu(pd_addr-deviceId)].tid
Re: kernel cant access SATA adapter device
On Thu, 17 Oct 2013 23:59:24 +0800 taco wrote: T What is the version of mptsas driver? modinfo mptsas reports verion 3.04.20. Kernel is 2.6.32-358 (RHEL 6.3). T It seems that the mptsas's driver can't recognize the the type of SD T card and treat it as normal hard disk drive. Ok. So how would I figure out what mptsas is looking at to determine how to treat it? Or is there some kernel param I can pass that says sda=sdcard or soemthing? Robert -- Senior Software Engineer Parsons Government Services , National Security Defense Division signature.asc Description: PGP signature
Re: eSATA Drive Detection issues on mvsas
On Wed, Oct 16, 2013 at 10:28 AM, Praveen Murali pmur...@logicube.com wrote: diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index 161c98efade9..d0fb99d5da95 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -211,7 +211,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc) qc-tf.nsect = 0; } - ata_tf_to_fis(qc-tf, 1, 0, (u8*)task-ata_task.fis); + ata_tf_to_fis(qc-tf, qc-dev-link-pmp, 1, (u8*)task-ata_task.fis); task-uldd_task = qc; if (ata_is_atapi(qc-tf.protocol)) { memcpy(task-ata_task.atapi_packet, qc-cdb, qc-dev-cdb_len); Hi Dan, I tested this patch and it works great! Thanks! I'll send it up with your Reported-by and Tested-by. -- Dan -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RESEND v2 0/2] SATA disk resume time optimization
The essential issue behind hard disks' lengthy resume time is the ata port driver blocking until the ATA port hardware is finished coming online. So the kernel isn't really doing anything during all those seconds that the disks are resuming, it's just blocking until the hardware says it's ready to accept commands. Applying this patch set allows SATA disks to resume asynchronously without holding up system resume, thus allowing the UI to come online much more quickly. There may be a short period after resume where the disks are still spinning up in the background, but the user shouldn't notice since the OS can function with the data left in RAM. The patch set has two parts which apply to ata_port_resume and sd_resume respectively. Both are required to achieve any real performance benefit, but they will still function independantly without a performance hit. ata_port_resume patch (1/2): On resume, the ATA port driver currently waits until the AHCI controller finishes executing the port wakeup command. This patch changes the ata_port_resume callback to issue the wakeup and then return immediately, thus allowing the next device in the pm queue to resume. Any commands issued to the AHCI hardware during the wakeup will be queued up and executed once the port is physically online. Thus no information is lost. sd_resume patch (2/2): On resume, the SD driver currently waits until the block driver finishes executing a disk start command with blk_execute_rq. This patch changes the sd_resume callback to use blk_execute_rq_nowait instead, which allows it to return immediately, thus allowing the next device in the pm queue to resume. The return value of blk_execute_rq_nowait is handled in the background by sd_resume_complete. Any commands issued to the scsi disk during the startup will be queued up and executed once the disk is online. Thus no information is lost. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RESEND v2 2/2] SATA disk resume time optimization, asynchronous sd_resume
On resume, the SD driver currently waits until the block driver finishes executing a disk start command with blk_execute_rq. This patch changes the sd_resume callback to use blk_execute_rq_nowait instead, which allows it to return immediately, thus allowing the next device in the pm queue to resume. The return value of blk_execute_rq_nowait is handled in the background by sd_resume_complete. Any commands issued to the scsi disk during the startup will be queued up and executed once the disk is online. Thus no information is lost, and although the wait time itself isn't removed, it doesn't hold up the rest of the system. In combination with the ata_port_resume patch, this patch greatly reduces S3 system resume time on systems with SATA drives. This is accomplished by removing the drive spinup time from the system resume delay. Applying these two patches allows SATA disks to resume asynchronously without holding up system resume; thus allowing the UI to come online sooner. There may be a short period after resume where the disks are still spinning up in the background, but the user shouldn't notice since the OS can function with the data left in RAM. This patch applies to all three resume callbacks: resume, restore, and runtime-resume. There is only a performance benefit for resume, but for simplicity both restore and runtime-resume use the same code path. Signed-off-by: Todd Brandt todd.e.bra...@intel.com Signed-off-by: Arjan van de Ven ar...@linux.intel.com drivers/scsi/sd.c | 70 +++--- 1 file changed, 67 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e62d17d..bd8411f 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3117,18 +3117,82 @@ done: return ret; } +static void sd_resume_complete(struct request *rq, int error) +{ + struct scsi_sense_hdr sshdr; + struct scsi_disk *sdkp = rq-end_io_data; + char *sense = rq-sense; + + if (error) { + sd_printk(KERN_WARNING, sdkp, START FAILED\n); + sd_print_result(sdkp, error); + if (sense (driver_byte(error) DRIVER_SENSE)) { + scsi_normalize_sense(sense, + SCSI_SENSE_BUFFERSIZE, sshdr); + sd_print_sense_hdr(sdkp, sshdr); + } + } else { + sd_printk(KERN_NOTICE, sdkp, START SUCCESS\n); + } + + kfree(sense); + rq-sense = NULL; + rq-end_io_data = NULL; + __blk_put_request(rq-q, rq); + scsi_disk_put(sdkp); +} + static int sd_resume(struct device *dev) { + unsigned char cmd[6] = { START_STOP }; struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev); + struct request *req; + char *sense = NULL; int ret = 0; if (!sdkp-device-manage_start_stop) - goto done; + goto error; sd_printk(KERN_NOTICE, sdkp, Starting disk\n); - ret = sd_start_stop_device(sdkp, 1); -done: + cmd[4] |= 1; + + if (sdkp-device-start_stop_pwr_cond) + cmd[4] |= 1 4; /* Active or Standby */ + + if (!scsi_device_online(sdkp-device)) { + ret = -ENODEV; + goto error; + } + + req = blk_get_request(sdkp-device-request_queue, 0, __GFP_WAIT); + if (!req) { + ret = -ENOMEM; + goto error; + } + + sense = kzalloc(SCSI_SENSE_BUFFERSIZE, GFP_NOIO); + if (!sense) { + ret = -ENOMEM; + goto error_sense; + } + + req-cmd_len = COMMAND_SIZE(cmd[0]); + memcpy(req-cmd, cmd, req-cmd_len); + req-sense = sense; + req-sense_len = 0; + req-retries = SD_MAX_RETRIES; + req-timeout = SD_TIMEOUT; + req-cmd_type = REQ_TYPE_BLOCK_PC; + req-cmd_flags |= REQ_PM | REQ_QUIET | REQ_PREEMPT; + + req-end_io_data = sdkp; + blk_execute_rq_nowait(req-q, NULL, req, 1, sd_resume_complete); + return 0; + + error_sense: + __blk_put_request(req-q, req); + error: scsi_disk_put(sdkp); return ret; } -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] remove cpqarray from mainline kernel
On Thu, 17 Oct 2013 12:52:26 -0500 Mike Miller mike.mil...@hp.com wrote: cpqarray hasn't been used in over 12 years. It's doubtful that anyone still uses the board. It's time the driver was removed from the mainline kernel. The only updates these days are minor and mostly done by people outside of HP. It's amazing the weird stuff people get up to. Perhaps we should disable it in config for a cycle or two, see if that flushes anyone out? -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] remove cpqarray from mainline kernel
On 10/17/2013 07:52 PM, Mike Miller wrote: From: Mike Miller mike.mil...@hp.com cpqarray hasn't been used in over 12 years. It's doubtful that anyone still uses the board. It's time the driver was removed from the mainline kernel. The only updates these days are minor and mostly done by people outside of HP. Signed-off-by: Mike Miller mike.mil...@hp.com Finally. All thumbs up for this. Acked-by: Hannes Reinecke h...@suse.de Cheers, Hannes -- Dr. Hannes Reinecke zSeries Storage h...@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg) -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html