Re: [PATCH v7 06/11] scsi: ufshpb: Region inactivation in host mode
On 2021-04-06 14:16, Avri Altman wrote:
>> On 2021-04-06 13:20, Avri Altman wrote:
>>>>> -static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>>>> -				  struct ufshpb_region *rgn)
>>>>> +static int __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>>>> +				 struct ufshpb_region *rgn)
>>>>>  {
>>>>>  	struct victim_select_info *lru_info;
>>>>>  	struct ufshpb_subregion *srgn;
>>>>>  	int srgn_idx;
>>>>>
>>>>> +	lockdep_assert_held(&hpb->rgn_state_lock);
>>>>> +
>>>>> +	if (hpb->is_hcm) {
>>>>> +		unsigned long flags;
>>>>> +		int ret;
>>>>> +
>>>>> +		spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>>>>
>>>> Never seen a usage like this... Here flags is used without being
>>>> initialized. The flag is needed when spin_unlock_irqrestore ->
>>>> local_irq_restore(flags) to restore the DAIF register (in terms of ARM).
>>>
>>> OK.
>>
>> Hi Avri,
>>
>> Checked on my setup, this leads to a compilation error. Will you fix it
>> in next version?
>>
>> warning: variable 'flags' is uninitialized when used here
>> [-Wuninitialized]
>
> Yeah - I will pass it to __ufshpb_evict_region and drop the
> lockdep_assert call.

Please paste the sample code/change here so that I can move forward
quickly.

> I don't want to block your testing - are there any other things you want
> me to change?

Currently, no. I will try to review and test this series these days and
post comments at once.

Thanks,
Can Guo.

> Thanks,
> Avri
>
>> Thanks,
>> Can Guo.
>>
>>> Thanks,
>>> Avri
>>>
>>>> Thanks,
>>>> Can Guo.
>>>>
>>>>> +		ret = ufshpb_issue_umap_single_req(hpb, rgn);
>>>>> +		spin_lock_irqsave(&hpb->rgn_state_lock, flags);
>>>>> +		if (ret)
>>>>> +			return ret;
>>>>> +	}
>>>>> +
>>>>>  	lru_info = &hpb->lru_info;
>>>>>
>>>>>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n", rgn->rgn_idx);
>>>>> @@ -1130,6 +1150,8 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>>>>
>>>>>  	for_each_sub_region(rgn, srgn_idx, srgn)
>>>>>  		ufshpb_purge_active_subregion(hpb, srgn);
>>>>> +
>>>>> +	return 0;
>>>>>  }
>>>>>
>>>>>  static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>>>> @@ -1151,7 +1173,7 @@ static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>>>>  			goto out;
>>>>>  		}
>>>>>
>>>>> -		__ufshpb_evict_region(hpb, rgn);
>>>>> +		ret = __ufshpb_evict_region(hpb, rgn);
>>>>>  	}
>>>>>  out:
>>>>>  	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>>>>> @@ -1285,7 +1307,9 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>>>>  				"LRU full (%d), choose victim %d\n",
>>>>>  				atomic_read(&lru_info->active_cnt),
>>>>>  				victim_rgn->rgn_idx);
>>>>> -			__ufshpb_evict_region(hpb, victim_rgn);
>>>>> +			ret = __ufshpb_evict_region(hpb, victim_rgn);
>>>>> +			if (ret)
>>>>> +				goto out;
>>>>>  		}
>>>>>
>>>>>  		/*
>>>>> @@ -1856,6 +1880,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>>>>>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>>>>>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>>>>>  ufshpb_sysfs_attr_show_func(map_req_cnt);
>>>>> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>>>>>
>>>>>  static struct attribute *hpb_dev_stat_attrs[] = {
>>>>>  	&dev_attr_hit_cnt.attr,
>>>>> @@ -1864,6 +1889,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>>>>>  	&dev_attr_rb_active_cnt.attr,
>>>>>  	&dev_attr_rb_inactive_cnt.attr,
>>>>>  	&dev_attr_map_req_cnt.attr,
>>>>> +	&dev_attr_umap_req_cnt.attr,
>>>>>  	NULL,
>>>>>  };
>>>>>
>>>>> @@ -1988,6 +2014,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>>>>>  	hpb->stats.rb_active_cnt = 0;
>>>>>  	hpb->stats.rb_inactive_cnt = 0;
>>>>>  	hpb->stats.map_req_cnt = 0;
>>>>> +	hpb->stats.umap_req_cnt = 0;
>>>>>  }
>>>>>
>>>>>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
>>>>> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
>>>>> index 87495e59fcf1..1ea58c17a4de 100644
>>>>> --- a/drivers/scsi/ufs/ufshpb.h
>>>>> +++ b/drivers/scsi/ufs/ufshpb.h
>>>>> @@ -191,6 +191,7 @@ struct ufshpb_stats {
>>>>>  	u64 rb_inactive_cnt;
>>>>>  	u64 map_req_cnt;
>>>>>  	u64 pre_req_cnt;
>>>>> +	u64 umap_req_cnt;
>>>>>  };
>>>>>
>>>>>  struct ufshpb_lu {
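The fix discussed in this thread (pass the caller's saved flags into the helper instead of declaring a fresh, uninitialized variable) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `fake_lock`, `irqs_enabled`, `fake_lock_irqsave()`, `fake_unlock_irqrestore()`, `issue_unmap()`, `evict_region()` and `evict_region_locked()` are all hypothetical stand-ins, not kernel APIs, and this is not the code that was posted in a later version of the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one lock and one global "interrupts enabled" bit. */
struct fake_lock { bool held; };
static bool irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the old irq state, then disable. */
static void fake_lock_irqsave(struct fake_lock *l, unsigned long *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
	assert(!l->held);
	l->held = true;
}

/* Models spin_unlock_irqrestore(): flags must hold a previously saved state. */
static void fake_unlock_irqrestore(struct fake_lock *l, unsigned long flags)
{
	assert(l->held);
	l->held = false;
	irqs_enabled = (flags != 0);
}

/* Stand-in for the unmap request, issued with the lock dropped as in the patch. */
static int issue_unmap(struct fake_lock *l)
{
	assert(!l->held);
	return 0;
}

/*
 * Helper called with the lock held. It receives the caller's flags by
 * pointer, so the unlock restores a value that was really saved earlier.
 */
static int evict_region(struct fake_lock *l, unsigned long *flags)
{
	int ret;

	fake_unlock_irqrestore(l, *flags);
	ret = issue_unmap(l);
	fake_lock_irqsave(l, flags);
	return ret;
}

/* Outer function: takes the lock, calls the helper, releases the lock. */
static int evict_region_locked(struct fake_lock *l)
{
	unsigned long flags;
	int ret;

	fake_lock_irqsave(l, &flags);
	ret = evict_region(l, &flags);
	fake_unlock_irqrestore(l, flags);
	return ret;
}
```

With this shape the saved state always originates from a matching save call, so the model leaves the interrupt state exactly as it found it, whether interrupts were enabled or disabled on entry.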
Re: [PATCH 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
On 2021-04-06 13:58, Daejun Park wrote:
> Hi Can Guo,
>
>> Hi Daejun,
>>
>> On 2021-04-06 12:11, Daejun Park wrote:
>>> Hi Can Guo,
>>>
>>>> +static ssize_t monitor_enable_store(struct device *dev,
>>>> +				    struct device_attribute *attr,
>>>> +				    const char *buf, size_t count)
>>>> +{
>>>> +	struct ufs_hba *hba = dev_get_drvdata(dev);
>>>> +	unsigned long value, flags;
>>>> +
>>>> +	if (kstrtoul(buf, 0, &value))
>>>> +		return -EINVAL;
>>>> +
>>>> +	value = !!value;
>>>> +	spin_lock_irqsave(hba->host->host_lock, flags);
>>>> +	if (value == hba->monitor.enabled)
>>>> +		goto out_unlock;
>>>> +
>>>> +	if (!value) {
>>>> +		memset(&hba->monitor, 0, sizeof(hba->monitor));
>>>> +	} else {
>>>> +		hba->monitor.enabled = true;
>>>> +		hba->monitor.enabled_ts = ktime_get();
>>>
>>> How about setting lat_max and lat_min to KTIME_MAX and 0?
>>
>> lat_min is already 0. What is the benefit of setting lat_max to
>> KTIME_MAX?
>>
>>> I think lat_sum should be 0 at this point.
>>
>> lat_sum is already 0 at this point, what is the problem?
>
> Sorry. I misunderstood about resetting monitor values.
>
>>>> +	}
>>>> +
>>>> +out_unlock:
>>>> +	spin_unlock_irqrestore(hba->host->host_lock, flags);
>>>> +	return count;
>>>> +}
>>>>
>>>> +static void ufshcd_update_monitor(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>>>> +{
>>>> +	int dir = ufshcd_monitor_opcode2dir(*lrbp->cmd->cmnd);
>>>> +
>>>> +	if (dir >= 0 && hba->monitor.nr_queued[dir] > 0) {
>>>> +		struct request *req = lrbp->cmd->request;
>>>> +		struct ufs_hba_monitor *m = &hba->monitor;
>>>> +		ktime_t now, inc, lat;
>>>> +
>>>> +		now = ktime_get();
>>>
>>> How about using lrbp->compl_time_stamp instead of getting new value?
>>
>> I am expecting "now" keeps increasing and use it to update
>> m->busy_start_ts, but if I use lrbp->compl_time_stamp to do that, below
>> line ktime_sub() may give me an unexpected value as
>> lrbp->compl_time_stamp may be smaller than m->busy_start_ts, because the
>> actual requests are not completed by the device in the exact same
>> ordering as the bits set in hba->outstanding_tasks, but driver is
>> completing them from bit 0 to bit 31 in ascending order.
>
> lrbp->compl_time_stamp is set just before calling
> ufshcd_update_monitor(). And I don't think it can be negative value,
> because ufshcd_send_command() and __ufshcd_transfer_req_compl() are
> protected by host lock.

Yes, I replied to you in another mail... I will use the compl_time_stamp
in next version. And later I will add alloc_time_stamp and
release_time_stamp to lrbp so that we can monitor the overall send/compl
path, including hpb_prep() and hpb_rsp().

>>>> +		inc = ktime_sub(now, m->busy_start_ts[dir]);
>>>> +		m->total_busy[dir] = ktime_add(m->total_busy[dir], inc);
>>>> +		m->nr_sec_rw[dir] += blk_rq_sectors(req);
>>>> +
>>>> +		/* Update latencies */
>>>> +		m->nr_req[dir]++;
>>>> +		lat = ktime_sub(now, lrbp->issue_time_stamp);
>>>> +		m->lat_sum[dir] += lat;
>>>> +		if (m->lat_max[dir] < lat || !m->lat_max[dir])
>>>> +			m->lat_max[dir] = lat;
>>>> +		if (m->lat_min[dir] > lat || !m->lat_min[dir])
>>>> +			m->lat_min[dir] = lat;
>>>
>>> This if statement can be shortened, by setting lat_max / lat_min as
>>> default value.
>>
>> I don't quite get it, can you show me the code sample?
>
> I think " || !m->lat_max[dir]" can be removed.
>
> if (m->lat_max[dir] < lat)
> 	m->lat_max[dir] = lat;
> if (m->lat_min[dir] > lat)
> 	m->lat_min[dir] = lat;

From the beginning, lat_min is 0, without "!m->lat_min[dir]", m->lat_min
will never be updated. Same for lat_max. Meanwhile, !m->lat_min/max will
be hit only once in each round, which does not hurt.

Thanks,
Can Guo.

> Thanks,
> Daejun
>
>> Thanks,
>> Can Guo
>>
>>>> +
>>>> +		m->nr_queued[dir]--;
>>>> +		/* Push forward the busy start of monitor */
>>>> +		m->busy_start_ts[dir] = now;
>>>> +	}
>>>> +}
>>>
>>> Thanks,
>>> Daejun
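The two ways of tracking min/max latency debated above can be compared in a small userspace sketch. Everything here is hypothetical illustration: `struct lat_stats` and the `lat_*` helpers are stand-ins using plain `int64_t` rather than `ktime_t`, and positive latencies are assumed. In this model the zero-sentinel check matters only for `lat_min`; a zero-initialized `lat_max` is updated correctly by the plain comparison as long as latencies are positive.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-direction latency fields. */
struct lat_stats {
	int64_t lat_min;
	int64_t lat_max;
	int64_t lat_sum;
	uint64_t nr_req;
};

/* Variant A: zero-initialized fields, with 0 treated as "not set yet". */
static void lat_update_sentinel(struct lat_stats *s, int64_t lat)
{
	s->nr_req++;
	s->lat_sum += lat;
	if (s->lat_max < lat || !s->lat_max)
		s->lat_max = lat;
	if (s->lat_min > lat || !s->lat_min)
		s->lat_min = lat;
}

/* Variant B setup: lat_min starts at the largest value, lat_max at 0. */
static void lat_init_defaults(struct lat_stats *s)
{
	s->lat_min = INT64_MAX;
	s->lat_max = 0;
	s->lat_sum = 0;
	s->nr_req = 0;
}

/* Plain comparisons, with no sentinel checks. */
static void lat_update_plain(struct lat_stats *s, int64_t lat)
{
	s->nr_req++;
	s->lat_sum += lat;
	if (s->lat_max < lat)
		s->lat_max = lat;
	if (s->lat_min > lat)
		s->lat_min = lat;
}
```

A trade-off of variant B is that `lat_min` reads as `INT64_MAX` until the first request completes, so a reader of the value would have to special-case an empty sample set. Calling `lat_update_plain()` on a zero-initialized struct leaves `lat_min` stuck at 0, which is the failure the sentinel check guards against.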
Re: [PATCH v7 06/11] scsi: ufshpb: Region inactivation in host mode
On 2021-04-06 13:20, Avri Altman wrote:
>>> -static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>> -				  struct ufshpb_region *rgn)
>>> +static int __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>> +				 struct ufshpb_region *rgn)
>>>  {
>>>  	struct victim_select_info *lru_info;
>>>  	struct ufshpb_subregion *srgn;
>>>  	int srgn_idx;
>>>
>>> +	lockdep_assert_held(&hpb->rgn_state_lock);
>>> +
>>> +	if (hpb->is_hcm) {
>>> +		unsigned long flags;
>>> +		int ret;
>>> +
>>> +		spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>>
>> Never seen a usage like this... Here flags is used without being
>> initialized. The flag is needed when spin_unlock_irqrestore ->
>> local_irq_restore(flags) to restore the DAIF register (in terms of ARM).
>
> OK.

Hi Avri,

Checked on my setup, this leads to a compilation error. Will you fix it
in next version?

warning: variable 'flags' is uninitialized when used here
[-Wuninitialized]

Thanks,
Can Guo.

> Thanks,
> Avri
>
>> Thanks,
>> Can Guo.
>>
>>> +		ret = ufshpb_issue_umap_single_req(hpb, rgn);
>>> +		spin_lock_irqsave(&hpb->rgn_state_lock, flags);
>>> +		if (ret)
>>> +			return ret;
>>> +	}
>>> +
>>>  	lru_info = &hpb->lru_info;
>>>
>>>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n", rgn->rgn_idx);
>>> @@ -1130,6 +1150,8 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>>
>>>  	for_each_sub_region(rgn, srgn_idx, srgn)
>>>  		ufshpb_purge_active_subregion(hpb, srgn);
>>> +
>>> +	return 0;
>>>  }
>>>
>>>  static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>> @@ -1151,7 +1173,7 @@ static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>>  			goto out;
>>>  		}
>>>
>>> -		__ufshpb_evict_region(hpb, rgn);
>>> +		ret = __ufshpb_evict_region(hpb, rgn);
>>>  	}
>>>  out:
>>>  	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>>> @@ -1285,7 +1307,9 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>>>  				"LRU full (%d), choose victim %d\n",
>>>  				atomic_read(&lru_info->active_cnt),
>>>  				victim_rgn->rgn_idx);
>>> -			__ufshpb_evict_region(hpb, victim_rgn);
>>> +			ret = __ufshpb_evict_region(hpb, victim_rgn);
>>> +			if (ret)
>>> +				goto out;
>>>  		}
>>>
>>>  		/*
>>> @@ -1856,6 +1880,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>>>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>>>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>>>  ufshpb_sysfs_attr_show_func(map_req_cnt);
>>> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>>>
>>>  static struct attribute *hpb_dev_stat_attrs[] = {
>>>  	&dev_attr_hit_cnt.attr,
>>> @@ -1864,6 +1889,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>>>  	&dev_attr_rb_active_cnt.attr,
>>>  	&dev_attr_rb_inactive_cnt.attr,
>>>  	&dev_attr_map_req_cnt.attr,
>>> +	&dev_attr_umap_req_cnt.attr,
>>>  	NULL,
>>>  };
>>>
>>> @@ -1988,6 +2014,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>>>  	hpb->stats.rb_active_cnt = 0;
>>>  	hpb->stats.rb_inactive_cnt = 0;
>>>  	hpb->stats.map_req_cnt = 0;
>>> +	hpb->stats.umap_req_cnt = 0;
>>>  }
>>>
>>>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
>>> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
>>> index 87495e59fcf1..1ea58c17a4de 100644
>>> --- a/drivers/scsi/ufs/ufshpb.h
>>> +++ b/drivers/scsi/ufs/ufshpb.h
>>> @@ -191,6 +191,7 @@ struct ufshpb_stats {
>>>  	u64 rb_inactive_cnt;
>>>  	u64 map_req_cnt;
>>>  	u64 pre_req_cnt;
>>> +	u64 umap_req_cnt;
>>>  };
>>>
>>>  struct ufshpb_lu {
Re: [PATCH 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
On 2021-04-06 13:37, Can Guo wrote:
> Hi Daejun,
>
> On 2021-04-06 12:11, Daejun Park wrote:
>> Hi Can Guo,
>>
>>> +static ssize_t monitor_enable_store(struct device *dev,
>>> +				    struct device_attribute *attr,
>>> +				    const char *buf, size_t count)
>>> +{
>>> +	struct ufs_hba *hba = dev_get_drvdata(dev);
>>> +	unsigned long value, flags;
>>> +
>>> +	if (kstrtoul(buf, 0, &value))
>>> +		return -EINVAL;
>>> +
>>> +	value = !!value;
>>> +	spin_lock_irqsave(hba->host->host_lock, flags);
>>> +	if (value == hba->monitor.enabled)
>>> +		goto out_unlock;
>>> +
>>> +	if (!value) {
>>> +		memset(&hba->monitor, 0, sizeof(hba->monitor));
>>> +	} else {
>>> +		hba->monitor.enabled = true;
>>> +		hba->monitor.enabled_ts = ktime_get();
>>
>> How about setting lat_max and lat_min to KTIME_MAX and 0?
>
> lat_min is already 0. What is the benefit of setting lat_max to
> KTIME_MAX?
>
>> I think lat_sum should be 0 at this point.
>
> lat_sum is already 0 at this point, what is the problem?
>
>>> +	}
>>> +
>>> +out_unlock:
>>> +	spin_unlock_irqrestore(hba->host->host_lock, flags);
>>> +	return count;
>>> +}
>>>
>>> +static void ufshcd_update_monitor(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>>> +{
>>> +	int dir = ufshcd_monitor_opcode2dir(*lrbp->cmd->cmnd);
>>> +
>>> +	if (dir >= 0 && hba->monitor.nr_queued[dir] > 0) {
>>> +		struct request *req = lrbp->cmd->request;
>>> +		struct ufs_hba_monitor *m = &hba->monitor;
>>> +		ktime_t now, inc, lat;
>>> +
>>> +		now = ktime_get();
>>
>> How about using lrbp->compl_time_stamp instead of getting new value?
>
> I am expecting "now" keeps increasing and use it to update
> m->busy_start_ts, but if I use lrbp->compl_time_stamp to do that, below
> line ktime_sub() may give me an unexpected value as
> lrbp->compl_time_stamp may be smaller than m->busy_start_ts, because the
> actual requests are not completed by the device in the exact same
> ordering as the bits set in hba->outstanding_tasks, but driver is
> completing them from bit 0 to bit 31 in ascending order.

Sorry, I misunderstood your point... Yes, we can use
lrbp->compl_time_stamp.

Thanks,
Can Guo.

>>> +		inc = ktime_sub(now, m->busy_start_ts[dir]);
>>> +		m->total_busy[dir] = ktime_add(m->total_busy[dir], inc);
>>> +		m->nr_sec_rw[dir] += blk_rq_sectors(req);
>>> +
>>> +		/* Update latencies */
>>> +		m->nr_req[dir]++;
>>> +		lat = ktime_sub(now, lrbp->issue_time_stamp);
>>> +		m->lat_sum[dir] += lat;
>>> +		if (m->lat_max[dir] < lat || !m->lat_max[dir])
>>> +			m->lat_max[dir] = lat;
>>> +		if (m->lat_min[dir] > lat || !m->lat_min[dir])
>>> +			m->lat_min[dir] = lat;
>>
>> This if statement can be shortened, by setting lat_max / lat_min as
>> default value.
>
> I don't quite get it, can you show me the code sample?
>
> Thanks,
> Can Guo
>
>>> +
>>> +		m->nr_queued[dir]--;
>>> +		/* Push forward the busy start of monitor */
>>> +		m->busy_start_ts[dir] = now;
>>> +	}
>>> +}
>>
>> Thanks,
>> Daejun
Re: [PATCH 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
Hi Daejun,

On 2021-04-06 12:11, Daejun Park wrote:
> Hi Can Guo,
>
>> +static ssize_t monitor_enable_store(struct device *dev,
>> +				    struct device_attribute *attr,
>> +				    const char *buf, size_t count)
>> +{
>> +	struct ufs_hba *hba = dev_get_drvdata(dev);
>> +	unsigned long value, flags;
>> +
>> +	if (kstrtoul(buf, 0, &value))
>> +		return -EINVAL;
>> +
>> +	value = !!value;
>> +	spin_lock_irqsave(hba->host->host_lock, flags);
>> +	if (value == hba->monitor.enabled)
>> +		goto out_unlock;
>> +
>> +	if (!value) {
>> +		memset(&hba->monitor, 0, sizeof(hba->monitor));
>> +	} else {
>> +		hba->monitor.enabled = true;
>> +		hba->monitor.enabled_ts = ktime_get();
>
> How about setting lat_max and lat_min to KTIME_MAX and 0?

lat_min is already 0. What is the benefit of setting lat_max to
KTIME_MAX?

> I think lat_sum should be 0 at this point.

lat_sum is already 0 at this point, what is the problem?

>> +	}
>> +
>> +out_unlock:
>> +	spin_unlock_irqrestore(hba->host->host_lock, flags);
>> +	return count;
>> +}
>>
>> +static void ufshcd_update_monitor(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>> +{
>> +	int dir = ufshcd_monitor_opcode2dir(*lrbp->cmd->cmnd);
>> +
>> +	if (dir >= 0 && hba->monitor.nr_queued[dir] > 0) {
>> +		struct request *req = lrbp->cmd->request;
>> +		struct ufs_hba_monitor *m = &hba->monitor;
>> +		ktime_t now, inc, lat;
>> +
>> +		now = ktime_get();
>
> How about using lrbp->compl_time_stamp instead of getting new value?

I am expecting "now" keeps increasing and use it to update
m->busy_start_ts, but if I use lrbp->compl_time_stamp to do that, below
line ktime_sub() may give me an unexpected value as
lrbp->compl_time_stamp may be smaller than m->busy_start_ts, because the
actual requests are not completed by the device in the exact same
ordering as the bits set in hba->outstanding_tasks, but driver is
completing them from bit 0 to bit 31 in ascending order.

>> +		inc = ktime_sub(now, m->busy_start_ts[dir]);
>> +		m->total_busy[dir] = ktime_add(m->total_busy[dir], inc);
>> +		m->nr_sec_rw[dir] += blk_rq_sectors(req);
>> +
>> +		/* Update latencies */
>> +		m->nr_req[dir]++;
>> +		lat = ktime_sub(now, lrbp->issue_time_stamp);
>> +		m->lat_sum[dir] += lat;
>> +		if (m->lat_max[dir] < lat || !m->lat_max[dir])
>> +			m->lat_max[dir] = lat;
>> +		if (m->lat_min[dir] > lat || !m->lat_min[dir])
>> +			m->lat_min[dir] = lat;
>
> This if statement can be shortened, by setting lat_max / lat_min as
> default value.

I don't quite get it, can you show me the code sample?

Thanks,
Can Guo

>> +
>> +		m->nr_queued[dir]--;
>> +		/* Push forward the busy start of monitor */
>> +		m->busy_start_ts[dir] = now;
>> +	}
>> +}
>
> Thanks,
> Daejun
Re: [PATCH v7 06/11] scsi: ufshpb: Region inactivation in host mode
On 2021-03-31 15:39, Avri Altman wrote:
> In host mode, the host is expected to send HPB-WRITE-BUFFER with
> buffer-id = 0x1 when it inactivates a region.
>
> Use the map-requests pool as there is no point in assigning a
> designated cache for umap-requests.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 35 +++
>  drivers/scsi/ufs/ufshpb.h |  1 +
>  2 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index aefb6dc160ee..fcc954f51bcf 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -914,6 +914,7 @@ static int ufshpb_execute_umap_req(struct ufshpb_lu *hpb,
>
>  	blk_execute_rq_nowait(NULL, req, 1, ufshpb_umap_req_compl_fn);
>
> +	hpb->stats.umap_req_cnt++;
>  	return 0;
>  }
>
> @@ -1110,18 +1111,37 @@ static int ufshpb_issue_umap_req(struct ufshpb_lu *hpb,
>  	return -EAGAIN;
>  }
>
> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb,
> +					struct ufshpb_region *rgn)
> +{
> +	return ufshpb_issue_umap_req(hpb, rgn);
> +}
> +
>  static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb)
>  {
>  	return ufshpb_issue_umap_req(hpb, NULL);
>  }
>
> -static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
> -				  struct ufshpb_region *rgn)
> +static int __ufshpb_evict_region(struct ufshpb_lu *hpb,
> +				 struct ufshpb_region *rgn)
>  {
>  	struct victim_select_info *lru_info;
>  	struct ufshpb_subregion *srgn;
>  	int srgn_idx;
>
> +	lockdep_assert_held(&hpb->rgn_state_lock);
> +
> +	if (hpb->is_hcm) {
> +		unsigned long flags;
> +		int ret;
> +
> +		spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);

Never seen a usage like this... Here flags is used without being
initialized. The flag is needed when spin_unlock_irqrestore ->
local_irq_restore(flags) to restore the DAIF register (in terms of ARM).

Thanks,
Can Guo.

> +		ret = ufshpb_issue_umap_single_req(hpb, rgn);
> +		spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	lru_info = &hpb->lru_info;
>
>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n", rgn->rgn_idx);
> @@ -1130,6 +1150,8 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>
>  	for_each_sub_region(rgn, srgn_idx, srgn)
>  		ufshpb_purge_active_subregion(hpb, srgn);
> +
> +	return 0;
>  }
>
>  static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
> @@ -1151,7 +1173,7 @@ static int ufshpb_evict_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  			goto out;
>  		}
>
> -		__ufshpb_evict_region(hpb, rgn);
> +		ret = __ufshpb_evict_region(hpb, rgn);
>  	}
>  out:
>  	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> @@ -1285,7 +1307,9 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  				"LRU full (%d), choose victim %d\n",
>  				atomic_read(&lru_info->active_cnt),
>  				victim_rgn->rgn_idx);
> -			__ufshpb_evict_region(hpb, victim_rgn);
> +			ret = __ufshpb_evict_region(hpb, victim_rgn);
> +			if (ret)
> +				goto out;
>  		}
>
>  		/*
> @@ -1856,6 +1880,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>  ufshpb_sysfs_attr_show_func(map_req_cnt);
> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>
>  static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_hit_cnt.attr,
> @@ -1864,6 +1889,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_rb_active_cnt.attr,
>  	&dev_attr_rb_inactive_cnt.attr,
>  	&dev_attr_map_req_cnt.attr,
> +	&dev_attr_umap_req_cnt.attr,
>  	NULL,
>  };
>
> @@ -1988,6 +2014,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>  	hpb->stats.rb_active_cnt = 0;
>  	hpb->stats.rb_inactive_cnt = 0;
>  	hpb->stats.map_req_cnt = 0;
> +	hpb->stats.umap_req_cnt = 0;
>  }
>
>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> index 87495e59fcf1..1ea58c17a4de 100644
> --- a/drivers/scsi/ufs/ufshpb.h
> +++ b/drivers/scsi/ufs/ufshpb.h
> @@ -191,6 +191,7 @@ struct ufshpb_stats {
>  	u64 rb_inactive_cnt;
>  	u64 map_req_cnt;
>  	u64 pre_req_cnt;
> +	u64 umap_req_cnt;
>  };
>
>  struct ufshpb_lu {
[PATCH v5 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs
In __ufshcd_issue_tm_cmd(), it is not right to use hba->nutrs + req->tag
as the Task Tag in one TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation")
Reviewed-by: Bart Van Assche
Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 30 +-
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d4f8cb2..ce5f3fea 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6446,38 +6446,34 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 	DECLARE_COMPLETION_ONSTACK(wait);
 	struct request *req;
 	unsigned long flags;
-	int free_slot, task_tag, err;
+	int task_tag, err;

 	/*
-	 * Get free slot, sleep if slots are unavailable.
-	 * Even though we use wait_event() which sleeps indefinitely,
-	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+	 * blk_get_request() is used here only to get a free tag.
 	 */
 	req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);

 	req->end_io_data = &wait;
-	free_slot = req->tag;
-	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
 	ufshcd_hold(hba, false);

 	spin_lock_irqsave(host->host_lock, flags);
-	task_tag = hba->nutrs + free_slot;
 	blk_mq_start_request(req);

+	task_tag = req->tag;
 	treq->req_header.dword_0 |= cpu_to_be32(task_tag);

-	memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-	ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+	memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+	ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);

 	/* send command to the controller */
-	__set_bit(free_slot, &hba->outstanding_tasks);
+	__set_bit(task_tag, &hba->outstanding_tasks);

 	/* Make sure descriptors are ready before ringing the task doorbell */
 	wmb();

-	ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+	ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
 	/* Make sure that doorbell is committed immediately */
 	wmb();

@@ -6497,24 +6493,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
 		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
 				__func__, tm_function);
-		if (ufshcd_clear_tm_cmd(hba, free_slot))
-			dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n",
-					__func__, free_slot);
+		if (ufshcd_clear_tm_cmd(hba, task_tag))
+			dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n",
+					__func__, task_tag);
 		err = -ETIMEDOUT;
 	} else {
 		err = 0;
-		memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+		memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));

 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_COMP);
 	}

 	spin_lock_irqsave(hba->host->host_lock, flags);
-	__clear_bit(free_slot, &hba->outstanding_tasks);
+	__clear_bit(task_tag, &hba->outstanding_tasks);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);

+	ufshcd_release(hba);
 	blk_put_request(req);

-	ufshcd_release(hba);
 	return err;
 }
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
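The mismatch this patch removes can be shown with plain arithmetic. The sketch below is hypothetical userspace code: the helper names are invented for illustration, and the slot counts used in the note after the block (32 transfer request slots, 8 task management slots) are assumed example values, not figures from the patch.

```c
#include <stdint.h>

/* Task Tag as computed before the patch: offset by the transfer slot count. */
static int tm_task_tag_old(int nutrs, int free_slot)
{
	return nutrs + free_slot;
}

/* Task Tag after the patch: the block-layer tag is used directly. */
static int tm_task_tag_new(int req_tag)
{
	return req_tag;
}

/* Doorbell bit for a task management slot, mirroring "1 << slot" in the diff. */
static uint32_t tm_doorbell_mask(int slot)
{
	return (uint32_t)1 << slot;
}

/* Range check equivalent to the condition in the removed WARN_ON_ONCE(). */
static int tm_slot_in_range(int slot, int nutmrs)
{
	return slot >= 0 && slot < nutmrs;
}
```

With the assumed values, slot 3 gave an old Task Tag of 35, which is not a valid task management slot index, whereas the new Task Tag is 3 and matches both the descriptor offset and the doorbell bit (mask 0x8).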
[PATCH v5 1/2] scsi: ufs: Fix task management request completion timeout
ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags
and requests which are not in IDLE state, ufshcd_compl_tm() never gets a
chance to run. Thus, TMR always ends up with completion timeout. Fix it
by calling blk_mq_start_request() in __ufshcd_issue_tm_cmd().

Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b49555fa..d4f8cb2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,

 	spin_lock_irqsave(host->host_lock, flags);
 	task_tag = hba->nutrs + free_slot;
+	blk_mq_start_request(req);

 	treq->req_header.dword_0 |= cpu_to_be32(task_tag);

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
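The behaviour described in this commit message can be modelled in a few lines of userspace C. This is a simplified sketch, not block-layer code: `rq_state`, `fake_rq`, `busy_iter()`, `start_request()` and `complete_tm()` are hypothetical names, and the model ignores the reserved-tag case mentioned above. It only shows that an iterator which skips idle requests never invokes the completion callback until the request has been started.

```c
#include <stddef.h>

/* Hypothetical request states; only the idle/not-idle split matters here. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT };

struct fake_rq {
	enum rq_state state;
	int completed;
};

typedef void (*busy_fn)(struct fake_rq *rq);

/* Visit every request that is not idle; return how many were visited. */
static int busy_iter(struct fake_rq *rqs, size_t n, busy_fn fn)
{
	int visited = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rqs[i].state == RQ_IDLE)
			continue;
		fn(&rqs[i]);
		visited++;
	}
	return visited;
}

/* Models the effect of starting a request: it leaves the idle state. */
static void start_request(struct fake_rq *rq)
{
	rq->state = RQ_IN_FLIGHT;
}

/* Models the completion callback run by the iterator. */
static void complete_tm(struct fake_rq *rq)
{
	rq->completed = 1;
}
```

In the model, calling `busy_iter()` on a freshly allocated (idle) request visits nothing, so the waiter would time out; after `start_request()` the same call reaches `complete_tm()`.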
Re: [PATCH v4 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs
On 2021-04-01 14:44, Daejun Park wrote:
> Hi, Can Guo
>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> ...
>>  	req->end_io_data = &wait;
>> -	free_slot = req->tag;
>>  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
>
> I think this line should be removed.

Oh, yes, will remove it in next version.

Thanks,
Can Guo.

> Thanks,
> Daejun
[PATCH 2/2] scsi: ufs: Add support for hba performance monitor
Add a new sysfs group which has nodes to monitor data/request transfer
performance. This sysfs group has nodes showing total sectors/requests
transferred, total busy time spent and max/min/avg/sum latencies.

Signed-off-by: Can Guo

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index d1bc23c..8380866 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -995,6 +995,132 @@ Description:	This entry shows the target state of an UFS UIC link

 		The file is read only.

+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/monitor_enable
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the status of performance monitor enablement
+		and it can be used to start/stop the monitor. When the monitor
+		is stopped, the performance data collected is also cleared.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/monitor_chunk_size
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file tells the monitor to focus on requests transferring
+		data of specific chunk size (in Bytes). 0 means any chunk size.
+		It can only be changed when monitor is disabled.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_total_sectors
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how many sectors (in 512 Bytes) have been
+		sent from device to host after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_total_busy
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how long (in micro seconds) has been spent
+		sending data from device to host after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_nr_requests
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how many read requests have been sent after
+		monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_max
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the maximum latency (in micro seconds) of
+		read requests after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_min
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the minimum latency (in micro seconds) of
+		read requests after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_avg
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the average latency (in micro seconds) of
+		read requests after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_sum
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the total latency (in micro seconds) of
+		read requests sent after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/write_total_sectors
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how many sectors (in 512 Bytes) have been sent
+		from host to device after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/write_total_busy
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how long (in micro seconds) has been spent
+		sending data from host to device after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/write_nr_requests
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows how many write requests have been sent after
+		monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_max
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the maximum latency (in micro seconds) of write
+		requests after monitor gets started.
+
+		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_min
+Date:		January 2021
+Contact:	Can Guo
+Description:	This file shows the minimum latency (in micro seconds) of write
+		requests after monitor gets started
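As a rough illustration of how the values documented above relate to each other, the sketch below derives throughput and average latency from numbers already read out of these nodes. The helper names are hypothetical, and the code assumes only the units stated in the descriptions (512-byte sectors, microseconds); it does not read sysfs itself.

```c
#include <stdint.h>

#define SECTOR_BYTES 512ULL
#define USEC_PER_SEC 1000000ULL

/*
 * Bytes per second from a *_total_sectors value and a *_total_busy value.
 * Returns 0 when no busy time has been recorded. The multiplication can
 * overflow for extremely large sector counts; a sketch, not hardened code.
 */
static uint64_t monitor_bytes_per_sec(uint64_t total_sectors, uint64_t busy_us)
{
	if (busy_us == 0)
		return 0;
	return total_sectors * SECTOR_BYTES * USEC_PER_SEC / busy_us;
}

/*
 * Average latency in microseconds from a *_req_latency_sum value and a
 * *_nr_requests value. Returns 0 when no requests have been counted.
 */
static uint64_t monitor_avg_latency_us(uint64_t latency_sum_us,
				       uint64_t nr_requests)
{
	if (nr_requests == 0)
		return 0;
	return latency_sum_us / nr_requests;
}
```

For example, 2048 sectors transferred during 1000000 microseconds of busy time works out to 1048576 bytes per second, and a latency sum of 1500 microseconds over 3 requests averages 500 microseconds.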
[PATCH 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
Add a new sysfs group which has nodes to monitor data/request transfer
performance. This sysfs group has nodes showing total sectors/requests
transferred, total busy time spent and max/min/avg/sum latencies. This
group can be enhanced later to show more UFS driver layer performance
statistics data during runtime.

Signed-off-by: Can Guo

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index acc54f5..348df0e 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -278,6 +278,242 @@ static const struct attribute_group ufs_sysfs_default_group = {
 	.attrs = ufs_sysfs_ufshcd_attrs,
 };

+static ssize_t monitor_enable_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%d\n", hba->monitor.enabled);
+}
+
+static ssize_t monitor_enable_store(struct device *dev,
+				    struct device_attribute *attr,
+				    const char *buf, size_t count)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+	unsigned long value, flags;
+
+	if (kstrtoul(buf, 0, &value))
+		return -EINVAL;
+
+	value = !!value;
+	spin_lock_irqsave(hba->host->host_lock, flags);
+	if (value == hba->monitor.enabled)
+		goto out_unlock;
+
+	if (!value) {
+		memset(&hba->monitor, 0, sizeof(hba->monitor));
+	} else {
+		hba->monitor.enabled = true;
+		hba->monitor.enabled_ts = ktime_get();
+	}
+
+out_unlock:
+	spin_unlock_irqrestore(hba->host->host_lock, flags);
+	return count;
+}
+
+static ssize_t monitor_chunk_size_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n", hba->monitor.chunk_size);
+}
+
+static ssize_t monitor_chunk_size_store(struct device *dev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+	unsigned long value, flags;
+
+	if (kstrtoul(buf, 0, &value))
+		return -EINVAL;
+
+	spin_lock_irqsave(hba->host->host_lock, flags);
+	/* Only allow chunk size change when monitor is disabled */
+	if (!hba->monitor.enabled)
+		hba->monitor.chunk_size = value;
+	spin_unlock_irqrestore(hba->host->host_lock, flags);
+	return count;
+}
+
+static ssize_t read_total_sectors_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n", hba->monitor.nr_sec_rw[READ]);
+}
+
+static ssize_t read_total_busy_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%llu\n",
+			  ktime_to_us(hba->monitor.total_busy[READ]));
+}
+
+static ssize_t read_nr_requests_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%lu\n", hba->monitor.nr_req[READ]);
+}
+
+static ssize_t read_req_latency_avg_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+	struct ufs_hba_monitor *m = &hba->monitor;
+
+	return sysfs_emit(buf, "%llu\n", div_u64(ktime_to_us(m->lat_sum[READ]),
+						 m->nr_req[READ]));
+}
+
+static ssize_t read_req_latency_max_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%llu\n",
+			  ktime_to_us(hba->monitor.lat_max[READ]));
+}
+
+static ssize_t read_req_latency_min_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%llu\n",
+			  ktime_to_us(hba->monitor.lat_min[READ]));
+}
+
+static ssize_t read_req_latency_sum_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+
[PATCH 2/2] scsi: ufs: Add support for hba performance monitor
Add a new sysfs group which has nodes to monitor data/request transfer performance. This sysfs group has nodes showing total sectors/requests transferred, total busy time spent and max/min/avg/sum latencies. Signed-off-by: Can Guo diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs index d1bc23c..8380866 100644 --- a/Documentation/ABI/testing/sysfs-driver-ufs +++ b/Documentation/ABI/testing/sysfs-driver-ufs @@ -995,6 +995,132 @@ Description: This entry shows the target state of an UFS UIC link The file is read only. +What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_enable +Date: January 2021 +Contact: Can Guo +Description: This file shows the status of performance monitor enablement + and it can be used to start/stop the monitor. When the monitor + is stopped, the performance data collected is also cleared. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_chunk_size +Date: January 2021 +Contact: Can Guo +Description: This file tells the monitor to focus on requests transferring + data of specific chunk size (in Bytes). 0 means any chunk size. + It can only be changed when monitor is disabled. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_sectors +Date: January 2021 +Contact: Can Guo +Description: This file shows how many sectors (in 512 Bytes) have been + sent from device to host after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_busy +Date: January 2021 +Contact: Can Guo +Description: This file shows how long (in micro seconds) has been spent + sending data from device to host after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_nr_requests +Date: January 2021 +Contact: Can Guo +Description: This file shows how many read requests have been sent after + monitor gets started. + + The file is read only. 
+ +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_max +Date: January 2021 +Contact: Can Guo +Description: This file shows the maximum latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_min +Date: January 2021 +Contact: Can Guo +Description: This file shows the minimum latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_avg +Date: January 2021 +Contact: Can Guo +Description: This file shows the average latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_sum +Date: January 2021 +Contact: Can Guo +Description: This file shows the total latency (in micro seconds) of + read requests sent after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_sectors +Date: January 2021 +Contact: Can Guo +Description: This file shows how many sectors (in 512 Bytes) have been sent + from host to device after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_busy +Date: January 2021 +Contact: Can Guo +Description: This file shows how long (in micro seconds) has been spent + sending data from host to device after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_nr_requests +Date: January 2021 +Contact: Can Guo +Description: This file shows how many write requests have been sent after + monitor gets started. + + The file is read only. 
+ +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_max +Date: January 2021 +Contact: Can Guo +Description: This file shows the maximum latency (in micro seconds) of write + requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_min +Date: January 2021 +Contact: Can Guo +Description: This file shows the minimum latency (in micro seconds) of write + requests after monitor gets started
[PATCH 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
Add a new sysfs group which has nodes to monitor data/request transfer performance. This sysfs group has nodes showing total sectors/requests transferred, total busy time spent and max/min/avg/sum latencies. This group can be enhanced later to show more UFS driver layer performance statistics data during runtime. Signed-off-by: Can Guo diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index acc54f5..1f93f3e 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -278,6 +278,242 @@ static const struct attribute_group ufs_sysfs_default_group = { .attrs = ufs_sysfs_ufshcd_attrs, }; +static ssize_t monitor_enable_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%d\n", hba->monitor.enabled); +} + +static ssize_t monitor_enable_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + unsigned long value, flags; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + value = !!value; + spin_lock_irqsave(hba->host->host_lock, flags); + if (value == hba->monitor.enabled) + goto out_unlock; + + if (!value) { + memset(>monitor, 0, sizeof(hba->monitor)); + } else { + hba->monitor.enabled = true; + hba->monitor.enabled_ts = ktime_get(); + } + +out_unlock: + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t monitor_chunk_size_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.chunk_size); +} + +static ssize_t monitor_chunk_size_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + unsigned long value, flags; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + spin_lock_irqsave(hba->host->host_lock, flags); + /* Only 
allow chunk size change when monitor is disabled */ + if (!hba->monitor.enabled) + hba->monitor.chunk_size = value; + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t read_total_sectors_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.nr_sec_rw[READ]); +} + +static ssize_t read_total_busy_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor.total_busy[READ])); +} + +static ssize_t read_nr_requests_show(struct device *dev, +struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.nr_req[READ]); +} + +static ssize_t read_req_latency_avg_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + struct ufs_hba_monitor *m = >monitor; + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(m->lat_sum[READ]) / m->nr_req[READ]); +} + +static ssize_t read_req_latency_max_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor.lat_max[READ])); +} + +static ssize_t read_req_latency_min_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor.lat_min[READ])); +} + +static ssize_t read_req_latency_sum_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba
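The enable and chunk-size rules in the store handlers above can be modeled in user space. The following is only a sketch, not the driver code: `struct monitor_model`, `monitor_set_enabled()`, `monitor_set_chunk_size()` and `monitor_avg_latency_us()` are hypothetical names, and returning 0 for the average when no request has been counted is an assumption of this sketch (the posted handler divides by `nr_req` directly).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the monitor state; not the kernel struct. */
struct monitor_model {
	bool enabled;
	unsigned long chunk_size;
	unsigned long nr_req;
	uint64_t lat_sum_us;
};

/* Mirrors monitor_enable_store(): any non-zero value means "enable". */
static void monitor_set_enabled(struct monitor_model *m, unsigned long value)
{
	bool want = !!value;

	if (want == m->enabled)
		return;		/* no state change, collected data is kept */

	if (!want)
		memset(m, 0, sizeof(*m));	/* stopping clears everything, chunk_size included */
	else
		m->enabled = true;
}

/* Mirrors monitor_chunk_size_store(): the write is silently ignored while enabled. */
static void monitor_set_chunk_size(struct monitor_model *m, unsigned long value)
{
	if (!m->enabled)
		m->chunk_size = value;
}

/* Average latency; the zero-request guard is an assumption of this sketch. */
static uint64_t monitor_avg_latency_us(const struct monitor_model *m)
{
	return m->nr_req ? m->lat_sum_us / m->nr_req : 0;
}
```

One consequence the model makes visible: because disabling clears the whole structure, a chunk size configured earlier has to be written again before the monitor is re-enabled.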
Re: [PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout
On 2021-04-01 00:45, Avri Altman wrote:
>> ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
>> but since blk_mq_tagset_busy_iter() only iterates over all reserved tags and
>> requests which are not in IDLE state, ufshcd_compl_tm() never gets a chance
>> to run. Thus, TMR always ends up with completion timeout. Fix it by calling
>> blk_mq_start_request() in __ufshcd_issue_tm_cmd().
>>
>> Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
>> Signed-off-by: Can Guo
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index b49555fa..d4f8cb2 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>>  	spin_lock_irqsave(host->host_lock, flags);
>>  	task_tag = hba->nutrs + free_slot;
>> +	blk_mq_start_request(req);
> Maybe just set req->state to MQ_RQ_IN_FLIGHT
> Without all other irrelevant initializations such as add timeout etc.

I don't see any other drivers do that, is it appropriate to call
WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT) outside block layer?

Thanks,
Can Guo.

> Thanks,
> Avri
>
>>  	treq->req_header.dword_0 |= cpu_to_be32(task_tag);
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v4 2/2] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs
In __ufshcd_issue_tm_cmd(), it is not right to use hba->nutrs + req->tag as
the Task Tag in one TMR UPIU. Directly use req->tag as the Task Tag.

Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation")
Reviewed-by: Bart Van Assche
Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d4f8cb2..cdd8c3d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6446,38 +6446,35 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 	DECLARE_COMPLETION_ONSTACK(wait);
 	struct request *req;
 	unsigned long flags;
-	int free_slot, task_tag, err;
+	int task_tag, err;

 	/*
-	 * Get free slot, sleep if slots are unavailable.
-	 * Even though we use wait_event() which sleeps indefinitely,
-	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.
+	 * blk_get_request() is used here only to get a free tag.
 	 */
 	req = blk_get_request(q, REQ_OP_DRV_OUT, 0);
 	if (IS_ERR(req))
 		return PTR_ERR(req);

 	req->end_io_data = &wait;
-	free_slot = req->tag;
 	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);
 	ufshcd_hold(hba, false);

 	spin_lock_irqsave(host->host_lock, flags);
-	task_tag = hba->nutrs + free_slot;
 	blk_mq_start_request(req);

+	task_tag = req->tag;
 	treq->req_header.dword_0 |= cpu_to_be32(task_tag);

-	memcpy(hba->utmrdl_base_addr + free_slot, treq, sizeof(*treq));
-	ufshcd_vops_setup_task_mgmt(hba, free_slot, tm_function);
+	memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq));
+	ufshcd_vops_setup_task_mgmt(hba, task_tag, tm_function);

 	/* send command to the controller */
-	__set_bit(free_slot, &hba->outstanding_tasks);
+	__set_bit(task_tag, &hba->outstanding_tasks);

 	/* Make sure descriptors are ready before ringing the task doorbell */
 	wmb();

-	ufshcd_writel(hba, 1 << free_slot, REG_UTP_TASK_REQ_DOOR_BELL);
+	ufshcd_writel(hba, 1 << task_tag, REG_UTP_TASK_REQ_DOOR_BELL);
 	/* Make sure that doorbell is committed immediately */
 	wmb();

@@ -6497,24 +6494,24 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
 		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
 				__func__, tm_function);
-		if (ufshcd_clear_tm_cmd(hba, free_slot))
-			dev_WARN(hba->dev, "%s: unable clear tm cmd (slot %d) after timeout\n",
-					__func__, free_slot);
+		if (ufshcd_clear_tm_cmd(hba, task_tag))
+			dev_WARN(hba->dev, "%s: unable to clear tm cmd (slot %d) after timeout\n",
+					__func__, task_tag);
 		err = -ETIMEDOUT;
 	} else {
 		err = 0;
-		memcpy(treq, hba->utmrdl_base_addr + free_slot, sizeof(*treq));
+		memcpy(treq, hba->utmrdl_base_addr + task_tag, sizeof(*treq));

 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_COMP);
 	}

 	spin_lock_irqsave(hba->host->host_lock, flags);
-	__clear_bit(free_slot, &hba->outstanding_tasks);
+	__clear_bit(task_tag, &hba->outstanding_tasks);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);

+	ufshcd_release(hba);
 	blk_put_request(req);

-	ufshcd_release(hba);
 	return err;
 }
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
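To make the difference between the two tagging schemes concrete, here is a small stand-alone sketch. It is not driver code: the helper names are hypothetical, and the values used in the usage note (32 transfer request slots, TM slot 3) are assumed purely for illustration.

```c
#include <stdint.h>

/* Old scheme from the removed lines: the Task Tag is offset by the
 * number of transfer request slots (hba->nutrs in the driver). */
static int tm_task_tag_old(int nutrs, int free_slot)
{
	return nutrs + free_slot;
}

/* New scheme: the block-layer tag is used directly as the Task Tag
 * and as the index into the TM request list. */
static int tm_task_tag_new(int req_tag)
{
	return req_tag;
}

/* Doorbell bit for a TM slot, same shape as the
 * ufshcd_writel(hba, 1 << task_tag, ...) call in the patch. */
static uint32_t tm_doorbell_bit(int task_tag)
{
	return UINT32_C(1) << task_tag;
}
```

With the assumed values, the old scheme would put 35 into the UPIU header while the slot, bitmap and doorbell all used 3; after the patch a single value, 3, is used everywhere.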
[PATCH v4 1/2] scsi: ufs: Fix task management request completion timeout
ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()),
but since blk_mq_tagset_busy_iter() only iterates over all reserved tags and
requests which are not in IDLE state, ufshcd_compl_tm() never gets a chance
to run. Thus, TMR always ends up with completion timeout. Fix it by calling
blk_mq_start_request() in __ufshcd_issue_tm_cmd().

Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b49555fa..d4f8cb2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6464,6 +6464,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 	spin_lock_irqsave(host->host_lock, flags);
 	task_tag = hba->nutrs + free_slot;
+	blk_mq_start_request(req);

 	treq->req_header.dword_0 |= cpu_to_be32(task_tag);
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
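The failure mode described in this commit message can be reproduced with a toy model. The sketch below is a simplified user-space illustration, not the block layer: all `model_*` names and the three-state enum are hypothetical, and reserved tags are left out. It only shows why a completion callback driven by a "busy" iterator never fires for a request that was allocated but never started.

```c
#include <stddef.h>

/* Simplified request states; the real block layer has its own enum. */
enum model_rq_state { MODEL_RQ_IDLE, MODEL_RQ_IN_FLIGHT, MODEL_RQ_COMPLETE };

struct model_request {
	int tag;
	enum model_rq_state state;
};

typedef void (*model_busy_fn)(struct model_request *rq, void *data);

/* Visits only requests that have left the IDLE state and returns how
 * many were visited. */
static int model_busy_iter(struct model_request *rqs, size_t n,
			   model_busy_fn fn, void *data)
{
	int visited = 0;

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].state == MODEL_RQ_IDLE)
			continue;	/* never started: skipped */
		fn(&rqs[i], data);
		visited++;
	}
	return visited;
}

/* Stand-in for the effect of blk_mq_start_request() on the state. */
static void model_start_request(struct model_request *rq)
{
	rq->state = MODEL_RQ_IN_FLIGHT;
}

/* Stand-in for a completion callback such as ufshcd_compl_tm(). */
static void model_complete(struct model_request *rq, void *data)
{
	int *completed = data;

	rq->state = MODEL_RQ_COMPLETE;
	(*completed)++;
}
```

Calling `model_busy_iter()` on freshly allocated requests visits nothing; only after `model_start_request()` does the callback run, which is the behavior the one-line fix relies on.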
Re: [PATCH v32 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-31 09:18, Daejun Park wrote:
> This patch supports the HPB 2.0.
>
> The HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
> A read of up to 32KB is supported as a single HPB read.
> A read of 36KB ~ 512KB is supported as a combination of a write buffer
> command and an HPB read command to deliver more PPN.
> The write buffer commands may not be issued immediately due to busy tags.
> To use HPB read more aggressively, the driver can requeue the write buffer
> command. The requeue threshold is implemented as a timeout and can be
> modified with the requeue_timeout_ms entry in sysfs.
>
> Reviewed-by: Can Guo
> Reviewed-by: Bean Huo
> Signed-off-by: Daejun Park
> ---

Please allow me a few more days in April to have this whole series tested
one more time.

Thanks,
Can Guo.
[PATCH v2 1/2] scsi: ufs: Introduce hba performance monitor sysfs nodes
Add a new sysfs group which has nodes to monitor data/request transfer performance. This sysfs group has nodes showing total sectors/requests transferred, total busy time spent and max/min/avg/sum latencies. This group can be enhanced later to show more UFS driver layer performance statistics data during runtime. Signed-off-by: Can Guo --- drivers/scsi/ufs/ufs-sysfs.c | 237 +++ drivers/scsi/ufs/ufshcd.c| 62 +++ drivers/scsi/ufs/ufshcd.h| 21 3 files changed, 320 insertions(+) diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index acc54f5..1f93f3e 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -278,6 +278,242 @@ static const struct attribute_group ufs_sysfs_default_group = { .attrs = ufs_sysfs_ufshcd_attrs, }; +static ssize_t monitor_enable_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%d\n", hba->monitor.enabled); +} + +static ssize_t monitor_enable_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + unsigned long value, flags; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + value = !!value; + spin_lock_irqsave(hba->host->host_lock, flags); + if (value == hba->monitor.enabled) + goto out_unlock; + + if (!value) { + memset(>monitor, 0, sizeof(hba->monitor)); + } else { + hba->monitor.enabled = true; + hba->monitor.enabled_ts = ktime_get(); + } + +out_unlock: + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t monitor_chunk_size_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.chunk_size); +} + +static ssize_t monitor_chunk_size_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); 
+ unsigned long value, flags; + + if (kstrtoul(buf, 0, )) + return -EINVAL; + + spin_lock_irqsave(hba->host->host_lock, flags); + /* Only allow chunk size change when monitor is disabled */ + if (!hba->monitor.enabled) + hba->monitor.chunk_size = value; + spin_unlock_irqrestore(hba->host->host_lock, flags); + return count; +} + +static ssize_t read_total_sectors_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.nr_sec_rw[READ]); +} + +static ssize_t read_total_busy_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor.total_busy[READ])); +} + +static ssize_t read_nr_requests_show(struct device *dev, +struct device_attribute *attr, char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%lu\n", hba->monitor.nr_req[READ]); +} + +static ssize_t read_req_latency_avg_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + struct ufs_hba_monitor *m = >monitor; + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(m->lat_sum[READ]) / m->nr_req[READ]); +} + +static ssize_t read_req_latency_max_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor.lat_max[READ])); +} + +static ssize_t read_req_latency_min_show(struct device *dev, +struct device_attribute *attr, +char *buf) +{ + struct ufs_hba *hba = dev_get_drvdata(dev); + + return sysfs_emit(buf, "%llu\n", + ktime_to_us(hba->monitor
[PATCH v2 2/2] scsi: ufs: Add support for hba performance monitor
Add a new sysfs group which has nodes to monitor data/request transfer performance. This sysfs group has nodes showing total sectors/requests transferred, total busy time spent and max/min/avg/sum latencies. Signed-off-by: Can Guo --- Documentation/ABI/testing/sysfs-driver-ufs | 126 + 1 file changed, 126 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs index d1bc23c..8380866 100644 --- a/Documentation/ABI/testing/sysfs-driver-ufs +++ b/Documentation/ABI/testing/sysfs-driver-ufs @@ -995,6 +995,132 @@ Description: This entry shows the target state of an UFS UIC link The file is read only. +What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_enable +Date: January 2021 +Contact: Can Guo +Description: This file shows the status of performance monitor enablement + and it can be used to start/stop the monitor. When the monitor + is stopped, the performance data collected is also cleared. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/monitor_chunk_size +Date: January 2021 +Contact: Can Guo +Description: This file tells the monitor to focus on requests transferring + data of specific chunk size (in Bytes). 0 means any chunk size. + It can only be changed when monitor is disabled. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_sectors +Date: January 2021 +Contact: Can Guo +Description: This file shows how many sectors (in 512 Bytes) have been + sent from device to host after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_total_busy +Date: January 2021 +Contact: Can Guo +Description: This file shows how long (in micro seconds) has been spent + sending data from device to host after monitor gets started. + + The file is read only. 
+ +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_nr_requests +Date: January 2021 +Contact: Can Guo +Description: This file shows how many read requests have been sent after + monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_max +Date: January 2021 +Contact: Can Guo +Description: This file shows the maximum latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_min +Date: January 2021 +Contact: Can Guo +Description: This file shows the minimum latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_avg +Date: January 2021 +Contact: Can Guo +Description: This file shows the average latency (in micro seconds) of + read requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/read_req_latency_sum +Date: January 2021 +Contact: Can Guo +Description: This file shows the total latency (in micro seconds) of + read requests sent after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_sectors +Date: January 2021 +Contact: Can Guo +Description: This file shows how many sectors (in 512 Bytes) have been sent + from host to device after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_total_busy +Date: January 2021 +Contact: Can Guo +Description: This file shows how long (in micro seconds) has been spent + sending data from host to device after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_nr_requests +Date: January 2021 +Contact: Can Guo +Description: This file shows how many write requests have been sent after + monitor gets started. 
+ + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_max +Date: January 2021 +Contact: Can Guo +Description: This file shows the maximum latency (in micro seconds) of write + requests after monitor gets started. + + The file is read only. + +What: /sys/bus/platform/drivers/ufshcd/*/monitor/write_req_latency_min +Date: January 2021 +Contact: Can Guo +Description
Re: [PATCH v31 4/4] scsi: ufs: Add HPB 2.0 support
t;rgn_state_lock, flags); > + > +pre_req->req = req; > + > +ret = ufshpb_execute_pre_req(hpb, cmd, pre_req, _read_id); > +if (ret) > +goto free_pre_req; > + > +*read_id = _read_id; > + > +return ret; > +free_pre_req: > +spin_lock_irqsave(>rgn_state_lock, flags); > +ufshpb_put_pre_req(hpb, pre_req); > +unlock_out: > +spin_unlock_irqrestore(>rgn_state_lock, flags); > +blk_put_request(req); > +return ret; > +} > + > /* > * This function will set up HPB read command using host-side L2P map > data. > - * In HPB v1.0, maximum size of HPB read command is 4KB. > */ > -void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) > +int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) > { > struct ufshpb_lu *hpb; > struct ufshpb_region *rgn; > @@ -291,19 +552,20 @@ void ufshpb_prep(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp) > u64 ppn; > unsigned long flags; > int transfer_len, rgn_idx, srgn_idx, srgn_offset; > +int read_id = 0; > int err = 0; > > hpb = ufshpb_get_hpb_data(cmd->device); > if (!hpb) > -return; > +return -ENODEV; > > if (ufshpb_get_state(hpb) == HPB_INIT) > -return; > +return -ENODEV; > > if (ufshpb_get_state(hpb) != HPB_PRESENT) { > dev_notice(>sdev_ufs_lu->sdev_dev, > "%s: ufshpb state is not PRESENT", __func__); > -return; > +return -ENODEV; > } > > if (blk_rq_is_scsi(cmd->request) || > @@ -314,7 +576,7 @@ void ufshpb_prep(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp) > transfer_len = sectors_to_logical(cmd->device, >blk_rq_sectors(cmd->request)); > if (unlikely(!transfer_len)) > -return; > +return 0; > > lpn = sectors_to_logical(cmd->device, blk_rq_pos(cmd->request)); > ufshpb_get_pos_from_lpn(hpb, lpn, _idx, _idx, _offset); > @@ -327,18 +589,18 @@ void ufshpb_prep(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp) > ufshpb_set_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset, > transfer_len); > spin_unlock_irqrestore(>rgn_state_lock, flags); > -return; > +return 0; > } > > -if (!ufshpb_is_support_chunk(transfer_len)) > -return; > +if 
(!ufshpb_is_support_chunk(hpb, transfer_len)) > +return 0; > > spin_lock_irqsave(>rgn_state_lock, flags); > if (ufshpb_test_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset, > transfer_len)) { > hpb->stats.miss_cnt++; > spin_unlock_irqrestore(>rgn_state_lock, flags); > -return; > +return 0; > } > > err = ufshpb_fill_ppn_from_page(hpb, srgn->mctx, srgn_offset, 1, > ); > @@ -351,64 +613,101 @@ void ufshpb_prep(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp) > * active state. > */ > dev_err(hba->dev, "get ppn failed. err %d\n", err); > -return; > +return err; > +} > +if (!ufshpb_is_legacy(hba) && > +ufshpb_is_required_wb(hpb, transfer_len)) { > +err = ufshpb_issue_pre_req(hpb, cmd, _id); > +if (err) { > +unsigned long timeout; > + > +timeout = cmd->jiffies_at_alloc + msecs_to_jiffies( > + hpb->params.requeue_timeout_ms); > + > +if (time_before(jiffies, timeout)) > +return -EAGAIN; > + > +hpb->stats.miss_cnt++; > +return 0; > +} > } > > -ufshpb_set_hpb_read_to_upiu(hpb, lrbp, lpn, ppn, transfer_len); > +ufshpb_set_hpb_read_to_upiu(hpb, lrbp, lpn, ppn, transfer_len, > read_id); > > hpb->stats.hit_cnt++; > +return 0; > } > -static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb, > - struct ufshpb_subregion *srgn) > + > +static struct ufshpb_req *ufshpb_get_req(struct ufshpb_lu *hpb, > + int rgn_idx, enum req_opf
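The requeue decision in the quoted ufshpb_prep() change (return -EAGAIN while the command is younger than requeue_timeout_ms, otherwise count a miss and fall back to a normal read) can be sketched outside the kernel as follows. This is an illustration under assumptions: `ticks_before()` imitates the wraparound-safe comparison that the kernel's time_before() performs, and `pre_req_failure_action()` with its parameters is a hypothetical name, not a driver function.

```c
#include <errno.h>
#include <stdbool.h>

/* Wraparound-safe "a is earlier than b" for free-running tick counters,
 * following the same idea as the kernel's time_before(). */
static bool ticks_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* What to do after the pre-request could not be issued:
 * -EAGAIN asks the caller to requeue the command, 0 means give up on
 * HPB for this command and let it go out as a regular read. */
static int pre_req_failure_action(unsigned long now,
				  unsigned long alloc_time,
				  unsigned long timeout_ticks,
				  unsigned int *miss_cnt)
{
	unsigned long deadline = alloc_time + timeout_ticks;

	if (ticks_before(now, deadline))
		return -EAGAIN;

	(*miss_cnt)++;	/* deadline passed: counted as an HPB miss */
	return 0;
}
```

Measuring the deadline from the command's allocation time, as the quoted code does with cmd->jiffies_at_alloc, bounds the total time a command can spend being requeued, however many retries that takes.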
Re: [PATCH v6 04/10] scsi: ufshpb: Make eviction depends on region's reads
On 2021-03-22 16:10, Avri Altman wrote:
> In host mode, eviction is considered an extreme measure.
> Verify that the entering region has enough reads, and the exiting
> region has far fewer reads.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index a1519cbb4ce0..5e757220d66a 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -17,6 +17,7 @@
>  #include "../sd.h"
>
>  #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
> +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */
>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -1047,6 +1048,13 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
>  		if (ufshpb_check_srgns_issue_state(hpb, rgn))
>  			continue;
>
> +		/*
> +		 * in host control mode, verify that the exiting region
> +		 * has less reads
> +		 */
> +		if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> +			continue;
> +
>  		victim_rgn = rgn;
>  		break;
>  	}
> @@ -1219,7 +1227,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu *hpb,
>
>  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  {
> -	struct ufshpb_region *victim_rgn;
> +	struct ufshpb_region *victim_rgn = NULL;
>  	struct victim_select_info *lru_info = &hpb->lru_info;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -1246,7 +1254,15 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  			 * It is okay to evict the least recently used region,
>  			 * because the device could detect this region
>  			 * by not issuing HPB_READ
> +			 *
> +			 * in host control mode, verify that the entering
> +			 * region has enough reads
>  			 */
> +			if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> +				ret = -EACCES;
> +				goto out;
> +			}
> +

I cannot understand the logic behind this. A rgn which host chooses to
activate, is in INACTIVE state now, if its rgn->reads < 256, then don't
activate it. Could you please elaborate?

Thanks,
Can Guo.

>  			victim_rgn = ufshpb_victim_lru_info(hpb);
>  			if (!victim_rgn) {
>  				dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
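The two host-control-mode checks in the quoted patch reduce to a pair of threshold comparisons. The sketch below restates them as stand-alone predicates. The macro values are taken from the patch; the function names `hcm_can_be_victim()` and `hcm_may_evict_for()` are hypothetical and only serve the illustration.

```c
#include <stdbool.h>

#define ACTIVATION_THRESHOLD 8				/* 8 IOs, from the patch */
#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5)	/* 256 IOs, from the patch */

/* A resident region is skipped as a victim when it has more than
 * EVICTION_THRESHOLD / 2 reads, so it is eligible at 128 or fewer. */
static bool hcm_can_be_victim(unsigned int reads)
{
	return reads <= (EVICTION_THRESHOLD >> 1);
}

/* An entering region is refused (-EACCES in the patch) when it has
 * fewer than EVICTION_THRESHOLD reads, so it needs at least 256. */
static bool hcm_may_evict_for(unsigned int entering_reads)
{
	return entering_reads >= EVICTION_THRESHOLD;
}
```

Taken together, an eviction under these rules happens only when the entering region has at least twice as many reads as the region it replaces.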
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-24 17:33, Bean Huo wrote:
On Wed, 2021-03-24 at 17:24 +0800, Can Guo wrote:
On 2021-03-24 16:37, Bean Huo wrote:
> On Wed, 2021-03-24 at 09:45 +0800, Can Guo wrote:
> > On 2021-03-23 20:48, Avri Altman wrote:
> > > > > > On 2021-03-23 14:37, Daejun Park wrote:
> > > > > > On 2021-03-23 14:19, Daejun Park wrote:
> > > > > > > > On 2021-03-23 13:37, Daejun Park wrote:
> > > > > > > > > > On 2021-03-23 12:22, Can Guo wrote:
> > > > > > > > > > > On 2021-03-22 17:11, Bean Huo wrote:
> > > > > > > > > > > > On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:
> > > > > > > > > > > > > +	switch (rsp_field->hpb_op) {
> > > > > > > > > > > > > +	case HPB_RSP_REQ_REGION_UPDATE:
> > > > > > > > > > > > > +		if (data_seg_len != DEV_DATA_SEG_LEN)
> > > > > > > > > > > > > +			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
> > > > > > > > > > > > > +				 "%s: data seg length is not same.\n",
> > > > > > > > > > > > > +				 __func__);
> > > > > > > > > > > > > +		ufshpb_rsp_req_region_update(hpb, rsp_field);
> > > > > > > > > > > > > +		break;
> > > > > > > > > > > > > +	case HPB_RSP_DEV_RESET:
> > > > > > > > > > > > > +		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
> > > > > > > > > > > > > +			 "UFS device lost HPB information during PM.\n");
> > > > > > > > > > > > > +		break;
> > > > > > > > > > > > Hi Daejun,
> > > > > > > > > > > > This series looks good to me. Just here I have one question. You
> > > > > > > > > > > > didn't handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS,
> > > > > > > > > > > > how to handle HPB_RSP_DEV_RESET from the host side? Do you think we
> > > > > > > > > > > > should reset host side HPB entry as well or what else?
> > > > > > > > > > > > Bean
> > > > > > > > > > > Same question here - I am still collecting feedback from flash
> > > > > > > > > > > vendors about what is recommended host behavior on reception of
> > > > > > > > > > > HPB Op code 0x2, since it
Re: [PATCH v6 03/10] scsi: ufshpb: Add region's reads counter
On 2021-03-22 16:10, Avri Altman wrote:

In host control mode, reads are the major source of activation trials.
Keep track of those reads counters, for both active as well as inactive
regions.

We reset the read counter upon write - we are only interested in "clean"
reads.

Keep those counters normalized, as we are using those reads as a
comparative score, to make various decisions. If during consecutive
normalizations an active region has exhausted its reads - inactivate it.

While at it, protect the {active,inactive}_count stats by adding them
into the applicable handler.

Signed-off-by: Avri Altman
---
 drivers/scsi/ufs/ufshpb.c | 100 +++---
 drivers/scsi/ufs/ufshpb.h |   5 ++
 2 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index d4f0bb6d8fa1..a1519cbb4ce0 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,6 +16,8 @@
 #include "ufshpb.h"
 #include "../sd.h"

+#define ACTIVATION_THRESHOLD 8 /* 8 IOs */
+
 /* memory management */
 static struct kmem_cache *ufshpb_mctx_cache;
 static mempool_t *ufshpb_mctx_pool;
@@ -546,6 +548,23 @@ static int ufshpb_issue_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd,
 	return ret;
 }

+static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx,
+				      int srgn_idx)
+{
+	struct ufshpb_region *rgn;
+	struct ufshpb_subregion *srgn;
+
+	rgn = hpb->rgn_tbl + rgn_idx;
+	srgn = rgn->srgn_tbl + srgn_idx;
+
+	list_del_init(&rgn->list_inact_rgn);
+
+	if (list_empty(&srgn->list_act_srgn))
+		list_add_tail(&srgn->list_act_srgn, &hpb->lh_act_srgn);
+
+	hpb->stats.rb_active_cnt++;
+}
+
 /*
  * This function will set up HPB read command using host-side L2P map data.
  */
@@ -596,12 +615,43 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
 		ufshpb_set_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset,
 				     transfer_len);
 		spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
+
+		if (hpb->is_hcm) {
+			spin_lock(&rgn->rgn_lock);
+			rgn->reads = 0;
+			spin_unlock(&rgn->rgn_lock);
+		}
+
 		return 0;
 	}

 	if (!ufshpb_is_support_chunk(hpb, transfer_len))
 		return 0;

+	if (hpb->is_hcm) {
+		bool activate = false;
+		/*
+		 * in host control mode, reads are the main source for
+		 * activation trials.
+		 */
+		spin_lock(&rgn->rgn_lock);
+		rgn->reads++;
+		if (rgn->reads == ACTIVATION_THRESHOLD)
+			activate = true;
+		spin_unlock(&rgn->rgn_lock);
+		if (activate) {
+			spin_lock_irqsave(&hpb->rsp_list_lock, flags);
+			ufshpb_update_active_info(hpb, rgn_idx, srgn_idx);

If a transfer_len (possible with HPB2.0) sits across two
regions/sub-regions, here it only updates active info of the first
region/sub-region.

Thanks,
Can Guo.

+			spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
+			dev_dbg(&hpb->sdev_ufs_lu->sdev_dev,
+				"activate region %d-%d\n", rgn_idx, srgn_idx);
+		}
+
+		/* keep those counters normalized */
+		if (rgn->reads > hpb->entries_per_srgn)
+			schedule_work(&hpb->ufshpb_normalization_work);
+	}
+
 	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
 	if (ufshpb_test_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset,
 				   transfer_len)) {
@@ -741,21 +791,6 @@ static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
 	return 0;
 }

-static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx,
-				      int srgn_idx)
-{
-	struct ufshpb_region *rgn;
-	struct ufshpb_subregion *srgn;
-
-	rgn = hpb->rgn_tbl + rgn_idx;
-	srgn = rgn->srgn_tbl + srgn_idx;
-
-	list_del_init(&rgn->list_inact_rgn);
-
-	if (list_empty(&srgn->list_act_srgn))
-		list_add_tail(&srgn->list_act_srgn, &hpb->lh_act_srgn);
-}
-
 static void ufshpb_update_inactive_info(struct ufshpb_lu *hpb, int rgn_idx)
 {
 	struct ufshpb_region *rgn;
@@ -769,6 +804,8 @@ static void ufshpb_update_inactive_info(struct ufshpb_lu *hpb, int rgn_idx)

 	if (list_empty(&rgn->list_inact_rgn))
 		list_add_tail(&rgn->list_inact_rgn, &hpb->lh_inact_rgn);
+
+	hpb->stats.rb_inactive_cnt++;
 }

 static void ufshpb_activate_subregion(struct ufshpb_lu *hpb,
@@ -1089,6 +1126,7 @@ static int ufshpb_evict_region(struct ufshpb_lu
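To illustrate Can Guo's remark above about a transfer that straddles two regions or sub-regions, here is a minimal userspace sketch (not the driver's code) that walks every (region, sub-region) pair touched by a read. The geometry struct and the names `hpb_geom`, `visit_fn` and `for_each_touched_srgn` are assumptions made for this example; in the driver, the per-pair callback would presumably be something like ufshpb_update_active_info().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical geometry: map entries per sub-region, sub-regions per region. */
struct hpb_geom {
	unsigned int entries_per_srgn;
	unsigned int srgns_per_rgn;
};

typedef void (*visit_fn)(unsigned int rgn_idx, unsigned int srgn_idx,
			 void *ctx);

/*
 * Call visit() once for every (region, sub-region) pair covered by the
 * LPN range [lpn, lpn + len). Returns the number of pairs covered.
 */
static unsigned int for_each_touched_srgn(const struct hpb_geom *g,
					  unsigned long lpn, unsigned int len,
					  visit_fn visit, void *ctx)
{
	unsigned long first, last, s;
	unsigned int n = 0;

	if (!len || !g->entries_per_srgn || !g->srgns_per_rgn)
		return 0;

	/* Global sub-region numbers of the first and the last page. */
	first = lpn / g->entries_per_srgn;
	last = (lpn + len - 1) / g->entries_per_srgn;

	for (s = first; s <= last; s++) {
		if (visit)
			visit((unsigned int)(s / g->srgns_per_rgn),
			      (unsigned int)(s % g->srgns_per_rgn), ctx);
		n++;
	}
	return n;
}
```

With a loop of this shape, a read that starts near the end of one sub-region updates the active info of both sub-regions it covers, rather than only the first one.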
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-24 16:37, Bean Huo wrote:
On Wed, 2021-03-24 at 09:45 +0800, Can Guo wrote:
On 2021-03-23 20:48, Avri Altman wrote:
> > On 2021-03-23 14:37, Daejun Park wrote:
> > > > On 2021-03-23 14:19, Daejun Park wrote:
> > > > > > On 2021-03-23 13:37, Daejun Park wrote:
> > > > > > > > On 2021-03-23 12:22, Can Guo wrote:
> > > > > > > > > On 2021-03-22 17:11, Bean Huo wrote:
> > > > > > > > > > On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:
> > > > > > > > > > > +	switch (rsp_field->hpb_op) {
> > > > > > > > > > > +	case HPB_RSP_REQ_REGION_UPDATE:
> > > > > > > > > > > +		if (data_seg_len != DEV_DATA_SEG_LEN)
> > > > > > > > > > > +			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
> > > > > > > > > > > +				 "%s: data seg length is not same.\n",
> > > > > > > > > > > +				 __func__);
> > > > > > > > > > > +		ufshpb_rsp_req_region_update(hpb, rsp_field);
> > > > > > > > > > > +		break;
> > > > > > > > > > > +	case HPB_RSP_DEV_RESET:
> > > > > > > > > > > +		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
> > > > > > > > > > > +			 "UFS device lost HPB information during PM.\n");
> > > > > > > > > > > +		break;
> > > > > > > > > >
> > > > > > > > > > Hi Daejun,
> > > > > > > > > >
> > > > > > > > > > This series looks good to me. Just here I have one
> > > > > > > > > > question. You didn't handle HPB_RSP_DEV_RESET, just a
> > > > > > > > > > warning. Based on your SS UFS, how to handle
> > > > > > > > > > HPB_RSP_DEV_RESET from the host side? Do you think we
> > > > > > > > > > should reset host side HPB entry as well or what else?
> > > > > > > > > >
> > > > > > > > > > Bean
> > > > > > > > >
> > > > > > > > > Same question here - I am still collecting feedback from
> > > > > > > > > flash vendors about what is the recommended host behavior
> > > > > > > > > on reception of HPB Op code 0x2, since it is not clearly
> > > > > > > > > defined in HPB2.0 specs.
> > > > > > > > >
> > > > > > > > > Can Guo.
> > > > > > > >
> > > > > > > > I think the question should be asked in the HPB2.0 patch,
> > > > > > > > since in HPB1.0 device control mode, a HPB reset in device
> > > > > > > > side does not impact anything in host side - host is not
> > > > > > > > writing back any HPB entries to device anyways and HPB Read
> > > > > > > > cmd with invalid HPB entries shall be treated as normal
> > > > > > > > Read(10) cmd without any problems.
> > > > > > >
> > > > > > > Yes, UFS device will process read command even t
Re: [PATCH v31 4/4] scsi: ufs: Add HPB 2.0 support
_jiffies(
+					hpb->params.requeue_timeout_ms);
+
+			if (time_before(jiffies, timeout))
+				return -EAGAIN;
+
+			hpb->stats.miss_cnt++;
+			return 0;
+		}
 	}

-	ufshpb_set_hpb_read_to_upiu(hpb, lrbp, lpn, ppn, transfer_len);
+	ufshpb_set_hpb_read_to_upiu(hpb, lrbp, lpn, ppn, transfer_len, read_id);

 	hpb->stats.hit_cnt++;
+
 	return 0;
 }
-static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb,
-					     struct ufshpb_subregion *srgn)
+
+static struct ufshpb_req *ufshpb_get_req(struct ufshpb_lu *hpb,
+					 int rgn_idx, enum req_opf dir,
+					 bool atomic)

You didn't mention this change in cover letter. And I don't see anyone
is passing "atomic" as true, neither in your patches nor Avri's V6 series
(from ufshpb_issue_umap_single_req()). If no one is using the flag, then
this is dead code. If Avri needs this flag, he can add it in host control
mode patches. Do I miss anything?

Thanks,
Can Guo.

 {
-	struct ufshpb_req *map_req;
+	struct ufshpb_req *rq;
 	struct request *req;
-	struct bio *bio;
 	int retries = HPB_MAP_REQ_RETRIES;

-	map_req = kmem_cache_alloc(hpb->map_req_cache, GFP_KERNEL);
-	if (!map_req)
+	rq = kmem_cache_alloc(hpb->map_req_cache, GFP_ATOMIC);
+	if (!rq)
 		return NULL;

 retry:
-	req = blk_get_request(hpb->sdev_ufs_lu->request_queue,
-			      REQ_OP_SCSI_IN, BLK_MQ_REQ_NOWAIT);
+	req = blk_get_request(hpb->sdev_ufs_lu->request_queue, dir,
+			      BLK_MQ_REQ_NOWAIT);

-	if ((PTR_ERR(req) == -EWOULDBLOCK) && (--retries > 0)) {
+	if (!atomic && (PTR_ERR(req) == -EWOULDBLOCK) && (--retries > 0)) {
 		usleep_range(3000, 3100);
 		goto retry;
 	}

 	if (IS_ERR(req))
-		goto free_map_req;
+		goto free_rq;
+
+	rq->hpb = hpb;
+	rq->req = req;
+	rq->rb.rgn_idx = rgn_idx;
+
+	return rq;
+
+free_rq:
+	kmem_cache_free(hpb->map_req_cache, rq);
+	return NULL;
+}
+
+static void ufshpb_put_req(struct ufshpb_lu *hpb, struct ufshpb_req *rq)
+{
+	blk_put_request(rq->req);
+	kmem_cache_free(hpb->map_req_cache, rq);
+}
+
+static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb,
+					     struct ufshpb_subregion *srgn)
+{
+	struct ufshpb_req *map_req;
+	struct bio *bio;
+
+	map_req = ufshpb_get_req(hpb, srgn->rgn_idx, REQ_OP_SCSI_IN, false);
+	if (!map_req)
+		return NULL;

 	bio = bio_alloc(GFP_KERNEL, hpb->pages_per_srgn);
 	if (!bio) {
-		blk_put_request(req);
-		goto free_map_req;
+		ufshpb_put_req(hpb, map_req);
+		return NULL;
 	}

-	map_req->hpb = hpb;
-	map_req->req = req;
 	map_req->bio = bio;

-	map_req->rgn_idx = srgn->rgn_idx;
-	map_req->srgn_idx = srgn->srgn_idx;
-	map_req->mctx = srgn->mctx;
+	map_req->rb.srgn_idx = srgn->srgn_idx;
+	map_req->rb.mctx = srgn->mctx;

 	return map_req;
-
-free_map_req:
-	kmem_cache_free(hpb->map_req_cache, map_req);
-	return NULL;
 }

 static void ufshpb_put_map_req(struct ufshpb_lu *hpb,
 			       struct ufshpb_req *map_req)
 {
 	bio_put(map_req->bio);
-	blk_put_request(map_req->req);
-	kmem_cache_free(hpb->map_req_cache, map_req);
+	ufshpb_put_req(hpb, map_req);
 }

 static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
@@ -491,6 +790,13 @@ static void ufshpb_activate_subregion(struct ufshpb_lu *hpb,
 	srgn->srgn_state = HPB_SRGN_VALID;
 }

+static void ufshpb_umap_req_compl_fn(struct request *req, blk_status_t error)
+{
+	struct ufshpb_req *umap_req = (struct ufshpb_req *)req->end_io_data;
+
+	ufshpb_put_req(umap_req->hpb, umap_req);
+}
+
 static void ufshpb_map_req_compl_fn(struct request *req, blk_status_t error)
 {
 	struct ufshpb_req *map_req = (struct ufshpb_req *) req->end_io_data;
@@ -498,8 +804,8 @@ static void ufshpb_map_req_compl_fn(struct request *req, blk_status_t error)
 	struct ufshpb_subregion *srgn;
 	unsigned long flags;

-	srgn = hpb->rgn_tbl[map_req->rgn_idx].srgn_tbl +
-		map_req->srgn_idx;
+	srgn = hpb->rgn_tbl[map_req->rb.rgn_idx].srgn_tbl +
+		map_req->rb.srgn_idx;

 	ufshpb_clear_dirty_bitmap(hpb, srgn);
 	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
@@ -509,6 +815,16 @@ static void ufshpb_map_req_compl_f
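The requeue check at the top of the quoted hunk relies on time_before() to compare jiffies values. As a rough userspace sketch of why such a comparison stays correct when the tick counter wraps, it can be written as a signed difference. The names `tick_before` and `should_requeue` are hypothetical and the 32-bit counter is an assumption for the example; only the idea mirrors the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if tick a is earlier than tick b, even across counter wraparound. */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical requeue decision: keep answering "try again" until the
 * timeout has expired, then give up and fall back to a normal read.
 */
static bool should_requeue(uint32_t now, uint32_t start,
			   uint32_t timeout_ticks)
{
	return tick_before(now, start + timeout_ticks);
}
```

A plain `now < start + timeout_ticks` would give the wrong answer in the window where the deadline has wrapped past zero but `now` has not, which is the case the signed difference handles.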
Re: [PATCH v6 02/10] scsi: ufshpb: Add host control mode support to rsp_upiu
On 2021-03-24 11:31, Zang Leigang wrote:
On Mon, Mar 22, 2021 at 10:10:36AM +0200, Avri Altman wrote:

In device control mode, the device may recommend the host to either
activate or inactivate a region, and the host should follow. Meaning
those are not actually recommendations, but more of instructions.

On the contrary, in host control mode, the recommendation protocol is
slightly changed:
a) The device may only recommend the host to update a subregion of an
   already-active region. And,
b) The device may *not* recommend to inactivate a region.

Furthermore, in host control mode, the host may choose not to follow any
of the device's recommendations. However, in case of a recommendation to
update an active and clean subregion, it is better to follow that
recommendation because otherwise the host has no other way to know that
some internal relocation took place.

Signed-off-by: Avri Altman
---
 drivers/scsi/ufs/ufshpb.c | 34 +-
 drivers/scsi/ufs/ufshpb.h |  2 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index fb10afcbb49f..d4f0bb6d8fa1 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -166,6 +166,8 @@ static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
 	else
 		set_bit_len = cnt;

+	set_bit(RGN_FLAG_DIRTY, &rgn->rgn_flags);
+
 	if (rgn->rgn_state != HPB_RGN_INACTIVE &&
 	    srgn->srgn_state == HPB_SRGN_VALID)
 		bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
@@ -235,6 +237,11 @@ static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
 	return false;
 }

+static inline bool is_rgn_dirty(struct ufshpb_region *rgn)
+{
+	return test_bit(RGN_FLAG_DIRTY, &rgn->rgn_flags);
+}
+
 static int ufshpb_fill_ppn_from_page(struct ufshpb_lu *hpb,
 				     struct ufshpb_map_ctx *mctx, int pos,
 				     int len, u64 *ppn_buf)
@@ -713,6 +720,7 @@ static void ufshpb_put_map_req(struct ufshpb_lu *hpb,
 static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
 				     struct ufshpb_subregion *srgn)
 {
+	struct ufshpb_region *rgn;
 	u32 num_entries = hpb->entries_per_srgn;

 	if (!srgn->mctx) {
@@ -726,6 +734,10 @@ static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
 		num_entries = hpb->last_srgn_entries;

 	bitmap_zero(srgn->mctx->ppn_dirty, num_entries);
+
+	rgn = hpb->rgn_tbl + srgn->rgn_idx;
+	clear_bit(RGN_FLAG_DIRTY, &rgn->rgn_flags);
+
 	return 0;
 }

@@ -1245,6 +1257,18 @@ static void ufshpb_rsp_req_region_update(struct ufshpb_lu *hpb,
 		srgn_i =
 			be16_to_cpu(rsp_field->hpb_active_field[i].active_srgn);

+		rgn = hpb->rgn_tbl + rgn_i;
+		if (hpb->is_hcm &&
+		    (rgn->rgn_state != HPB_RGN_ACTIVE || is_rgn_dirty(rgn))) {
+			/*
+			 * in host control mode, subregion activation
+			 * recommendations are only allowed to active regions.
+			 * Also, ignore recommendations for dirty regions - the
+			 * host will make decisions concerning those by himself
+			 */
+			continue;
+		}
+

Hi Avri, host control mode also needs the recommendations from device,
because the bkops would make the ppn invalid, is that right?

Right, but ONLY recommendations to ACTIVE regions are of host's interest
in host control mode. For those inactive regions, host makes its own
decisions.

Can Guo.

 		dev_dbg(&hpb->sdev_ufs_lu->sdev_dev,
 			"activate(%d) region %d - %d\n", i, rgn_i, srgn_i);
@@ -1252,7 +1276,6 @@ static void ufshpb_rsp_req_region_update(struct ufshpb_lu *hpb,
 		ufshpb_update_active_info(hpb, rgn_i, srgn_i);
 		spin_unlock(&hpb->rsp_list_lock);

-		rgn = hpb->rgn_tbl + rgn_i;
 		srgn = rgn->srgn_tbl + srgn_i;

 		/* blocking HPB_READ */
@@ -1263,6 +1286,14 @@ static void ufshpb_rsp_req_region_update(struct ufshpb_lu *hpb,
 		hpb->stats.rb_active_cnt++;
 	}

+	if (hpb->is_hcm) {
+		/*
+		 * in host control mode the device is not allowed to inactivate
+		 * regions
+		 */
+		goto out;
+	}
+
 	for (i = 0; i < rsp_field->inactive_rgn_cnt; i++) {
 		rgn_i = be16_to_cpu(rsp_field->hpb_inactive_field[i]);
 		dev_dbg(&hpb->sdev_ufs_lu->sdev_dev,
@@ -1287,6 +1318,7 @@ static void ufshpb_rsp_req_region_update(struct ufshpb_lu *hpb,
 		hpb->stats.rb_inactive_cnt++;
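The filtering rule discussed above (in host control mode, follow an update recommendation only for a region that is active and clean) can be sketched as a small predicate. This is a simplified stand-alone model, not the driver code; the enum, the struct and the function name `follow_activation_hint` are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

enum rgn_state { RGN_INACTIVE, RGN_ACTIVE };

/* Simplified model of a region: its state plus a dirty flag. */
struct rgn_model {
	enum rgn_state state;
	bool dirty;	/* set on write, cleared when the map is refreshed */
};

/*
 * Should the host act on a device recommendation to (re)activate a
 * sub-region of this region?
 */
static bool follow_activation_hint(bool host_control_mode,
				   const struct rgn_model *rgn)
{
	/* Device control mode: recommendations are effectively instructions. */
	if (!host_control_mode)
		return true;

	/* Host control mode: only active and clean regions are of interest. */
	return rgn->state == RGN_ACTIVE && !rgn->dirty;
}
```

The dirty and inactive cases are skipped because, as the quoted comment says, the host makes its own decisions about those regions.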
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-23 20:48, Avri Altman wrote:
On 2021-03-23 14:37, Daejun Park wrote:
>> On 2021-03-23 14:19, Daejun Park wrote:
>>>> On 2021-03-23 13:37, Daejun Park wrote:
>>>>>> On 2021-03-23 12:22, Can Guo wrote:
>>>>>>> On 2021-03-22 17:11, Bean Huo wrote:
>>>>>>>> On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:
>>>>>>>>> +	switch (rsp_field->hpb_op) {
>>>>>>>>> +	case HPB_RSP_REQ_REGION_UPDATE:
>>>>>>>>> +		if (data_seg_len != DEV_DATA_SEG_LEN)
>>>>>>>>> +			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
>>>>>>>>> +				 "%s: data seg length is not same.\n",
>>>>>>>>> +				 __func__);
>>>>>>>>> +		ufshpb_rsp_req_region_update(hpb, rsp_field);
>>>>>>>>> +		break;
>>>>>>>>> +	case HPB_RSP_DEV_RESET:
>>>>>>>>> +		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
>>>>>>>>> +			 "UFS device lost HPB information during PM.\n");
>>>>>>>>> +		break;
>>>>>>>>
>>>>>>>> Hi Daejun,
>>>>>>>>
>>>>>>>> This series looks good to me. Just here I have one question. You
>>>>>>>> didn't handle HPB_RSP_DEV_RESET, just a warning. Based on your SS
>>>>>>>> UFS, how to handle HPB_RSP_DEV_RESET from the host side? Do you
>>>>>>>> think we should reset host side HPB entry as well or what else?
>>>>>>>>
>>>>>>>> Bean
>>>>>>>
>>>>>>> Same question here - I am still collecting feedback from flash
>>>>>>> vendors about what is the recommended host behavior on reception
>>>>>>> of HPB Op code 0x2, since it is not clearly defined in HPB2.0
>>>>>>> specs.
>>>>>>>
>>>>>>> Can Guo.
>>>>>>
>>>>>> I think the question should be asked in the HPB2.0 patch, since in
>>>>>> HPB1.0 device control mode, a HPB reset in device side does not
>>>>>> impact anything in host side - host is not writing back any HPB
>>>>>> entries to device anyways and HPB Read cmd with invalid HPB entries
>>>>>> shall be treated as normal Read(10) cmd without any problems.
>>>>>
>>>>> Yes, UFS device will process read command whether the HPB entries
>>>>> are valid or not. So it is warning about read performance drop by
>>>>> dev reset.
>>>>
>>>> Yeah, but still I am not 100% sure about what should host do in case
>>>> of HPB2.0 when it receives HPB Op code 0x2, I am waiting for
>>>> feedback.
>>>
>>> I think the host has two choices when it receives 0x2.
>>> One is nothing on host.
>>> The other is discarding all HPB entries in the host.
>>>
>>> In the JEDEC HPB spec, it says as follows:
>>> When the device is powered off by the host, the device may restore
>>> L2P map data upon power up or build from the host’s HPB READ command.
>>>
>>> If some UFS builds L2P map data from the host's HPB READ commands, we
>>> don't have to discard HPB entries in the host.
>>>
>>> So I think there is nothing to do when it receives 0x2.
>>
>> But in HPB2.0, if we do nothing to active regio
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-23 14:37, Daejun Park wrote:
On 2021-03-23 14:19, Daejun Park wrote:
On 2021-03-23 13:37, Daejun Park wrote:
On 2021-03-23 12:22, Can Guo wrote:
On 2021-03-22 17:11, Bean Huo wrote:
On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:

+	switch (rsp_field->hpb_op) {
+	case HPB_RSP_REQ_REGION_UPDATE:
+		if (data_seg_len != DEV_DATA_SEG_LEN)
+			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+				 "%s: data seg length is not same.\n",
+				 __func__);
+		ufshpb_rsp_req_region_update(hpb, rsp_field);
+		break;
+	case HPB_RSP_DEV_RESET:
+		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+			 "UFS device lost HPB information during PM.\n");
+		break;

Hi Daejun,

This series looks good to me. Just here I have one question. You didn't
handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS, how to
handle HPB_RSP_DEV_RESET from the host side? Do you think we should
reset host side HPB entry as well or what else?

Bean

Same question here - I am still collecting feedback from flash vendors
about what is the recommended host behavior on reception of HPB Op code
0x2, since it is not clearly defined in HPB2.0 specs.

Can Guo.

I think the question should be asked in the HPB2.0 patch, since in
HPB1.0 device control mode, a HPB reset in device side does not impact
anything in host side - host is not writing back any HPB entries to
device anyways and HPB Read cmd with invalid HPB entries shall be
treated as normal Read(10) cmd without any problems.

Yes, UFS device will process read command whether the HPB entries are
valid or not. So it is warning about read performance drop by dev reset.

Yeah, but still I am not 100% sure about what should host do in case of
HPB2.0 when it receives HPB Op code 0x2, I am waiting for feedback.

I think the host has two choices when it receives 0x2.
One is nothing on host.
The other is discarding all HPB entries in the host.

In the JEDEC HPB spec, it says as follows:
When the device is powered off by the host, the device may restore L2P
map data upon power up or build from the host’s HPB READ command.

If some UFS builds L2P map data from the host's HPB READ commands, we
don't have to discard HPB entries in the host.

So I think there is nothing to do when it receives 0x2.

But in HPB2.0, if we do nothing to active regions in host side, host can
write HPB entries (which host thinks valid, but actually invalid in
device side since reset happened) back to device through HPB Write
Buffer cmds (BUFFER ID = 0x2). My question is: are all UFSs OK with
this?

Yes, it must be OK. Please refer to the following from the HPB 2.0 spec:
If the HPB Entries sent by HPB WRITE BUFFER are removed by the device,
for example, because they are not consumed for a long enough period of
time, then the HPB READ command for the removed HPB entries shall be
handled as a normal READ command.

No, it is talking about the subsequent HPB READ cmd sent after a HPB
WRITE BUFFER cmd, but not the HPB WRITE BUFFER cmd itself...

Thanks,
Can Guo.

Thanks,
Daejun

Thanks,
Can Guo.

Thanks,
Daejun

Thanks,
Can Guo.

Thanks,
Daejun

Please correct me if I am wrong.

Thanks,
Can Guo.
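The two host-side options Daejun lists above for HPB Op code 0x2 can be modeled in a few lines. This is only a sketch under the assumption that a per-region "valid" flag stands in for the host's cached L2P entries; the policy enum and the function name `handle_dev_reset` are hypothetical, and the thread itself does not settle which policy is right for HPB 2.0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum reset_policy {
	RESET_DO_NOTHING,	/* keep host entries as they are */
	RESET_DISCARD_ALL,	/* drop every cached entry on the host */
};

/* Apply the chosen policy; returns how many regions were invalidated. */
static size_t handle_dev_reset(bool *rgn_valid, size_t nr_rgns,
			       enum reset_policy policy)
{
	size_t i, dropped = 0;

	if (policy == RESET_DO_NOTHING)
		return 0;

	for (i = 0; i < nr_rgns; i++) {
		if (rgn_valid[i]) {
			rgn_valid[i] = false;
			dropped++;
		}
	}
	return dropped;
}
```

Under the discard policy the host no longer holds entries it could send back through HPB WRITE BUFFER after the reset, which is the case Can Guo asks about; under the do-nothing policy that question stays open.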
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-23 14:19, Daejun Park wrote:
On 2021-03-23 13:37, Daejun Park wrote:
On 2021-03-23 12:22, Can Guo wrote:
On 2021-03-22 17:11, Bean Huo wrote:
On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:

+	switch (rsp_field->hpb_op) {
+	case HPB_RSP_REQ_REGION_UPDATE:
+		if (data_seg_len != DEV_DATA_SEG_LEN)
+			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+				 "%s: data seg length is not same.\n",
+				 __func__);
+		ufshpb_rsp_req_region_update(hpb, rsp_field);
+		break;
+	case HPB_RSP_DEV_RESET:
+		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+			 "UFS device lost HPB information during PM.\n");
+		break;

Hi Daejun,

This series looks good to me. Just here I have one question. You didn't
handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS, how to
handle HPB_RSP_DEV_RESET from the host side? Do you think we should
reset host side HPB entry as well or what else?

Bean

Same question here - I am still collecting feedback from flash vendors
about what is the recommended host behavior on reception of HPB Op code
0x2, since it is not clearly defined in HPB2.0 specs.

Can Guo.

I think the question should be asked in the HPB2.0 patch, since in
HPB1.0 device control mode, a HPB reset in device side does not impact
anything in host side - host is not writing back any HPB entries to
device anyways and HPB Read cmd with invalid HPB entries shall be
treated as normal Read(10) cmd without any problems.

Yes, UFS device will process read command whether the HPB entries are
valid or not. So it is warning about read performance drop by dev reset.

Yeah, but still I am not 100% sure about what should host do in case of
HPB2.0 when it receives HPB Op code 0x2, I am waiting for feedback.

I think the host has two choices when it receives 0x2.
One is nothing on host.
The other is discarding all HPB entries in the host.

In the JEDEC HPB spec, it says as follows:
When the device is powered off by the host, the device may restore L2P
map data upon power up or build from the host’s HPB READ command.

If some UFS builds L2P map data from the host's HPB READ commands, we
don't have to discard HPB entries in the host.

So I think there is nothing to do when it receives 0x2.

But in HPB2.0, if we do nothing to active regions in host side, host can
write HPB entries (which host thinks valid, but actually invalid in
device side since reset happened) back to device through HPB Write
Buffer cmds (BUFFER ID = 0x2). My question is: are all UFSs OK with
this?

Thanks,
Can Guo.

Thanks,
Daejun

Thanks,
Can Guo.

Thanks,
Daejun

Please correct me if I am wrong.

Thanks,
Can Guo.
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-23 13:37, Daejun Park wrote:
On 2021-03-23 12:22, Can Guo wrote:
On 2021-03-22 17:11, Bean Huo wrote:
On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:

+	switch (rsp_field->hpb_op) {
+	case HPB_RSP_REQ_REGION_UPDATE:
+		if (data_seg_len != DEV_DATA_SEG_LEN)
+			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+				 "%s: data seg length is not same.\n",
+				 __func__);
+		ufshpb_rsp_req_region_update(hpb, rsp_field);
+		break;
+	case HPB_RSP_DEV_RESET:
+		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+			 "UFS device lost HPB information during PM.\n");
+		break;

Hi Daejun,

This series looks good to me. Just here I have one question. You didn't
handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS, how to
handle HPB_RSP_DEV_RESET from the host side? Do you think we should
reset host side HPB entry as well or what else?

Bean

Same question here - I am still collecting feedback from flash vendors
about what is the recommended host behavior on reception of HPB Op code
0x2, since it is not clearly defined in HPB2.0 specs.

Can Guo.

I think the question should be asked in the HPB2.0 patch, since in
HPB1.0 device control mode, a HPB reset in device side does not impact
anything in host side - host is not writing back any HPB entries to
device anyways and HPB Read cmd with invalid HPB entries shall be
treated as normal Read(10) cmd without any problems.

Yes, UFS device will process read command whether the HPB entries are
valid or not. So it is warning about read performance drop by dev reset.

Yeah, but still I am not 100% sure about what should host do in case of
HPB2.0 when it receives HPB Op code 0x2, I am waiting for feedback.

Thanks,
Can Guo.

Thanks,
Daejun

Please correct me if I am wrong.

Thanks,
Can Guo.
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-23 12:22, Can Guo wrote:
On 2021-03-22 17:11, Bean Huo wrote:
On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:

+	switch (rsp_field->hpb_op) {
+	case HPB_RSP_REQ_REGION_UPDATE:
+		if (data_seg_len != DEV_DATA_SEG_LEN)
+			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+				 "%s: data seg length is not same.\n",
+				 __func__);
+		ufshpb_rsp_req_region_update(hpb, rsp_field);
+		break;
+	case HPB_RSP_DEV_RESET:
+		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+			 "UFS device lost HPB information during PM.\n");
+		break;

Hi Daejun,

This series looks good to me. Just here I have one question. You didn't
handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS, how to
handle HPB_RSP_DEV_RESET from the host side? Do you think we should
reset host side HPB entry as well or what else?

Bean

Same question here - I am still collecting feedback from flash vendors
about what is the recommended host behavior on reception of HPB Op code
0x2, since it is not clearly defined in HPB2.0 specs.

Can Guo.

I think the question should be asked in the HPB2.0 patch, since in
HPB1.0 device control mode, a HPB reset in device side does not impact
anything in host side - host is not writing back any HPB entries to
device anyways and HPB Read cmd with invalid HPB entries shall be
treated as normal Read(10) cmd without any problems.

Please correct me if I am wrong.

Thanks,
Can Guo.
Re: [PATCH] drivers: scsi: Remove duplicate include of blkdev.h
On 2021-03-22 20:28, Wan Jiabing wrote:

linux/blkdev.h has been included at line 18, so remove the duplicate
include at line 27.

Signed-off-by: Wan Jiabing
---
 drivers/scsi/ufs/ufshcd.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index c86760788c72..e8aa7de17d0a 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -24,7 +24,6 @@
 #include "ufs_bsg.h"
 #include "ufshcd-crypto.h"
 #include
-#include <linux/blkdev.h>

 #define CREATE_TRACE_POINTS
 #include

Someone has addressed it before you - check
https://git.kernel.org/mkp/scsi/c/b4388e3db56a

Can Guo.
Re: [PATCH v31 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-22 17:11, Bean Huo wrote:
On Mon, 2021-03-22 at 15:54 +0900, Daejun Park wrote:

+	switch (rsp_field->hpb_op) {
+	case HPB_RSP_REQ_REGION_UPDATE:
+		if (data_seg_len != DEV_DATA_SEG_LEN)
+			dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+				 "%s: data seg length is not same.\n",
+				 __func__);
+		ufshpb_rsp_req_region_update(hpb, rsp_field);
+		break;
+	case HPB_RSP_DEV_RESET:
+		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
+			 "UFS device lost HPB information during PM.\n");
+		break;

Hi Daejun,

This series looks good to me. Just here I have one question. You didn't
handle HPB_RSP_DEV_RESET, just a warning. Based on your SS UFS, how to
handle HPB_RSP_DEV_RESET from the host side? Do you think we should
reset host side HPB entry as well or what else?

Bean

Same question here - I am still collecting feedback from flash vendors
about what is the recommended host behavior on reception of HPB Op code
0x2, since it is not clearly defined in HPB2.0 specs.

Can Guo.
Re: [PATCH] scsi: ufs: Add selector to ufshcd_query_flag* APIs
On 2021-03-17 11:31, Daejun Park wrote:

Unlike other query APIs in UFS, ufshcd_query_flag has a fixed selector
as 0. This patch allows ufshcd_query_flag API to choose selector value
by parameter.

Signed-off-by: Daejun Park

Reviewed-by: Can Guo

---
 drivers/scsi/ufs/ufs-sysfs.c |  2 +-
 drivers/scsi/ufs/ufshcd.c    | 29 +
 drivers/scsi/ufs/ufshcd.h    |  2 +-
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index acc54f530f2d..606b058a3394 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -746,7 +746,7 @@ static ssize_t _name##_show(struct device *dev,	\
 		index = ufshcd_wb_get_query_index(hba);		\
 	pm_runtime_get_sync(hba->dev);				\
 	ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG,	\
-		QUERY_FLAG_IDN##_uname, index, &flag);		\
+		QUERY_FLAG_IDN##_uname, index, &flag, 0);	\
 	pm_runtime_put_sync(hba->dev);				\
 	if (ret) {						\
 		ret = -EINVAL;					\
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 8c0ff024231c..c2fd9c58d6b8 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2940,13 +2940,15 @@ static inline void ufshcd_init_query(struct ufs_hba *hba,
 }

 static int ufshcd_query_flag_retry(struct ufs_hba *hba,
-	enum query_opcode opcode, enum flag_idn idn, u8 index, bool *flag_res)
+	enum query_opcode opcode, enum flag_idn idn, u8 index, bool *flag_res,
+	u8 selector)
 {
 	int ret;
 	int retries;

 	for (retries = 0; retries < QUERY_REQ_RETRIES; retries++) {
-		ret = ufshcd_query_flag(hba, opcode, idn, index, flag_res);
+		ret = ufshcd_query_flag(hba, opcode, idn, index, flag_res,
+					selector);
 		if (ret)
 			dev_dbg(hba->dev,
 				"%s: failed with error %d, retries %d\n",
@@ -2969,15 +2971,17 @@ static int ufshcd_query_flag_retry(struct ufs_hba *hba,
  * @idn: flag idn to access
  * @index: flag index to access
  * @flag_res: the flag value after the query request completes
+ * @selector: selector field
  *
  * Returns 0 for success, non-zero in case of failure
  */
 int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
-			enum flag_idn idn, u8 index, bool *flag_res)
+			enum flag_idn idn, u8 index, bool *flag_res,
+			u8 selector)
 {
 	struct ufs_query_req *request = NULL;
 	struct ufs_query_res *response = NULL;
-	int err, selector = 0;
+	int err;
 	int timeout = QUERY_REQ_TIMEOUT;

 	BUG_ON(!hba);
@@ -4331,7 +4335,7 @@ static int ufshcd_complete_dev_init(struct ufs_hba *hba)
 	ktime_t timeout;

 	err = ufshcd_query_flag_retry(hba, UPIU_QUERY_OPCODE_SET_FLAG,
-		QUERY_FLAG_IDN_FDEVICEINIT, 0, NULL);
+		QUERY_FLAG_IDN_FDEVICEINIT, 0, NULL, 0);
 	if (err) {
 		dev_err(hba->dev,
 			"%s setting fDeviceInit flag failed with error %d\n",
@@ -4343,7 +4347,8 @@ static int ufshcd_complete_dev_init(struct ufs_hba *hba)
 	timeout = ktime_add_ms(ktime_get(), FDEVICEINIT_COMPL_TIMEOUT);
 	do {
 		err = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG,
-					QUERY_FLAG_IDN_FDEVICEINIT, 0, &flag_res);
+					QUERY_FLAG_IDN_FDEVICEINIT, 0, &flag_res,
+					0);
 		if (!flag_res)
 			break;
 		usleep_range(5000, 10000);
@@ -5250,7 +5255,7 @@ static int ufshcd_enable_auto_bkops(struct ufs_hba *hba)
 		goto out;

 	err = ufshcd_query_flag_retry(hba, UPIU_QUERY_OPCODE_SET_FLAG,
-			QUERY_FLAG_IDN_BKOPS_EN, 0, NULL);
+			QUERY_FLAG_IDN_BKOPS_EN, 0, NULL, 0);
 	if (err) {
 		dev_err(hba->dev, "%s: failed to enable bkops %d\n",
 				__func__, err);
@@ -5300,7 +5305,7 @@ static int ufshcd_disable_auto_bkops(struct ufs_hba *hba)
 	}

 	err = ufshcd_query_flag_retry(hba, UPIU_QUERY_OPCODE_CLEAR_FLAG,
-			QUERY_FLAG_IDN_BKOPS_EN, 0, NULL);
+			QUERY_FLAG_IDN_BKOPS_EN, 0, NULL, 0);
 	if (err) {
 		dev_err(hba->dev, "%s: failed to disable bkops %d\n",
 				__func__, err);
@@ -5463,7 +5468,7 @@ int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable)
 	index = ufshcd_wb_get_query_
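As an illustration of the API change in this patch, the following stand-alone sketch models a query helper whose selector used to be fixed at 0 and is now supplied by the caller and passed through a retry wrapper. Everything here (`fake_query_flag`, `fake_query_flag_retry`, `struct mock_query`, the retry count) is a mock invented for the example, not the ufshcd API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_QUERY_RETRIES 3

/* Records the parameters of the most recent mock query. */
struct mock_query {
	uint8_t idn;
	uint8_t index;
	uint8_t selector;
};

static struct mock_query last_query;
static int fail_first_n;	/* how many upcoming attempts should fail */

/* Mock query: the selector is a parameter instead of a hard-coded 0. */
static int fake_query_flag(uint8_t idn, uint8_t index, bool *flag_res,
			   uint8_t selector)
{
	last_query = (struct mock_query){ idn, index, selector };
	if (fail_first_n > 0) {
		fail_first_n--;
		return -1;
	}
	if (flag_res)
		*flag_res = true;
	return 0;
}

/* Retry wrapper: forwards the caller's selector on every attempt. */
static int fake_query_flag_retry(uint8_t idn, uint8_t index, bool *flag_res,
				 uint8_t selector)
{
	int ret = -1;
	int i;

	for (i = 0; i < MOCK_QUERY_RETRIES; i++) {
		ret = fake_query_flag(idn, index, flag_res, selector);
		if (!ret)
			break;
	}
	return ret;
}
```

Existing callers keep their behavior by passing 0, as the hunks above do, while a new caller can choose a different selector without another helper.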
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-18 10:02, Daejun Park wrote:
On 2021-03-17 09:42, Daejun Park wrote:
On 2021-03-15 15:23, Can Guo wrote:
On 2021-03-15 15:07, Daejun Park wrote:

This patch supports the HPB 2.0.

The HPB 2.0 supports read of varying sizes from 4KB to 512KB.
In the case of Read (<= 32KB) is supported as single HPB read.
In the case of Read (36KB ~ 512KB) is supported by as a combination of
write buffer command and HPB read command to deliver more PPN.
The write buffer commands may not be issued immediately due to busy
tags. To use HPB read more aggressively, the driver can requeue the
write buffer command. The requeue threshold is implemented as timeout
and can be modified with requeue_timeout_ms entry in sysfs.

Signed-off-by: Daejun Park
---
+static struct attribute *hpb_dev_param_attrs[] = {
+	&dev_attr_requeue_timeout_ms.attr,
+	NULL,
+};
+
+struct attribute_group ufs_sysfs_hpb_param_group = {
+	.name = "hpb_param_sysfs",
+	.attrs = hpb_dev_param_attrs,
+};
+
+static int ufshpb_pre_req_mempool_init(struct ufshpb_lu *hpb)
+{
+	struct ufshpb_req *pre_req = NULL;
+	int qd = hpb->sdev_ufs_lu->queue_depth / 2;
+	int i, j;
+
+	INIT_LIST_HEAD(&hpb->lh_pre_req_free);
+
+	hpb->pre_req = kcalloc(qd, sizeof(struct ufshpb_req), GFP_KERNEL);
+	hpb->throttle_pre_req = qd;
+	hpb->num_inflight_pre_req = 0;
+
+	if (!hpb->pre_req)
+		goto release_mem;
+
+	for (i = 0; i < qd; i++) {
+		pre_req = hpb->pre_req + i;
+		INIT_LIST_HEAD(&pre_req->list_req);
+		pre_req->req = NULL;
+		pre_req->bio = NULL;

Why don't prepare bio as same as wb.m_page? Won't that save more time
for ufshpb_issue_pre_req()?

It is pre_req pool. So although we prepare bio at this time, it just
only for first pre_req.

I meant removing the bio_alloc() in ufshpb_issue_pre_req() and bio_put()
in ufshpb_pre_req_compl_fn(). bios, in pre_req's case, just hold a page.
So, prepare 16 (if queue depth is 32) bios here, just use them along
with wb.m_page and call bio_reset() in ufshpb_pre_req_compl_fn(). Shall
it work?

If it works, you can even have the bio_add_pc_page() called here. Later
in ufshpb_execute_pre_req(), you don't need to call
ufshpb_pre_req_add_bio_page(), just call ufshpb_prep_entry() once
instead - it save many repeated steps for a pre_req, and you don't even
need to call bio_reset() in this case, since for a bio, nothing changes
after it is binded with a specific page...

Hi, Can Guo

I tried the idea that you suggested, but it doesn't work properly.
This optimization should be done next time for enhancement.

Can you elaborate please? Any error seen?

Per my understanding, in the case for pre_reqs, a bio is no different
from a page. Here it can reserve 16 pages for later use, which can be
done the same for bios.

I found some problem with re-using pre allocated bio.
The following kernel message is related with problem.

[2.750530] [ cut here ]
[2.751404] WARNING: CPU: 4 PID: 170 at drivers/scsi/scsi_lib.c:1020 scsi_alloc_sgtables+0x253/0x2b0
[2.753054] Modules linked in:
[2.753651] CPU: 4 PID: 170 Comm: mount Not tainted 5.12.0-rc1+ #331
[2.754752] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[2.756813] RIP: 0010:scsi_alloc_sgtables+0x253/0x2b0
[2.757699] Code: 85 c0 74 19 41 0f b6 44 24 18 8d 50 e0 83 fa 03 76 30 41 bd 01 00 00 00 e9 1f fe ff ff be 01 00 00 00 45 31 ed e9 19 fe ff ff <0f> 0b b8 0a f
[2.761021] RSP: 0018:b06e0027f538 EFLAGS: 00010246
[2.761902] RAX: RBX: 9c3a42d424d0 RCX: b06e0027f5e0
[2.763184] RDX: 9c3a42d426a8 RSI: RDI: 9c3a42d424d0
[2.764446] RBP: b06e0027f570 R08: R09:
[2.765704] R10: 8eb0dda0 R11: fffb7675 R12: 9c3a42d423c0
[2.766976] R13: R14: 9c3a41bed000 R15: 9c3a420f4000
[2.768225] FS: 7f42d1eab100() GS:9c3b77c0() knlGS:
[2.769666] CS: 0010 DS: ES: CR0: 80050033
[2.770719] CR2: 7f42d1ac1000 CR3: 000104bee006 CR4: 00370ee0
[2.771997] DR0: DR1: DR2:
[2.773288] DR3: DR6: fffe0ff0 DR7: 0400
[2.774543] Call Trace:
[2.775092] scsi_queue_rq+0x9b6/0xb20
[2.775754] __blk_mq_try_issue_directly+0x150/0x1f0
[2.776636] blk_mq_request_issue_directly+0x49/0x80
[2.777616] blk_insert_cloned_request+0x85/0xd0
[2.778470] ufshpb_prep.cold+0x793/0x7be
[2.779179] ufshcd_queuecommand+0x114/0x690
[2.779986] scsi_queue_rq+0x38a/0xb20
[2.780755] blk_mq_dispatch_rq_list+0x13d/0x760
[2.781605] ? dd_dispatch_reques
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-17 23:46, Avri Altman wrote: >> >> >> >> >> >> Just curious, directly doing below things inside ufshpb_rsp_upiu() >> >> >> does >> >> >> not >> >> >> seem a problem to me, does this really deserve a separate work? >> >> > I don't know, I never even consider of doing this. >> >> > The active region list may contain up to few thousands of regions - >> >> > It is not rare to see configurations that covers the entire device. >> >> > >> >> >> >> Yes, true, it can be a huge list. But what does the ops >> >> "HPB_RSP_DEV_RESET" >> >> really mean? The specs says "Device reset HPB Regions information", >> >> but >> >> I >> >> don't know what is really happening. Could you please elaborate? >> > It means that the device informs the host that the L2P cache is no >> > longer valid. >> > The spec doesn't say what to do in that case. >> >> Then it means that all the clean (without DIRTY flag set) HPB entries >> (ppns) >> in active rgns in host memory side may not be valid to the device >> anymore. >> Please correct me if I am wrong. >> >> > We thought that in host mode, it make sense to update all the active >> > regions. >> >> But current logic does not set the state of the sub-regions (in active >> regions) to >> INVALID, it only marks all active regions as UPDATE. >> >> Although one of subsequent read cmds shall put the sub-region back to >> activate_list, >> ufshpb_test_ppn_dirty() can still return false, thus these read cmds >> still think the >> ppns are valid and they shall move forward to send HPB Write Buffer >> (buffer id = 0x2, >> in case of HPB2.0) and HPB Read cmds. >> >> HPB Read cmds with invalid ppns will be treated as normal Read cmds by >> device as the >> specs says, but what would happen to HPB Write Buffer cmds (buffer id >> = >> 0x2, in case >> of HPB2.0) with invalid ppns? Can this be a real problem? > No need to control the ppn dirty / invalid state for this case. 
> The device send device reset so it is aware that all the L2P cache is > invalid. > Any HPB_READ is treated like normal READ10. > > Only once HPB-READ-BUFFER is completed, > the device will relate back to the physical address. What about HPB-WRITE-BUFFER (buffer id = 0x2) cmds? Same. Oper 0x2 is a relative simple case. The device is expected to manage some versioning framework not to be "fooled" by erroneous ppn. There are some more challenging races that the device should meet. But I don't find the handling w.r.t this scenario on HPB2.0 specs - how would the device re-act/respond to HPB-WRITE-BUFFER cmds with invalid HPB entries? Could you please point me to relevant section/paragraph? Thanks, Can Guo. Thanks, Avri Thanks, Can Guo. > >> >> > >> > I think I will go with your suggestion. >> > Effectively, in host mode, since it is deactivating "cold" regions, >> > the lru list is kept relatively small, and contains only "hot" regions. >> >> hmm... I don't really have a idea on this, please go with whatever you >> and Daejun think is fine here. > I will take your advice and remove the worker. > > > Thanks, > Avri > >> >> Thanks, >> Can Guo. >> >> > >> > Thanks, >> > Avri >> > >> >> >> >> Thanks, >> >> Can Guo. >> >> >> >> > But yes, I can do that. >> >> > Better to get ack from Daejun first. >> >> > >> >> > Thanks, >> >> > Avri >> >> > >> >> >> >> >> >> Thanks, >> >> >> Can Guo. >> >> >> >> >> >> > +{ >> >> >> > + struct ufshpb_lu *hpb; >> >> >> > + struct victim_select_info *lru_info; >> >> >> > + struct ufshpb_region *rgn; >> >> >> > + unsigned long flags; >> >> >> > + >> >> >> > + hpb = container_of(work, struct ufshpb_lu, >> ufshpb_lun_reset_work); >> >> >> > + >> >> >> > + lru_info = >lru_info; >> >> >> > + >> >> >> > + spin_lock_irqsave(>rgn_state_lock, flags); >> >> >> > + >> >> >> > + list_for_each_entry(rgn, _info->lh_lru_rgn
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
't say what to do in that case. Then it means that all the clean (without DIRTY flag set) HPB entries (ppns) in active rgns in host memory side may not be valid to the device anymore. Please correct me if I am wrong. > We thought that in host mode, it make sense to update all the active > regions. But current logic does not set the state of the sub-regions (in active regions) to INVALID, it only marks all active regions as UPDATE. Although one of subsequent read cmds shall put the sub-region back to activate_list, ufshpb_test_ppn_dirty() can still return false, thus these read cmds still think the ppns are valid and they shall move forward to send HPB Write Buffer (buffer id = 0x2, in case of HPB2.0) and HPB Read cmds. HPB Read cmds with invalid ppns will be treated as normal Read cmds by device as the specs says, but what would happen to HPB Write Buffer cmds (buffer id = 0x2, in case of HPB2.0) with invalid ppns? Can this be a real problem? No need to control the ppn dirty / invalid state for this case. The device send device reset so it is aware that all the L2P cache is invalid. Any HPB_READ is treated like normal READ10. Only once HPB-READ-BUFFER is completed, the device will relate back to the physical address. What about HPB-WRITE-BUFFER (buffer id = 0x2) cmds? Thanks, Can Guo. > > I think I will go with your suggestion. > Effectively, in host mode, since it is deactivating "cold" regions, > the lru list is kept relatively small, and contains only "hot" regions. hmm... I don't really have a idea on this, please go with whatever you and Daejun think is fine here. I will take your advice and remove the worker. Thanks, Avri Thanks, Can Guo. > > Thanks, > Avri > >> >> Thanks, >> Can Guo. >> >> > But yes, I can do that. >> > Better to get ack from Daejun first. >> > >> > Thanks, >> > Avri >> > >> >> >> >> Thanks, >> >> Can Guo. 
>> >> >> >> > +{ >> >> > + struct ufshpb_lu *hpb; >> >> > + struct victim_select_info *lru_info; >> >> > + struct ufshpb_region *rgn; >> >> > + unsigned long flags; >> >> > + >> >> > + hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); >> >> > + >> >> > + lru_info = >lru_info; >> >> > + >> >> > + spin_lock_irqsave(>rgn_state_lock, flags); >> >> > + >> >> > + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) >> >> > + set_bit(RGN_FLAG_UPDATE, >rgn_flags); >> >> > + >> >> > + spin_unlock_irqrestore(>rgn_state_lock, flags); >> >> > +} >> >> > + >> >> > static void ufshpb_normalization_work_handler(struct work_struct >> >> > *work) >> >> > { >> >> > struct ufshpb_lu *hpb; >> >> > @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct >> >> > ufs_hba *hba, struct ufshpb_lu *hpb) >> >> > } else { >> >> > rgn->rgn_state = HPB_RGN_INACTIVE; >> >> > } >> >> > + >> >> > + rgn->rgn_flags = 0; >> >> > } >> >> > >> >> > return 0; >> >> > @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba >> >> > *hba, struct ufshpb_lu *hpb) >> >> > INIT_LIST_HEAD(>list_hpb_lu); >> >> > >> >> > INIT_WORK(>map_work, ufshpb_map_work_handler); >> >> > - if (hpb->is_hcm) >> >> > + if (hpb->is_hcm) { >> >> > INIT_WORK(>ufshpb_normalization_work, >> >> > ufshpb_normalization_work_handler); >> >> > + INIT_WORK(>ufshpb_lun_reset_work, >> >> > + ufshpb_reset_work_handler); >> >> > + } >> >> > >> >> > hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", >> >> > sizeof(struct ufshpb_req), 0, 0, NULL); >> >> > @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct >> >> > ufshpb_lu *hpb) >> >> > >> >> > static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) >> >> > { >> >> > - if (hpb->is_hcm) >> >> > + if (hpb->is_hcm) { >> >> > + cancel_work_sync(>ufshpb_lun_reset_work); >> >> > cancel_work_sync(>ufshpb_normalization_work); >> >> > + } >> >> > cancel_work_sync(>map_work); >> >> > } >> >> > >> >> > diff --git a/drivers/scsi/ufs/ufshpb.h 
b/drivers/scsi/ufs/ufshpb.h >> >> > index 84598a317897..37c1b0ea0c0a 100644 >> >> > --- a/drivers/scsi/ufs/ufshpb.h >> >> > +++ b/drivers/scsi/ufs/ufshpb.h >> >> > @@ -121,6 +121,7 @@ struct ufshpb_region { >> >> > struct list_head list_lru_rgn; >> >> > unsigned long rgn_flags; >> >> > #define RGN_FLAG_DIRTY 0 >> >> > +#define RGN_FLAG_UPDATE 1 >> >> > >> >> > /* region reads - for host mode */ >> >> > spinlock_t rgn_lock; >> >> > @@ -217,6 +218,7 @@ struct ufshpb_lu { >> >> > /* for selecting victim */ >> >> > struct victim_select_info lru_info; >> >> > struct work_struct ufshpb_normalization_work; >> >> > + struct work_struct ufshpb_lun_reset_work; >> >> > >> >> > /* pinned region information */ >> >> > u32 lu_pinned_start;
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-17 20:22, Avri Altman wrote: On 2021-03-17 19:23, Avri Altman wrote: >> >> On 2021-03-02 21:24, Avri Altman wrote: >> > The spec does not define what is the host's recommended response when >> > the device send hpb dev reset response (oper 0x2). >> > >> > We will update all active hpb regions: mark them and do that on the >> > next >> > read. >> > >> > Signed-off-by: Avri Altman >> > --- >> > drivers/scsi/ufs/ufshpb.c | 47 >> --- >> > drivers/scsi/ufs/ufshpb.h | 2 ++ >> > 2 files changed, 46 insertions(+), 3 deletions(-) >> > >> > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c >> > index 0744feb4d484..0034fa03fdc6 100644 >> > --- a/drivers/scsi/ufs/ufshpb.c >> > +++ b/drivers/scsi/ufs/ufshpb.c >> > @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct >> > ufshcd_lrb *lrbp) >> > if (rgn->reads == ACTIVATION_THRESHOLD) >> > activate = true; >> > spin_unlock_irqrestore(>rgn_lock, flags); >> > - if (activate) { >> > + if (activate || >> > + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { Other than this place, do we also need to clear this bit in places like ufshpb_map_req_compl_fn() and/or ufshpb_cleanup_lru_info()? Otherwise, this flag may be left there even after the rgn is inactivated. 
>> > spin_lock_irqsave(>rsp_list_lock, flags); >> > ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); >> > hpb->stats.rb_active_cnt++; >> > @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, >> > struct ufshcd_lrb *lrbp) >> > case HPB_RSP_DEV_RESET: >> > dev_warn(>sdev_ufs_lu->sdev_dev, >> >"UFS device lost HPB information during PM.\n"); >> > + >> > + if (hpb->is_hcm) { >> > + struct scsi_device *sdev; >> > + >> > + __shost_for_each_device(sdev, hba->host) { >> > + struct ufshpb_lu *h = sdev->hostdata; >> > + >> > + if (!h) >> > + continue; >> > + >> > + schedule_work(>ufshpb_lun_reset_work); >> > + } >> > + } >> > + >> > break; >> > default: >> > dev_notice(>sdev_ufs_lu->sdev_dev, >> > @@ -1594,6 +1609,25 @@ static void >> > ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) >> > spin_unlock_irqrestore(>rsp_list_lock, flags); >> > } >> > >> > +static void ufshpb_reset_work_handler(struct work_struct *work) >> >> Just curious, directly doing below things inside ufshpb_rsp_upiu() >> does >> not >> seem a problem to me, does this really deserve a separate work? > I don't know, I never even consider of doing this. > The active region list may contain up to few thousands of regions - > It is not rare to see configurations that covers the entire device. > Yes, true, it can be a huge list. But what does the ops "HPB_RSP_DEV_RESET" really mean? The specs says "Device reset HPB Regions information", but I don't know what is really happening. Could you please elaborate? It means that the device informs the host that the L2P cache is no longer valid. The spec doesn't say what to do in that case. Then it means that all the clean (without DIRTY flag set) HPB entries (ppns) in active rgns in host memory side may not be valid to the device anymore. Please correct me if I am wrong. We thought that in host mode, it make sense to update all the active regions. 
But current logic does not set the state of the sub-regions (in active regions) to INVALID, it only marks all active regions as UPDATE. Although one of subsequent read cmds shall put the sub-region back to activate_list, ufshpb_test_ppn_dirty() can still return false, thus these read cmds still think the ppns are valid and they shall move forward to send HPB Write Buffer (buffer id = 0x2, in case of HPB2.0) and HPB Read cmds. HPB Read cmds with invalid ppns will be treated as normal Read cmds by device as the specs says, but what would happen to HPB Write Buffer cmds (buffer id = 0x2, in case of HPB2.0) with invalid ppns? Can this be a real problem? I think I will go with your s
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-17 19:23, Avri Altman wrote: On 2021-03-02 21:24, Avri Altman wrote: > The spec does not define what is the host's recommended response when > the device send hpb dev reset response (oper 0x2). > > We will update all active hpb regions: mark them and do that on the > next > read. > > Signed-off-by: Avri Altman > --- > drivers/scsi/ufs/ufshpb.c | 47 --- > drivers/scsi/ufs/ufshpb.h | 2 ++ > 2 files changed, 46 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c > index 0744feb4d484..0034fa03fdc6 100644 > --- a/drivers/scsi/ufs/ufshpb.c > +++ b/drivers/scsi/ufs/ufshpb.c > @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp) > if (rgn->reads == ACTIVATION_THRESHOLD) > activate = true; > spin_unlock_irqrestore(>rgn_lock, flags); > - if (activate) { > + if (activate || > + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { > spin_lock_irqsave(>rsp_list_lock, flags); > ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); > hpb->stats.rb_active_cnt++; > @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, > struct ufshcd_lrb *lrbp) > case HPB_RSP_DEV_RESET: > dev_warn(>sdev_ufs_lu->sdev_dev, >"UFS device lost HPB information during PM.\n"); > + > + if (hpb->is_hcm) { > + struct scsi_device *sdev; > + > + __shost_for_each_device(sdev, hba->host) { > + struct ufshpb_lu *h = sdev->hostdata; > + > + if (!h) > + continue; > + > + schedule_work(>ufshpb_lun_reset_work); > + } > + } > + > break; > default: > dev_notice(>sdev_ufs_lu->sdev_dev, > @@ -1594,6 +1609,25 @@ static void > ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) > spin_unlock_irqrestore(>rsp_list_lock, flags); > } > > +static void ufshpb_reset_work_handler(struct work_struct *work) Just curious, directly doing below things inside ufshpb_rsp_upiu() does not seem a problem to me, does this really deserve a separate work? I don't know, I never even consider of doing this. 
The active region list may contain up to few thousands of regions - It is not rare to see configurations that covers the entire device. Yes, true, it can be a huge list. But what does the ops "HPB_RSP_DEV_RESET" really mean? The specs says "Device reset HPB Regions information", but I don't know what is really happening. Could you please elaborate? Thanks, Can Guo. But yes, I can do that. Better to get ack from Daejun first. Thanks, Avri Thanks, Can Guo. > +{ > + struct ufshpb_lu *hpb; > + struct victim_select_info *lru_info; > + struct ufshpb_region *rgn; > + unsigned long flags; > + > + hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); > + > + lru_info = >lru_info; > + > + spin_lock_irqsave(>rgn_state_lock, flags); > + > + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) > + set_bit(RGN_FLAG_UPDATE, >rgn_flags); > + > + spin_unlock_irqrestore(>rgn_state_lock, flags); > +} > + > static void ufshpb_normalization_work_handler(struct work_struct > *work) > { > struct ufshpb_lu *hpb; > @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct > ufs_hba *hba, struct ufshpb_lu *hpb) > } else { > rgn->rgn_state = HPB_RGN_INACTIVE; > } > + > + rgn->rgn_flags = 0; > } > > return 0; > @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba > *hba, struct ufshpb_lu *hpb) > INIT_LIST_HEAD(>list_hpb_lu); > > INIT_WORK(>map_work, ufshpb_map_work_handler); > - if (hpb->is_hcm) > + if (hpb->is_hcm) { > INIT_WORK(>ufshpb_normalization_work, > ufshpb_normalization_work_handler); > + INIT_WORK(>ufshpb_lun_reset_work, > + ufshpb_reset_work_handler); > + } > > hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", > sizeof(struct ufshpb_req), 0, 0, NULL); > @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct > ufshpb_lu *hpb) > > static void ufshpb
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-02 21:24, Avri Altman wrote:
> The spec does not define what is the host's recommended response when
> the device send hpb dev reset response (oper 0x2).
>
> We will update all active hpb regions: mark them and do that on the
> next read.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 47 ---
>  drivers/scsi/ufs/ufshpb.h | 2 ++
>  2 files changed, 46 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 0744feb4d484..0034fa03fdc6 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>  		if (rgn->reads == ACTIVATION_THRESHOLD)
>  			activate = true;
>  		spin_unlock_irqrestore(&rgn->rgn_lock, flags);
> -		if (activate) {
> +		if (activate ||
> +		    test_and_clear_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags)) {
>  			spin_lock_irqsave(&hpb->rsp_list_lock, flags);
>  			ufshpb_update_active_info(hpb, rgn_idx, srgn_idx);
>  			hpb->stats.rb_active_cnt++;
> @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>  	case HPB_RSP_DEV_RESET:
>  		dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
>  			 "UFS device lost HPB information during PM.\n");
> +
> +		if (hpb->is_hcm) {
> +			struct scsi_device *sdev;
> +
> +			__shost_for_each_device(sdev, hba->host) {
> +				struct ufshpb_lu *h = sdev->hostdata;
> +
> +				if (!h)
> +					continue;
> +
> +				schedule_work(&h->ufshpb_lun_reset_work);
> +			}
> +		}
> +
>  		break;
>  	default:
>  		dev_notice(&hpb->sdev_ufs_lu->sdev_dev,
> @@ -1594,6 +1609,25 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb)
>  	spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
>  }
>
> +static void ufshpb_reset_work_handler(struct work_struct *work)

Just curious, directly doing below things inside ufshpb_rsp_upiu() does
not seem a problem to me, does this really deserve a separate work?

Thanks,
Can Guo.

> +{
> +	struct ufshpb_lu *hpb;
> +	struct victim_select_info *lru_info;
> +	struct ufshpb_region *rgn;
> +	unsigned long flags;
> +
> +	hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work);
> +
> +	lru_info = &hpb->lru_info;
> +
> +	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> +
> +	list_for_each_entry(rgn, &lru_info->lh_lru_rgn, list_lru_rgn)
> +		set_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags);
> +
> +	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> +}
> +
>  static void ufshpb_normalization_work_handler(struct work_struct *work)
>  {
>  	struct ufshpb_lu *hpb;
> @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>  		} else {
>  			rgn->rgn_state = HPB_RGN_INACTIVE;
>  		}
> +
> +		rgn->rgn_flags = 0;
>  	}
>
>  	return 0;
> @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>  	INIT_LIST_HEAD(&hpb->list_hpb_lu);
>
>  	INIT_WORK(&hpb->map_work, ufshpb_map_work_handler);
> -	if (hpb->is_hcm)
> +	if (hpb->is_hcm) {
>  		INIT_WORK(&hpb->ufshpb_normalization_work,
>  			  ufshpb_normalization_work_handler);
> +		INIT_WORK(&hpb->ufshpb_lun_reset_work,
> +			  ufshpb_reset_work_handler);
> +	}
>
>  	hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache",
>  			  sizeof(struct ufshpb_req), 0, 0, NULL);
> @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb)
>
>  static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb)
>  {
> -	if (hpb->is_hcm)
> +	if (hpb->is_hcm) {
> +		cancel_work_sync(&hpb->ufshpb_lun_reset_work);
>  		cancel_work_sync(&hpb->ufshpb_normalization_work);
> +	}
>  	cancel_work_sync(&hpb->map_work);
>  }
>
> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> index 84598a317897..37c1b0ea0c0a 100644
> --- a/drivers/scsi/ufs/ufshpb.h
> +++ b/drivers/scsi/ufs/ufshpb.h
> @@ -121,6 +121,7 @@ struct ufshpb_region {
>  	struct list_head list_lru_rgn;
>  	unsigned long rgn_flags;
>  #define RGN_FLAG_DIRTY 0
> +#define RGN_FLAG_UPDATE 1
>
>  	/* region reads - for host mode */
>  	spinlock_t rgn_lock;
> @@ -217,6 +218,7 @@ struct ufshpb_lu {
>  	/* for selecting victim */
>  	struct victim_select_info lru_info;
>  	struct work_struct ufshpb_normalization_work;
> +	struct work_struct ufshpb_lun_reset_work;
>
>  	/* pinned region information */
>  	u32 lu_pinned_start;
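
Archive note: the mark-then-refresh idea in this patch (flag every active region on a device reset, then consume the flag on the next read) can be tried out in plain userspace C. This is only a rough sketch, not the driver code. `struct region`, `mark_all_for_update()`, `test_and_clear()` and `needs_activation()` are hypothetical names, a plain array stands in for the LRU list, and no locking or atomicity is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions mirror the patch; everything else here is hypothetical. */
#define RGN_FLAG_DIRTY  0
#define RGN_FLAG_UPDATE 1

struct region {
	unsigned long flags;	/* bit field, like rgn_flags in the patch */
	unsigned int reads;	/* per-region read counter */
};

/* Mark every region in the array so that its next read refreshes it. */
static void mark_all_for_update(struct region *rgns, size_t n)
{
	assert(rgns != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		rgns[i].flags |= 1UL << RGN_FLAG_UPDATE;
}

/* Non-atomic stand-in for test_and_clear_bit(): return old bit, clear it. */
static bool test_and_clear(unsigned long *flags, int bit)
{
	bool was_set = (*flags >> bit) & 1UL;

	*flags &= ~(1UL << bit);
	return was_set;
}

/*
 * Account one read. The region is (re)activated when the read counter
 * hits the threshold or when a pending UPDATE mark is found. As in the
 * patch, the || short-circuits, so the mark is left in place when the
 * threshold alone already triggered activation.
 */
static bool needs_activation(struct region *rgn, unsigned int threshold)
{
	assert(rgn != NULL);
	rgn->reads++;
	return rgn->reads == threshold ||
	       test_and_clear(&rgn->flags, RGN_FLAG_UPDATE);
}
```

In this model a region that is marked but never read again keeps its UPDATE bit indefinitely, since only a read clears it.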
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-17 13:19, Daejun Park wrote: >> --- >> drivers/scsi/ufs/ufshpb.c | 14 ++ >> drivers/scsi/ufs/ufshpb.h | 1 + >> 2 files changed, 15 insertions(+) >> >> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c >> index 6f4fd22eaf2f..0744feb4d484 100644 >> --- a/drivers/scsi/ufs/ufshpb.c >> +++ b/drivers/scsi/ufs/ufshpb.c >> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct >> ufshpb_lu *hpb, >> >> blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn); >> >> +hpb->stats.umap_req_cnt++; >> return 0; >> } >> >> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct >> ufshpb_lu *hpb, >> return -EAGAIN; >> } >> >> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb, >> +struct ufshpb_region *rgn) >> +{ >> +return ufshpb_issue_umap_req(hpb, rgn); >> +} >> + >> static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb) >> { >> return ufshpb_issue_umap_req(hpb, NULL); >> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct >> ufshpb_lu *hpb, >> struct ufshpb_subregion *srgn; >> int srgn_idx; >> >> + >> +if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn)) > > __ufshpb_evict_region() is called with rgn_state_lock held and IRQ > disabled, > when ufshpb_issue_umap_single_req() invokes blk_execute_rq_nowait(), > below > warning shall pop up every time, fix it? > > void blk_execute_rq_nowait(struct request_queue *q, struct gendisk > *bd_disk, > struct request *rq, int at_head, > rq_end_io_fn *done) > { > WARN_ON(irqs_disabled()); > ... > Moreover, since we are here with rgn_state_lock held and IRQ disabled, in ufshpb_get_req(), rq = kmem_cache_alloc(hpb->map_req_cache, GFP_KERNEL) has the GFP_KERNEL flag, scheduling while atomic??? I think your comment applies to ufshpb_issue_umap_all_req as well, Which is called from slave_configure/scsi_add_lun. Since the host-mode series is utilizing the framework laid by the device-mode, Maybe you can add this comment to Daejun's last version? 
Hi Avri, Can Guo I think ufshpb_issue_umap_single_req() can be moved to end of ufshpb_evict_region(). Then we can avoid rgn_state_lock when it sends unmap command. I am not the expert here, please you two fix it. I am just reporting what can be wrong. Anyways, ufshpb_issue_umap_single_req() should not be called with rgn_state_lock held - think about below (another deadly) scenario. lock(rgn_state_lock) ufshpb_issue_umap_single_req() ufshpb_prep() lock(rgn_state_lock) <-- recursive spin_lock BTW, @Daejun shouldn't we stop passthrough cmds from stepping into ufshpb_prep()? In current code, you are trying to use below check to block cmds other than write/discard/read, but a passthrough cmd can not be blocked by the check. if (!ufshpb_is_write_or_discard_cmd(cmd) && !ufshpb_is_read_cmd(cmd) ) return 0; I found this problem too. I fixed it and submit next patch. You mean in V30, which has not been uploaded yet, right? Thanks, Can Guo. if (blk_rq_is_scsi(cmd->request) || (!ufshpb_is_write_or_discard_cmd(cmd) && !ufshpb_is_read_cmd(cmd))) return 0; Thanks, Daejun Thanks, Can Guo. Thanks, Daejun Thanks, Avri Can Guo. > Thanks. > Can Guo. 
> >> +return; >> + >> lru_info = >lru_info; >> >> dev_dbg(>sdev_ufs_lu->sdev_dev, "evict region %d\n", >> rgn->rgn_idx); >> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt); >> ufshpb_sysfs_attr_show_func(rb_active_cnt); >> ufshpb_sysfs_attr_show_func(rb_inactive_cnt); >> ufshpb_sysfs_attr_show_func(map_req_cnt); >> +ufshpb_sysfs_attr_show_func(umap_req_cnt); >> >> static struct attribute *hpb_dev_stat_attrs[] = { >> _attr_hit_cnt.attr, >> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = >> { >> _attr_rb_active_cnt.attr, >> _attr_rb_inactive_cnt.attr, >> _attr_map_req_cnt.attr, >> +_attr_umap_req_cnt.attr, >> NULL, >> }; >> >> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu >> *hpb) >> hpb->stats.rb_active_cnt = 0; >> hpb->stats.rb_inactive_cnt = 0; >> hpb->stats.map_req_cnt = 0; >> +hpb->stats.umap_req_cnt = 0; >> } >> >> static void ufshpb_param_init(struct ufshpb_lu *hpb) >> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h >> index bd4308010466..84598a317897 100644 >> --- a/drivers/scsi/ufs/ufshpb.h >> +++ b/drivers/scsi/ufs/ufshpb.h >> @@ -186,6 +186,7 @@ struct ufshpb_stats { >> u64 rb_inactive_cnt; >> u64 map_req_cnt; >> u64 pre_req_cnt; >> +u64 umap_req_cnt; >> }; >> >> struct ufshpb_lu {
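
Archive note: the filter Daejun posts in this message can be exercised in isolation with a small userspace model. A minimal sketch, assuming a hypothetical `struct cmd` whose boolean fields stand in for the results of `blk_rq_is_scsi()`, `ufshpb_is_write_or_discard_cmd()` and `ufshpb_is_read_cmd()`. `hpb_should_handle()` is an invented name; it returns true where `ufshpb_prep()` would continue past the early `return 0`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the properties the driver reads off a command. */
struct cmd {
	bool is_passthrough;		/* models blk_rq_is_scsi() */
	bool is_write_or_discard;	/* models ufshpb_is_write_or_discard_cmd() */
	bool is_read;			/* models ufshpb_is_read_cmd() */
};

/*
 * Same boolean structure as the posted check: reject passthrough
 * commands first, then anything that is neither write/discard nor read.
 */
static bool hpb_should_handle(const struct cmd *c)
{
	assert(c != NULL);
	if (c->is_passthrough ||
	    (!c->is_write_or_discard && !c->is_read))
		return false;
	return true;
}
```

With the older check (without the passthrough term), a passthrough command whose opcode happens to be a read would have been let through, which is the case Can Guo points out.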
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-17 10:28, Daejun Park wrote: >> --- >> drivers/scsi/ufs/ufshpb.c | 14 ++ >> drivers/scsi/ufs/ufshpb.h | 1 + >> 2 files changed, 15 insertions(+) >> >> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c >> index 6f4fd22eaf2f..0744feb4d484 100644 >> --- a/drivers/scsi/ufs/ufshpb.c >> +++ b/drivers/scsi/ufs/ufshpb.c >> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct >> ufshpb_lu *hpb, >> >> blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn); >> >> +hpb->stats.umap_req_cnt++; >> return 0; >> } >> >> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct >> ufshpb_lu *hpb, >> return -EAGAIN; >> } >> >> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb, >> +struct ufshpb_region *rgn) >> +{ >> +return ufshpb_issue_umap_req(hpb, rgn); >> +} >> + >> static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb) >> { >> return ufshpb_issue_umap_req(hpb, NULL); >> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct >> ufshpb_lu *hpb, >> struct ufshpb_subregion *srgn; >> int srgn_idx; >> >> + >> +if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn)) > > __ufshpb_evict_region() is called with rgn_state_lock held and IRQ > disabled, > when ufshpb_issue_umap_single_req() invokes blk_execute_rq_nowait(), > below > warning shall pop up every time, fix it? > > void blk_execute_rq_nowait(struct request_queue *q, struct gendisk > *bd_disk, > struct request *rq, int at_head, > rq_end_io_fn *done) > { > WARN_ON(irqs_disabled()); > ... > Moreover, since we are here with rgn_state_lock held and IRQ disabled, in ufshpb_get_req(), rq = kmem_cache_alloc(hpb->map_req_cache, GFP_KERNEL) has the GFP_KERNEL flag, scheduling while atomic??? I think your comment applies to ufshpb_issue_umap_all_req as well, Which is called from slave_configure/scsi_add_lun. Since the host-mode series is utilizing the framework laid by the device-mode, Maybe you can add this comment to Daejun's last version? 
Hi Avri, Can Guo I think ufshpb_issue_umap_single_req() can be moved to end of ufshpb_evict_region(). Then we can avoid rgn_state_lock when it sends unmap command. I am not the expert here, please you two fix it. I am just reporting what can be wrong. Anyways, ufshpb_issue_umap_single_req() should not be called with rgn_state_lock held - think about below (another deadly) scenario. lock(rgn_state_lock) ufshpb_issue_umap_single_req() ufshpb_prep() lock(rgn_state_lock) <-- recursive spin_lock BTW, @Daejun shouldn't we stop passthrough cmds from stepping into ufshpb_prep()? In current code, you are trying to use below check to block cmds other than write/discard/read, but a passthrough cmd can not be blocked by the check. if (!ufshpb_is_write_or_discard_cmd(cmd) && !ufshpb_is_read_cmd(cmd)) return 0; Thanks, Can Guo. Thanks, Daejun Thanks, Avri Can Guo. > Thanks. > Can Guo. > >> +return; >> + >> lru_info = >lru_info; >> >> dev_dbg(>sdev_ufs_lu->sdev_dev, "evict region %d\n", >> rgn->rgn_idx); >> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt); >> ufshpb_sysfs_attr_show_func(rb_active_cnt); >> ufshpb_sysfs_attr_show_func(rb_inactive_cnt); >> ufshpb_sysfs_attr_show_func(map_req_cnt); >> +ufshpb_sysfs_attr_show_func(umap_req_cnt); >> >> static struct attribute *hpb_dev_stat_attrs[] = { >> _attr_hit_cnt.attr, >> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = >> { >> _attr_rb_active_cnt.attr, >> _attr_rb_inactive_cnt.attr, >> _attr_map_req_cnt.attr, >> +_attr_umap_req_cnt.attr, >> NULL, >> }; >> >> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu >> *hpb) >> hpb->stats.rb_active_cnt = 0; >> hpb->stats.rb_inactive_cnt = 0; >> hpb->stats.map_req_cnt = 0; >> +hpb->stats.umap_req_cnt = 0; >> } >> >> static void ufshpb_param_init(struct ufshpb_lu *hpb) >> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h >> index bd4308010466..84598a317897 100644 >> --- a/drivers/scsi/ufs/ufshpb.h >> +++ 
b/drivers/scsi/ufs/ufshpb.h >> @@ -186,6 +186,7 @@ struct ufshpb_stats { >> u64 rb_inactive_cnt; >> u64 map_req_cnt; >> u64 pre_req_cnt; >> +u64 umap_req_cnt; >> }; >> >> struct ufshpb_lu {
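Daejun's suggestion in this message (send the unmap only after rgn_state_lock has been dropped) can be sketched in plain userspace C. This is only a model under stated assumptions: `struct lu`, `struct region`, `state_lock()`, `state_unlock()`, `issue_umap_single()` and `evict_region()` are hypothetical stand-ins, the "lock" is a flag rather than a spinlock, and the assert plays the role of the WARN_ON(irqs_disabled()) quoted above. Whether the unmap should precede or follow the LRU update is not settled in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real ufshpb types. */
struct region {
    int idx;
    bool active;
};

struct lu {
    bool state_lock_held;   /* models rgn_state_lock being held */
    int umap_issued;        /* models stats.umap_req_cnt */
};

static void state_lock(struct lu *lu)   { lu->state_lock_held = true; }
static void state_unlock(struct lu *lu) { lu->state_lock_held = false; }

/* Models a request submission that may sleep, so it must run unlocked. */
static int issue_umap_single(struct lu *lu, struct region *rgn)
{
    (void)rgn;
    assert(!lu->state_lock_held);
    lu->umap_issued++;
    return 0;
}

static int evict_region(struct lu *lu, struct region *rgn)
{
    bool need_umap = false;

    state_lock(lu);
    if (rgn->active) {
        rgn->active = false;    /* state and list updates stay under the lock */
        need_umap = true;
    }
    state_unlock(lu);

    /* The potentially sleeping request is sent only after unlocking. */
    if (need_umap)
        return issue_umap_single(lu, rgn);
    return 0;
}
```

Because the decision is recorded in a local variable while the lock is held, the submission path never re-enters code that takes the same lock, which also avoids the recursive spin_lock scenario described above.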
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-17 09:42, Daejun Park wrote: On 2021-03-15 15:23, Can Guo wrote: On 2021-03-15 15:07, Daejun Park wrote: This patch supports the HPB 2.0. The HPB 2.0 supports read of varying sizes from 4KB to 512KB. In the case of Read (<= 32KB) is supported as single HPB read. In the case of Read (36KB ~ 512KB) is supported by as a combination of write buffer command and HPB read command to deliver more PPN. The write buffer commands may not be issued immediately due to busy tags. To use HPB read more aggressively, the driver can requeue the write buffer command. The requeue threshold is implemented as timeout and can be modified with requeue_timeout_ms entry in sysfs. Signed-off-by: Daejun Park --- +static struct attribute *hpb_dev_param_attrs[] = { +_attr_requeue_timeout_ms.attr, +NULL, +}; + +struct attribute_group ufs_sysfs_hpb_param_group = { +.name = "hpb_param_sysfs", +.attrs = hpb_dev_param_attrs, +}; + +static int ufshpb_pre_req_mempool_init(struct ufshpb_lu *hpb) +{ +struct ufshpb_req *pre_req = NULL; +int qd = hpb->sdev_ufs_lu->queue_depth / 2; +int i, j; + +INIT_LIST_HEAD(>lh_pre_req_free); + +hpb->pre_req = kcalloc(qd, sizeof(struct ufshpb_req), GFP_KERNEL); +hpb->throttle_pre_req = qd; +hpb->num_inflight_pre_req = 0; + +if (!hpb->pre_req) +goto release_mem; + +for (i = 0; i < qd; i++) { +pre_req = hpb->pre_req + i; +INIT_LIST_HEAD(_req->list_req); +pre_req->req = NULL; +pre_req->bio = NULL; Why don't prepare bio as same as wb.m_page? Won't that save more time for ufshpb_issue_pre_req()? It is pre_req pool. So although we prepare bio at this time, it just only for first pre_req. I meant removing the bio_alloc() in ufshpb_issue_pre_req() and bio_put() in ufshpb_pre_req_compl_fn(). bios, in pre_req's case, just hold a page. So, prepare 16 (if queue depth is 32) bios here, just use them along with wb.m_page and call bio_reset() in ufshpb_pre_req_compl_fn(). Shall it work? If it works, you can even have the bio_add_pc_page() called here. 
Later in ufshpb_execute_pre_req(), you don't need to call ufshpb_pre_req_add_bio_page(), just call ufshpb_prep_entry() once instead - it save many repeated steps for a pre_req, and you don't even need to call bio_reset() in this case, since for a bio, nothing changes after it is binded with a specific page... Hi, Can Guo I tried the idea that you suggested, but it doesn't work properly. This optimization should be done next time for enhancement. Can you elaborate please? Any error seen? Per my understanding, in the case for pre_reqs, a bio is no different from a page. Here it can reserve 16 pages for later use, which can be done the same for bios. This is not an enhancement, but a doubt - why not? Unless it is not doable. Thanks, Can Guo. Thanks Daejun Can Guo. Thanks, Can Guo. After use it, it should be prepared bio at issue phase. Thanks, Daejun Thanks, Can Guo. + +pre_req->wb.m_page = alloc_page(GFP_KERNEL | __GFP_ZERO); +if (!pre_req->wb.m_page) { +for (j = 0; j < i; j++) + __free_page(hpb->pre_req[j].wb.m_page); + +goto release_mem; +} +list_add_tail(_req->list_req, >lh_pre_req_free); +} + +return 0; +release_mem: +kfree(hpb->pre_req); +return -ENOMEM; +} +
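The pre-allocation idea discussed in this message can be modelled in userspace C as a pool in which every slot owns both its page and a long-lived descriptor from init time, with full unwinding on failure. This is a sketch of the ownership pattern only: `struct pool`, `pool_init()`, `pool_destroy()` and `budget_alloc()` are hypothetical names, the "bio" is just an opaque heap buffer, and nothing here shows whether the block layer allows a bio to be reused this way (Daejun reports above that his attempt did not work).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096
#define BIO_SZ  64      /* arbitrary size for the opaque descriptor */

struct pre_req {
    void *page;         /* models wb.m_page */
    void *bio;          /* models a bio prepared once at init time */
};

struct pool {
    struct pre_req *slots;
    int nr;
};

/* Failure injection for the sketch: a negative budget means unlimited. */
static int alloc_budget = -1;

static void *budget_alloc(size_t sz)
{
    if (alloc_budget == 0)
        return NULL;
    if (alloc_budget > 0)
        alloc_budget--;
    return malloc(sz);
}

static int pool_init(struct pool *p, int qd)
{
    int i;

    assert(qd > 0);
    p->nr = 0;
    p->slots = calloc((size_t)qd, sizeof(*p->slots));
    if (!p->slots)
        return -1;

    for (i = 0; i < qd; i++) {
        p->slots[i].page = budget_alloc(PAGE_SZ);
        p->slots[i].bio = budget_alloc(BIO_SZ);
        if (!p->slots[i].page || !p->slots[i].bio)
            goto unwind;
    }
    p->nr = qd;
    return 0;

unwind:
    /* Release the partially filled slot and every slot before it. */
    for (; i >= 0; i--) {
        free(p->slots[i].page);
        free(p->slots[i].bio);
    }
    free(p->slots);
    p->slots = NULL;
    return -1;
}

static void pool_destroy(struct pool *p)
{
    int i;

    for (i = 0; i < p->nr; i++) {
        free(p->slots[i].page);
        free(p->slots[i].bio);
    }
    free(p->slots);
    p->slots = NULL;
    p->nr = 0;
}
```

With a queue depth of 32 the pool would hold 16 slots, matching the "prepare 16 bios" figure mentioned in the discussion.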
Re: [PATCH v5 07/10] scsi: ufshpb: Add "Cold" regions timer
On 2021-03-16 17:21, Avri Altman wrote:
>>> +static void ufshpb_read_to_handler(struct work_struct *work)
>>> +{
>>> +	struct delayed_work *dwork = to_delayed_work(work);
>>> +	struct ufshpb_lu *hpb;
>>> +	struct victim_select_info *lru_info;
>>> +	struct ufshpb_region *rgn;
>>> +	unsigned long flags;
>>> +	LIST_HEAD(expired_list);
>>> +
>>> +	hpb = container_of(dwork, struct ufshpb_lu, ufshpb_read_to_work);
>>> +
>>> +	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
>>> +
>>> +	lru_info = &hpb->lru_info;
>>> +
>>> +	list_for_each_entry(rgn, &lru_info->lh_lru_rgn, list_lru_rgn) {
>>> +		bool timedout = ktime_after(ktime_get(), rgn->read_timeout);
>>> +
>>> +		if (timedout) {
>>> +			rgn->read_timeout_expiries--;
>>> +			if (is_rgn_dirty(rgn) ||
>>> +			    rgn->read_timeout_expiries == 0)
>>> +				list_add(&rgn->list_expired_rgn, &expired_list);
>>> +			else
>>> +				rgn->read_timeout = ktime_add_ms(ktime_get(),
>>> +								 READ_TO_MS);
>>> +		}
>>> +	}
>>> +
>>> +	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
>>> +
>>> +	list_for_each_entry(rgn, &expired_list, list_expired_rgn) {
>>
>> Here can be problematic - since you don't have the native expired_list
>> initialized before use, if above loop did not insert anything to
>> expired_list, it shall become a dead loop here.
>
> Not sure what you meant by native initialization.
> LIST_HEAD is statically initializing an empty list, resulting in the
> same outcome as INIT_LIST_HEAD.

Sorry for confusing you. You should use list_for_each_entry_safe()
instead of list_for_each_entry() as you are deleting entries within the
loop; otherwise, this can become an infinite loop. Again, have you
tested this patch before upload? I am sure this is problematic - when it
becomes an infinite loop, the path below will hang...

ufshcd_suspend()->ufshpb_suspend()->cancel_jobs()->cancel_delayed_work()

>> And, which lock is protecting rgn->list_expired_rgn? If two
>> read_to_handler works are running in parallel, one can be inserting it
>> to its expired_list while another can be deleting it.
>
> The timeout handler, being a delayed work, is meant to run every
> polling period. Originally, I had it protected from 2 handlers running
> concurrently, but I removed it following Daejun's comment, which I
> accepted, since it is always scheduled using the same polling period.

But one can set the delay to 0 through sysfs, right?

Thanks,
Can Guo.

> Thanks,
> Avri
>
>> Can Guo.
>>
>>> +		list_del_init(&rgn->list_expired_rgn);
>>> +		spin_lock_irqsave(&hpb->rsp_list_lock, flags);
>>> +		ufshpb_update_inactive_info(hpb, rgn->rgn_idx);
>>> +		hpb->stats.rb_inactive_cnt++;
>>> +		spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
>>> +	}
>>> +
>>> +	ufshpb_kick_map_work(hpb);
>>> +
>>> +	schedule_delayed_work(&hpb->ufshpb_read_to_work,
>>> +			      msecs_to_jiffies(POLLING_INTERVAL_MS));
>>> +}
>>> +
>>>  static void ufshpb_add_lru_info(struct victim_select_info *lru_info,
>>>  				struct ufshpb_region *rgn)
>>>  {
>>>  	rgn->rgn_state = HPB_RGN_ACTIVE;
>>>  	list_add_tail(&rgn->list_lru_rgn, &lru_info->lh_lru_rgn);
>>>  	atomic_inc(&lru_info->active_cnt);
>>> +	if (rgn->hpb->is_hcm) {
>>> +		rgn->read_timeout = ktime_add_ms(ktime_get(), READ_TO_MS);
>>> +		rgn->read_timeout_expiries = READ_TO_EXPIRIES;
>>> +	}
>>>  }
>>>
>>>  static void ufshpb_hit_lru_info(struct victim_select_info *lru_info,
>>> @@ -1813,6 +1865,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>>>
>>>  		INIT_LIST_HEAD(&rgn->list_inact_rgn);
>>>  		INIT_LIST_HEAD(&rgn->list_lru_rgn);
>>> +		INIT_LIST_HEAD(&rgn->list_expired_rgn);
>>>
>>>  		if (rgn_idx == hpb->rgns_per_lu - 1) {
>>>  			srgn_cnt = ((hpb->srgns_per_lu - 1) %
>>> @@ -1834,6 +1887,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>>>  		}
>>>
>>>  		rgn->rgn_flags = 0;
>>> +		rgn->hpb = hpb;
>>>  	}
>>>
>>>  	return 0;
>>> @@ -2053,6 +2107,8 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>>>  			  ufshpb_normalization_work_handler);
>>>  		INIT_WORK(&hpb->ufshpb_lun_reset_work,
>>>  			  ufshpb_reset_work_handler);
>>> +		INIT_DELAYED_WORK(&hpb->ufshpb_read_to_work,
>>> +				  ufshpb_read_to_handler);
>>>  	}
>>>
>>>  	hpb->map_req_cache = kmem_cache_create(
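The iteration problem raised in this message can be reproduced with a minimal userspace list. The sketch below is an assumption-laden model, not kernel code: `list_init()`, `list_add_tail()` and `list_del_init()` are simplified re-implementations with the same pointer behaviour as the kernel helpers, and `drain_unsafe()` and `drain_safe()` are hypothetical names. The unsafe variant takes a step limit so that the non-terminating behaviour can be observed without hanging.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink n and make it point at itself, like the kernel's list_del_init(). */
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Plain iteration: the cursor advances through the node that was just
 * re-initialised, so it never leaves that node. */
static int drain_unsafe(struct list_head *head, int max_steps)
{
    struct list_head *pos;
    int steps = 0;

    for (pos = head->next; pos != head && steps < max_steps; pos = pos->next) {
        list_del_init(pos);     /* pos->next now points back at pos */
        steps++;
    }
    return steps;
}

/* Safe iteration: remember the successor before unlinking the current node. */
static int drain_safe(struct list_head *head)
{
    struct list_head *pos, *tmp;
    int steps = 0;

    for (pos = head->next, tmp = pos->next; pos != head;
         pos = tmp, tmp = pos->next) {
        list_del_init(pos);
        steps++;
    }
    assert(head->next == head); /* the list is empty once the loop ends */
    return steps;
}
```

An empty list is not a problem for either variant, which matches Avri's point about LIST_HEAD; the hang comes only from deleting the current entry while advancing through it.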
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-16 16:30, Avri Altman wrote: >> --- >> drivers/scsi/ufs/ufshpb.c | 14 ++ >> drivers/scsi/ufs/ufshpb.h | 1 + >> 2 files changed, 15 insertions(+) >> >> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c >> index 6f4fd22eaf2f..0744feb4d484 100644 >> --- a/drivers/scsi/ufs/ufshpb.c >> +++ b/drivers/scsi/ufs/ufshpb.c >> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct >> ufshpb_lu *hpb, >> >> blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn); >> >> +hpb->stats.umap_req_cnt++; >> return 0; >> } >> >> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct >> ufshpb_lu *hpb, >> return -EAGAIN; >> } >> >> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb, >> +struct ufshpb_region *rgn) >> +{ >> +return ufshpb_issue_umap_req(hpb, rgn); >> +} >> + >> static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb) >> { >> return ufshpb_issue_umap_req(hpb, NULL); >> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct >> ufshpb_lu *hpb, >> struct ufshpb_subregion *srgn; >> int srgn_idx; >> >> + >> +if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn)) > > __ufshpb_evict_region() is called with rgn_state_lock held and IRQ > disabled, > when ufshpb_issue_umap_single_req() invokes blk_execute_rq_nowait(), > below > warning shall pop up every time, fix it? > > void blk_execute_rq_nowait(struct request_queue *q, struct gendisk > *bd_disk, > struct request *rq, int at_head, > rq_end_io_fn *done) > { > WARN_ON(irqs_disabled()); > ... > Moreover, since we are here with rgn_state_lock held and IRQ disabled, in ufshpb_get_req(), rq = kmem_cache_alloc(hpb->map_req_cache, GFP_KERNEL) has the GFP_KERNEL flag, scheduling while atomic??? I think your comment applies to ufshpb_issue_umap_all_req as well, Which is called from slave_configure/scsi_add_lun. ufshpb_issue_umap_all_req() is not called from atomic contexts, so ufshpb_issue_umap_all_req() is fine. Thanks, Can Guo. 
Since the host-mode series is utilizing the framework laid by the device-mode, Maybe you can add this comment to Daejun's last version? Thanks, Avri Can Guo. > Thanks. > Can Guo. > >> +return; >> + >> lru_info = >lru_info; >> >> dev_dbg(>sdev_ufs_lu->sdev_dev, "evict region %d\n", >> rgn->rgn_idx); >> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt); >> ufshpb_sysfs_attr_show_func(rb_active_cnt); >> ufshpb_sysfs_attr_show_func(rb_inactive_cnt); >> ufshpb_sysfs_attr_show_func(map_req_cnt); >> +ufshpb_sysfs_attr_show_func(umap_req_cnt); >> >> static struct attribute *hpb_dev_stat_attrs[] = { >> _attr_hit_cnt.attr, >> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = >> { >> _attr_rb_active_cnt.attr, >> _attr_rb_inactive_cnt.attr, >> _attr_map_req_cnt.attr, >> +_attr_umap_req_cnt.attr, >> NULL, >> }; >> >> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu >> *hpb) >> hpb->stats.rb_active_cnt = 0; >> hpb->stats.rb_inactive_cnt = 0; >> hpb->stats.map_req_cnt = 0; >> +hpb->stats.umap_req_cnt = 0; >> } >> >> static void ufshpb_param_init(struct ufshpb_lu *hpb) >> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h >> index bd4308010466..84598a317897 100644 >> --- a/drivers/scsi/ufs/ufshpb.h >> +++ b/drivers/scsi/ufs/ufshpb.h >> @@ -186,6 +186,7 @@ struct ufshpb_stats { >> u64 rb_inactive_cnt; >> u64 map_req_cnt; >> u64 pre_req_cnt; >> +u64 umap_req_cnt; >> }; >> >> struct ufshpb_lu {
Re: [PATCH v5 07/10] scsi: ufshpb: Add "Cold" regions timer
On 2021-03-02 21:25, Avri Altman wrote: In order not to hang on to “cold” regions, we shall inactivate a region that has no READ access for a predefined amount of time - READ_TO_MS. For that purpose we shall monitor the active regions list, polling it on every POLLING_INTERVAL_MS. On timeout expiry we shall add the region to the "to-be-inactivated" list, unless it is clean and did not exhaust its READ_TO_EXPIRIES - another parameter. All this does not apply to pinned regions. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 65 +++ drivers/scsi/ufs/ufshpb.h | 6 2 files changed, 71 insertions(+) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 0034fa03fdc6..89a930e72cff 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -18,6 +18,9 @@ #define ACTIVATION_THRESHOLD 4 /* 4 IOs */ #define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6) /* 256 IOs */ +#define READ_TO_MS 1000 +#define READ_TO_EXPIRIES 100 +#define POLLING_INTERVAL_MS 200 /* memory management */ static struct kmem_cache *ufshpb_mctx_cache; @@ -1024,12 +1027,61 @@ static int ufshpb_check_srgns_issue_state(struct ufshpb_lu *hpb, return 0; } +static void ufshpb_read_to_handler(struct work_struct *work) +{ + struct delayed_work *dwork = to_delayed_work(work); + struct ufshpb_lu *hpb; + struct victim_select_info *lru_info; + struct ufshpb_region *rgn; + unsigned long flags; + LIST_HEAD(expired_list); + + hpb = container_of(dwork, struct ufshpb_lu, ufshpb_read_to_work); + + spin_lock_irqsave(>rgn_state_lock, flags); + + lru_info = >lru_info; + + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) { + bool timedout = ktime_after(ktime_get(), rgn->read_timeout); + + if (timedout) { + rgn->read_timeout_expiries--; + if (is_rgn_dirty(rgn) || + rgn->read_timeout_expiries == 0) + list_add(>list_expired_rgn, _list); + else + rgn->read_timeout = ktime_add_ms(ktime_get(), +READ_TO_MS); + } + } + + spin_unlock_irqrestore(>rgn_state_lock, flags); + + 
list_for_each_entry(rgn, _list, list_expired_rgn) { Here can be problematic - since you don't have the native expired_list initialized before use, if above loop did not insert anything to expired_list, it shall become a dead loop here. And, which lock is protecting rgn->list_expired_rgn? If two read_to_handler works are running in parallel, one can be inserting it to its expired_list while another can be deleting it. Can Guo. + list_del_init(>list_expired_rgn); + spin_lock_irqsave(>rsp_list_lock, flags); + ufshpb_update_inactive_info(hpb, rgn->rgn_idx); + hpb->stats.rb_inactive_cnt++; + spin_unlock_irqrestore(>rsp_list_lock, flags); + } + + ufshpb_kick_map_work(hpb); + + schedule_delayed_work(>ufshpb_read_to_work, + msecs_to_jiffies(POLLING_INTERVAL_MS)); +} + static void ufshpb_add_lru_info(struct victim_select_info *lru_info, struct ufshpb_region *rgn) { rgn->rgn_state = HPB_RGN_ACTIVE; list_add_tail(>list_lru_rgn, _info->lh_lru_rgn); atomic_inc(_info->active_cnt); + if (rgn->hpb->is_hcm) { + rgn->read_timeout = ktime_add_ms(ktime_get(), READ_TO_MS); + rgn->read_timeout_expiries = READ_TO_EXPIRIES; + } } static void ufshpb_hit_lru_info(struct victim_select_info *lru_info, @@ -1813,6 +1865,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_inact_rgn); INIT_LIST_HEAD(>list_lru_rgn); + INIT_LIST_HEAD(>list_expired_rgn); if (rgn_idx == hpb->rgns_per_lu - 1) { srgn_cnt = ((hpb->srgns_per_lu - 1) % @@ -1834,6 +1887,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) } rgn->rgn_flags = 0; + rgn->hpb = hpb; } return 0; @@ -2053,6 +2107,8 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) ufshpb_normalization_work_handler); INIT_WORK(>ufshpb_lun_reset_work, ufshpb_reset_work_handler); + INIT_DELAYED_WORK(>ufshpb_read_to_work, + ufshpb_read_to_handler); } hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", @@ -2087,6 +2143,10 @@ st
Re: [PATCH v5 03/10] scsi: ufshpb: Add region's reads counter
On 2021-03-15 17:20, Avri Altman wrote: > + > + if (hpb->is_hcm) { > + spin_lock_irqsave(>rgn_lock, flags); rgn_lock is never used in IRQ contexts, so no need of irqsave and irqrestore everywhere, which can impact performance. Please correct me if I am wrong. Thanks. Will do. Meanwhile, have you ever initialized the rgn_lock before use it??? Yep - forgot to do that here (but not in gs20 and mi10). Thanks. You mean you didn't test this specific series before upload? I haven't moved to the test stage, but this will definitely cause you error... Can Guo. Thanks, Avri Thanks, Can Guo. > + rgn->reads = 0; > + spin_unlock_irqrestore(>rgn_lock, flags); > + } > + > return 0; > } > > if (!ufshpb_is_support_chunk(hpb, transfer_len)) > return 0; > > + if (hpb->is_hcm) { > + bool activate = false; > + /* > + * in host control mode, reads are the main source for > + * activation trials. > + */ > + spin_lock_irqsave(>rgn_lock, flags); > + rgn->reads++; > + if (rgn->reads == ACTIVATION_THRESHOLD) > + activate = true; > + spin_unlock_irqrestore(>rgn_lock, flags); > + if (activate) { > + spin_lock_irqsave(>rsp_list_lock, flags); > + ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); > + hpb->stats.rb_active_cnt++; > + spin_unlock_irqrestore(>rsp_list_lock, flags); > + dev_dbg(>sdev_ufs_lu->sdev_dev, > + "activate region %d-%d\n", rgn_idx, srgn_idx); > + } > + > + /* keep those counters normalized */ > + if (rgn->reads > hpb->entries_per_srgn) > + schedule_work(>ufshpb_normalization_work); > + } > + > spin_lock_irqsave(>rgn_state_lock, flags); > if (ufshpb_test_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset, > transfer_len)) { > @@ -745,21 +794,6 @@ static int ufshpb_clear_dirty_bitmap(struct > ufshpb_lu *hpb, > return 0; > } > > -static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int > rgn_idx, > - int srgn_idx) > -{ > - struct ufshpb_region *rgn; > - struct ufshpb_subregion *srgn; > - > - rgn = hpb->rgn_tbl + rgn_idx; > - srgn = rgn->srgn_tbl + srgn_idx; > - > - 
list_del_init(>list_inact_rgn); > - > - if (list_empty(>list_act_srgn)) > - list_add_tail(>list_act_srgn, >lh_act_srgn); > -} > - > static void ufshpb_update_inactive_info(struct ufshpb_lu *hpb, int > rgn_idx) > { > struct ufshpb_region *rgn; > @@ -1079,6 +1113,14 @@ static void __ufshpb_evict_region(struct > ufshpb_lu *hpb, > > ufshpb_cleanup_lru_info(lru_info, rgn); > > + if (hpb->is_hcm) { > + unsigned long flags; > + > + spin_lock_irqsave(>rgn_lock, flags); > + rgn->reads = 0; > + spin_unlock_irqrestore(>rgn_lock, flags); > + } > + > for_each_sub_region(rgn, srgn_idx, srgn) > ufshpb_purge_active_subregion(hpb, srgn); > } > @@ -1523,6 +1565,31 @@ static void > ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) > spin_unlock_irqrestore(>rsp_list_lock, flags); > } > > +static void ufshpb_normalization_work_handler(struct work_struct > *work) > +{ > + struct ufshpb_lu *hpb; > + int rgn_idx; > + unsigned long flags; > + > + hpb = container_of(work, struct ufshpb_lu, > ufshpb_normalization_work); > + > + for (rgn_idx = 0; rgn_idx < hpb->rgns_per_lu; rgn_idx++) { > + struct ufshpb_region *rgn = hpb->rgn_tbl + rgn_idx; > + > + spin_lock_irqsave(>rgn_lock, flags); > + rgn->reads = (rgn->reads >> 1); > + spin_unlock_irqrestore(>rgn_lock, flags); > + > + if (rgn->rgn_state != HPB_RGN_ACTIVE || rgn->reads) > + continue; > + > + /* if region is active but has no reads - inactivate it */ > + spin_lock(>rsp_list_lock); > + ufshpb_update_inactive_info(hpb, rgn->rgn_idx); > + spin_unlock(>rsp_list_lock); > + } > +} > + > static void ufshpb_map_work_handler(struct work_struct *work) > { > struct ufshpb_lu *hpb = container_of(work, struct
Re: [PATCH v5 03/10] scsi: ufshpb: Add region's reads counter
pb->is_hcm) { + unsigned long flags; + + spin_lock_irqsave(>rgn_lock, flags); + rgn->reads = 0; + spin_unlock_irqrestore(>rgn_lock, flags); + } + for_each_sub_region(rgn, srgn_idx, srgn) ufshpb_purge_active_subregion(hpb, srgn); } @@ -1523,6 +1565,31 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) spin_unlock_irqrestore(>rsp_list_lock, flags); } +static void ufshpb_normalization_work_handler(struct work_struct *work) +{ + struct ufshpb_lu *hpb; + int rgn_idx; + unsigned long flags; + + hpb = container_of(work, struct ufshpb_lu, ufshpb_normalization_work); + + for (rgn_idx = 0; rgn_idx < hpb->rgns_per_lu; rgn_idx++) { + struct ufshpb_region *rgn = hpb->rgn_tbl + rgn_idx; + + spin_lock_irqsave(>rgn_lock, flags); + rgn->reads = (rgn->reads >> 1); + spin_unlock_irqrestore(>rgn_lock, flags); + + if (rgn->rgn_state != HPB_RGN_ACTIVE || rgn->reads) + continue; + + /* if region is active but has no reads - inactivate it */ + spin_lock(>rsp_list_lock); + ufshpb_update_inactive_info(hpb, rgn->rgn_idx); Miss a hpb->stats.rb_inactive_cnt++ here? Thanks, Can Guo. 
+ spin_unlock(>rsp_list_lock); + } +} + static void ufshpb_map_work_handler(struct work_struct *work) { struct ufshpb_lu *hpb = container_of(work, struct ufshpb_lu, map_work); @@ -1913,6 +1980,9 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_hpb_lu); INIT_WORK(>map_work, ufshpb_map_work_handler); + if (hpb->is_hcm) + INIT_WORK(>ufshpb_normalization_work, + ufshpb_normalization_work_handler); hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", sizeof(struct ufshpb_req), 0, 0, NULL); @@ -2012,6 +2082,8 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb) static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) { + if (hpb->is_hcm) + cancel_work_sync(>ufshpb_normalization_work); cancel_work_sync(>map_work); } diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h index 8119b1a3d1e5..bd4308010466 100644 --- a/drivers/scsi/ufs/ufshpb.h +++ b/drivers/scsi/ufs/ufshpb.h @@ -121,6 +121,10 @@ struct ufshpb_region { struct list_head list_lru_rgn; unsigned long rgn_flags; #define RGN_FLAG_DIRTY 0 + + /* region reads - for host mode */ + spinlock_t rgn_lock; + unsigned int reads; }; #define for_each_sub_region(rgn, i, srgn) \ @@ -211,6 +215,7 @@ struct ufshpb_lu { /* for selecting victim */ struct victim_select_info lru_info; + struct work_struct ufshpb_normalization_work; /* pinned region information */ u32 lu_pinned_start;
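The normalization pass in this patch, together with the counter update Can Guo asks about, can be sketched in userspace C. The names `struct region`, `struct stats`, `queued_inactive` and `normalize_reads()` are hypothetical, locking is omitted, and bumping `rb_inactive_cnt` at this point is an assumption taken from the review question rather than something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rgn_state { RGN_INACTIVE, RGN_ACTIVE };

struct region {
    enum rgn_state state;
    unsigned int reads;
    bool queued_inactive;   /* models being on the to-be-inactivated list */
};

struct stats {
    unsigned long rb_inactive_cnt;
};

/* Halve every read counter; queue active regions whose counter reached 0.
 * Returns the number of regions newly queued for inactivation. */
static size_t normalize_reads(struct region *tbl, size_t nr, struct stats *st)
{
    size_t i, queued = 0;

    assert(tbl != NULL || nr == 0);

    for (i = 0; i < nr; i++) {
        struct region *rgn = &tbl[i];

        rgn->reads >>= 1;
        if (rgn->state != RGN_ACTIVE || rgn->reads)
            continue;

        if (!rgn->queued_inactive) {
            rgn->queued_inactive = true;
            st->rb_inactive_cnt++;  /* the increment the review asks about */
            queued++;
        }
    }
    return queued;
}
```

Halving rather than zeroing keeps the relative ordering of hot regions while letting idle ones decay to zero after a few passes.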
Re: [PATCH v5 08/10] scsi: ufshpb: Limit the number of inflight map requests
On 2021-03-02 21:25, Avri Altman wrote:
> in host control mode the host is the originator of map requests. To not

in -> In

Thanks,
Can Guo.

> flood the device with map requests, use a simple throttling mechanism
> that limits the number of inflight map requests.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 11 +++
>  drivers/scsi/ufs/ufshpb.h | 1 +
>  2 files changed, 12 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 89a930e72cff..74da69727340 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -21,6 +21,7 @@
>  #define READ_TO_MS 1000
>  #define READ_TO_EXPIRIES 100
>  #define POLLING_INTERVAL_MS 200
> +#define THROTTLE_MAP_REQ_DEFAULT 1
>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -750,6 +751,14 @@ static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb,
>  	struct ufshpb_req *map_req;
>  	struct bio *bio;
>
> +	if (hpb->is_hcm &&
> +	    hpb->num_inflight_map_req >= THROTTLE_MAP_REQ_DEFAULT) {
> +		dev_info(&hpb->sdev_ufs_lu->sdev_dev,
> +			 "map_req throttle. inflight %d throttle %d",
> +			 hpb->num_inflight_map_req, THROTTLE_MAP_REQ_DEFAULT);
> +		return NULL;
> +	}
> +
>  	map_req = ufshpb_get_req(hpb, srgn->rgn_idx, REQ_OP_SCSI_IN);
>  	if (!map_req)
>  		return NULL;
> @@ -764,6 +773,7 @@ static struct ufshpb_req *ufshpb_get_map_req(struct ufshpb_lu *hpb,
>  	map_req->rb.srgn_idx = srgn->srgn_idx;
>  	map_req->rb.mctx = srgn->mctx;
>
> +	hpb->num_inflight_map_req++;
>  	return map_req;
>  }
>
> @@ -773,6 +783,7 @@ static void ufshpb_put_map_req(struct ufshpb_lu *hpb,
>  {
>  	bio_put(map_req->bio);
>  	ufshpb_put_req(hpb, map_req);
> +	hpb->num_inflight_map_req--;
>  }
>
>  static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> index b49e9a34267f..d83ab488688a 100644
> --- a/drivers/scsi/ufs/ufshpb.h
> +++ b/drivers/scsi/ufs/ufshpb.h
> @@ -212,6 +212,7 @@ struct ufshpb_lu {
>  	struct ufshpb_req *pre_req;
>  	int num_inflight_pre_req;
>  	int throttle_pre_req;
> +	int num_inflight_map_req;
>  	struct list_head lh_pre_req_free;
>  	int cur_read_id;
>  	int pre_req_min_tr_len;
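The throttle in this patch amounts to a get/put pair around an in-flight counter. A minimal userspace sketch follows, assuming hypothetical names `struct lu`, `get_map_req()` and `put_map_req()`; only THROTTLE_MAP_REQ_DEFAULT and the field names are taken from the patch. The counter is a plain int here, so a real caller would have to serialise get and put itself; the patch does not show which lock, if any, is meant to cover it.

```c
#include <assert.h>
#include <stdbool.h>

#define THROTTLE_MAP_REQ_DEFAULT 1

struct lu {
    bool is_hcm;                /* host control mode */
    int num_inflight_map_req;
};

/* Returns true if a map request may be issued and accounts for it. */
static bool get_map_req(struct lu *lu)
{
    if (lu->is_hcm &&
        lu->num_inflight_map_req >= THROTTLE_MAP_REQ_DEFAULT)
        return false;           /* throttled: too many requests in flight */

    lu->num_inflight_map_req++;
    return true;
}

/* Called on completion; every successful get must be matched by one put. */
static void put_map_req(struct lu *lu)
{
    assert(lu->num_inflight_map_req > 0);
    lu->num_inflight_map_req--;
}
```

As in the patch, the limit applies only in host control mode; in device mode the counter is still maintained but never blocks a request.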
Re: [PATCH v5 04/10] scsi: ufshpb: Make eviction depends on region's reads
On 2021-03-02 21:24, Avri Altman wrote:
> In host mode, eviction is considered an extreme measure.
> verify that the entering region has enough reads, and the exiting
> region has much less reads.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index a8f8d13af21a..6f4fd22eaf2f 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -17,6 +17,7 @@
>  #include "../sd.h"
>
>  #define ACTIVATION_THRESHOLD 4 /* 4 IOs */
> +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6) /* 256 IOs */
>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -1050,6 +1051,13 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
>  		if (ufshpb_check_srgns_issue_state(hpb, rgn))
>  			continue;
>
> +		/*
> +		 * in host control mode, verify that the exiting region
> +		 * has less reads
> +		 */
> +		if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> +			continue;
> +
>  		victim_rgn = rgn;
>  		break;
>  	}
> @@ -1235,7 +1243,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu *hpb,
>
>  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  {
> -	struct ufshpb_region *victim_rgn;
> +	struct ufshpb_region *victim_rgn = NULL;
>  	struct victim_select_info *lru_info = &hpb->lru_info;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -1263,6 +1271,16 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  			 * because the device could detect this region
>  			 * by not issuing HPB_READ
>  			 */
> +
> +			/*
> +			 * in host control mode, verify that the entering
> +			 * region has enough reads
> +			 */

Maybe merge the new comments with the original comments above?

Thanks,
Can Guo.

> +			if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> +				ret = -EACCES;
> +				goto out;
> +			}
> +
>  			victim_rgn = ufshpb_victim_lru_info(hpb);
>  			if (!victim_rgn) {
>  				dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
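The two threshold checks in this patch (the entering region needs at least EVICTION_THRESHOLD reads, and a victim may have at most half of that) can be exercised in a small userspace sketch. `struct region`, its `busy` flag (standing in for ufshpb_check_srgns_issue_state()), `may_enter()` and `pick_victim()` are hypothetical; only the two macros come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ACTIVATION_THRESHOLD 4                          /* 4 IOs */
#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6)  /* 256 IOs */

struct region {
    unsigned int reads;
    bool busy;      /* models a subregion with a request still being issued */
};

/* In host control mode a region may trigger an eviction only if it is hot. */
static bool may_enter(const struct region *rgn, bool is_hcm)
{
    return !is_hcm || rgn->reads >= EVICTION_THRESHOLD;
}

/* Walk the LRU list from the oldest entry; return the index of the first
 * acceptable victim, or -1 if no region qualifies. */
static int pick_victim(const struct region *lru, size_t n, bool is_hcm)
{
    size_t i;

    assert(lru != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (lru[i].busy)
            continue;
        /* In host control mode, skip regions that are still fairly hot. */
        if (is_hcm && lru[i].reads > (EVICTION_THRESHOLD >> 1))
            continue;
        return (int)i;
    }
    return -1;
}
```

The factor of two between the entry and exit thresholds gives some hysteresis, so a region that has just replaced another is unlikely to be evicted by its predecessor right away.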
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-15 15:23, Can Guo wrote: On 2021-03-15 15:07, Daejun Park wrote: This patch supports the HPB 2.0. The HPB 2.0 supports read of varying sizes from 4KB to 512KB. In the case of Read (<= 32KB) is supported as single HPB read. In the case of Read (36KB ~ 512KB) is supported by as a combination of write buffer command and HPB read command to deliver more PPN. The write buffer commands may not be issued immediately due to busy tags. To use HPB read more aggressively, the driver can requeue the write buffer command. The requeue threshold is implemented as timeout and can be modified with requeue_timeout_ms entry in sysfs. Signed-off-by: Daejun Park --- +static struct attribute *hpb_dev_param_attrs[] = { +_attr_requeue_timeout_ms.attr, +NULL, +}; + +struct attribute_group ufs_sysfs_hpb_param_group = { +.name = "hpb_param_sysfs", +.attrs = hpb_dev_param_attrs, +}; + +static int ufshpb_pre_req_mempool_init(struct ufshpb_lu *hpb) +{ +struct ufshpb_req *pre_req = NULL; +int qd = hpb->sdev_ufs_lu->queue_depth / 2; +int i, j; + +INIT_LIST_HEAD(>lh_pre_req_free); + +hpb->pre_req = kcalloc(qd, sizeof(struct ufshpb_req), GFP_KERNEL); +hpb->throttle_pre_req = qd; +hpb->num_inflight_pre_req = 0; + +if (!hpb->pre_req) +goto release_mem; + +for (i = 0; i < qd; i++) { +pre_req = hpb->pre_req + i; +INIT_LIST_HEAD(_req->list_req); +pre_req->req = NULL; +pre_req->bio = NULL; Why don't prepare bio as same as wb.m_page? Won't that save more time for ufshpb_issue_pre_req()? It is pre_req pool. So although we prepare bio at this time, it just only for first pre_req. I meant removing the bio_alloc() in ufshpb_issue_pre_req() and bio_put() in ufshpb_pre_req_compl_fn(). bios, in pre_req's case, just hold a page. So, prepare 16 (if queue depth is 32) bios here, just use them along with wb.m_page and call bio_reset() in ufshpb_pre_req_compl_fn(). Shall it work? If it works, you can even have the bio_add_pc_page() called here. 
Later in ufshpb_execute_pre_req(), you don't need to call ufshpb_pre_req_add_bio_page(), just call ufshpb_prep_entry() once instead - it save many repeated steps for a pre_req, and you don't even need to call bio_reset() in this case, since for a bio, nothing changes after it is binded with a specific page... Can Guo. Thanks, Can Guo. After use it, it should be prepared bio at issue phase. Thanks, Daejun Thanks, Can Guo. + +pre_req->wb.m_page = alloc_page(GFP_KERNEL | __GFP_ZERO); +if (!pre_req->wb.m_page) { +for (j = 0; j < i; j++) + __free_page(hpb->pre_req[j].wb.m_page); + +goto release_mem; +} +list_add_tail(_req->list_req, >lh_pre_req_free); +} + +return 0; +release_mem: +kfree(hpb->pre_req); +return -ENOMEM; +} +
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-15 12:02, Can Guo wrote:
> On 2021-03-02 21:24, Avri Altman wrote:
>> In host mode, the host is expected to send HPB-WRITE-BUFFER with
>> buffer-id = 0x1 when it inactivates a region.
>>
>> Use the map-requests pool as there is no point in assigning a
>> designated cache for umap-requests.
>>
>> Signed-off-by: Avri Altman
>> ---
>>  drivers/scsi/ufs/ufshpb.c | 14 ++
>>  drivers/scsi/ufs/ufshpb.h | 1 +
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
>> index 6f4fd22eaf2f..0744feb4d484 100644
>> --- a/drivers/scsi/ufs/ufshpb.c
>> +++ b/drivers/scsi/ufs/ufshpb.c
>> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct ufshpb_lu *hpb,
>>
>>  	blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn);
>>
>> +	hpb->stats.umap_req_cnt++;
>>  	return 0;
>>  }
>>
>> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct ufshpb_lu *hpb,
>>  	return -EAGAIN;
>>  }
>>
>> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb,
>> +					struct ufshpb_region *rgn)
>> +{
>> +	return ufshpb_issue_umap_req(hpb, rgn);
>> +}
>> +
>>  static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb)
>>  {
>>  	return ufshpb_issue_umap_req(hpb, NULL);
>> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>>  	struct ufshpb_subregion *srgn;
>>  	int srgn_idx;
>>
>> +
>> +	if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn))
>
> __ufshpb_evict_region() is called with rgn_state_lock held and IRQ
> disabled, when ufshpb_issue_umap_single_req() invokes
> blk_execute_rq_nowait(), below warning shall pop up every time, fix it?
>
> void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
>                            struct request *rq, int at_head,
>                            rq_end_io_fn *done)
> {
>         WARN_ON(irqs_disabled());
>         ...

Moreover, since we are here with rgn_state_lock held and IRQ disabled,
in ufshpb_get_req(), rq = kmem_cache_alloc(hpb->map_req_cache, GFP_KERNEL)
has the GFP_KERNEL flag, scheduling while atomic???

Can Guo.

> Thanks.
> Can Guo.
>
>> +		return;
>> +
>>  	lru_info = &hpb->lru_info;
>>
>>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n",
>>  		rgn->rgn_idx);
>> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>>  ufshpb_sysfs_attr_show_func(map_req_cnt);
>> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>>
>>  static struct attribute *hpb_dev_stat_attrs[] = {
>>  	&dev_attr_hit_cnt.attr,
>> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>>  	&dev_attr_rb_active_cnt.attr,
>>  	&dev_attr_rb_inactive_cnt.attr,
>>  	&dev_attr_map_req_cnt.attr,
>> +	&dev_attr_umap_req_cnt.attr,
>>  	NULL,
>>  };
>>
>> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>>  	hpb->stats.rb_active_cnt = 0;
>>  	hpb->stats.rb_inactive_cnt = 0;
>>  	hpb->stats.map_req_cnt = 0;
>> +	hpb->stats.umap_req_cnt = 0;
>>  }
>>
>>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
>> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
>> index bd4308010466..84598a317897 100644
>> --- a/drivers/scsi/ufs/ufshpb.h
>> +++ b/drivers/scsi/ufs/ufshpb.h
>> @@ -186,6 +186,7 @@ struct ufshpb_stats {
>>  	u64 rb_inactive_cnt;
>>  	u64 map_req_cnt;
>>  	u64 pre_req_cnt;
>> +	u64 umap_req_cnt;
>>  };
>>
>>  struct ufshpb_lu {
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-15 15:07, Daejun Park wrote:
>>> This patch adds support for HPB 2.0.
>>>
>>> HPB 2.0 supports reads of varying sizes, from 4KB to 512KB.
>>> A read of up to 32KB is supported as a single HPB read.
>>> A read of 36KB to 512KB is supported as a combination of a write
>>> buffer command and an HPB read command, to deliver more PPNs.
>>> The write buffer commands may not be issued immediately due to busy
>>> tags. To use HPB read more aggressively, the driver can requeue the
>>> write buffer command. The requeue threshold is implemented as a
>>> timeout and can be modified with the requeue_timeout_ms entry in
>>> sysfs.
>>>
>>> Signed-off-by: Daejun Park
>>> ---
>>> +static struct attribute *hpb_dev_param_attrs[] = {
>>> +	&dev_attr_requeue_timeout_ms.attr,
>>> +	NULL,
>>> +};
>>> +
>>> +struct attribute_group ufs_sysfs_hpb_param_group = {
>>> +	.name = "hpb_param_sysfs",
>>> +	.attrs = hpb_dev_param_attrs,
>>> +};
>>> +
>>> +static int ufshpb_pre_req_mempool_init(struct ufshpb_lu *hpb)
>>> +{
>>> +	struct ufshpb_req *pre_req = NULL;
>>> +	int qd = hpb->sdev_ufs_lu->queue_depth / 2;
>>> +	int i, j;
>>> +
>>> +	INIT_LIST_HEAD(&hpb->lh_pre_req_free);
>>> +
>>> +	hpb->pre_req = kcalloc(qd, sizeof(struct ufshpb_req), GFP_KERNEL);
>>> +	hpb->throttle_pre_req = qd;
>>> +	hpb->num_inflight_pre_req = 0;
>>> +
>>> +	if (!hpb->pre_req)
>>> +		goto release_mem;
>>> +
>>> +	for (i = 0; i < qd; i++) {
>>> +		pre_req = hpb->pre_req + i;
>>> +		INIT_LIST_HEAD(&pre_req->list_req);
>>> +		pre_req->req = NULL;
>>> +		pre_req->bio = NULL;
>>
>> Why not prepare the bio the same way as wb.m_page? Wouldn't that save
>> more time in ufshpb_issue_pre_req()?
>
> It is a pre_req pool. So even if we prepare the bio at this point, it
> only serves the first pre_req.

I meant removing the bio_alloc() in ufshpb_issue_pre_req() and the
bio_put() in ufshpb_pre_req_compl_fn(). In the pre_req case, a bio just
holds a page. So prepare 16 bios here (if the queue depth is 32), use
them along with wb.m_page, and call bio_reset() in
ufshpb_pre_req_compl_fn(). Would that work?

Thanks,
Can Guo.

> After it is used, the bio has to be prepared again at the issue phase.
>
> Thanks,
> Daejun
>
>> Thanks,
>> Can Guo.
>>
>>> +
>>> +		pre_req->wb.m_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>> +		if (!pre_req->wb.m_page) {
>>> +			for (j = 0; j < i; j++)
>>> +				__free_page(hpb->pre_req[j].wb.m_page);
>>> +
>>> +			goto release_mem;
>>> +		}
>>> +		list_add_tail(&pre_req->list_req, &hpb->lh_pre_req_free);
>>> +	}
>>> +
>>> +	return 0;
>>> +release_mem:
>>> +	kfree(hpb->pre_req);
>>> +	return -ENOMEM;
>>> +}
>>> +
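The suggestion discussed above (allocate once at init, reset on completion instead of allocating and freeing per request) can be sketched in userspace C. This is an illustration under stated assumptions, not the driver's code: `toy_pool`, `toy_pre_req` and the helper functions are hypothetical, a plain byte buffer stands in for the page and bio, and `memset()` plays the role that a reset call would play on completion.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_BUF_SIZE 64

struct toy_pre_req {
	unsigned char *buf;	/* allocated once at init, then reused */
	int in_use;
};

struct toy_pool {
	struct toy_pre_req *reqs;
	int count;
};

/* Allocate every buffer up front; unwind what was allocated on failure. */
static int toy_pool_init(struct toy_pool *pool, int count)
{
	int i, j;

	pool->reqs = calloc((size_t)count, sizeof(*pool->reqs));
	if (!pool->reqs)
		return -1;
	pool->count = count;

	for (i = 0; i < count; i++) {
		pool->reqs[i].buf = calloc(1, TOY_BUF_SIZE);
		if (!pool->reqs[i].buf) {
			for (j = 0; j < i; j++)
				free(pool->reqs[j].buf);
			free(pool->reqs);
			pool->reqs = NULL;
			pool->count = 0;
			return -1;
		}
	}
	return 0;
}

/* Issue path: take a free entry; no allocation happens here. */
static struct toy_pre_req *toy_pool_get(struct toy_pool *pool)
{
	for (int i = 0; i < pool->count; i++) {
		if (!pool->reqs[i].in_use) {
			pool->reqs[i].in_use = 1;
			return &pool->reqs[i];
		}
	}
	return NULL;	/* pool exhausted, caller should back off */
}

/* Completion path: reset the entry for reuse instead of freeing it. */
static void toy_pool_put(struct toy_pre_req *req)
{
	memset(req->buf, 0, TOY_BUF_SIZE);
	req->in_use = 0;
}

static void toy_pool_destroy(struct toy_pool *pool)
{
	for (int i = 0; i < pool->count; i++)
		free(pool->reqs[i].buf);
	free(pool->reqs);
	pool->reqs = NULL;
	pool->count = 0;
}
```

The trade-off is memory held for the lifetime of the pool in exchange for an issue path that cannot fail on allocation.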
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-15 09:34, Can Guo wrote: On 2021-03-02 21:24, Avri Altman wrote: The spec does not define what is the host's recommended response when the device send hpb dev reset response (oper 0x2). We will update all active hpb regions: mark them and do that on the next read. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 47 --- drivers/scsi/ufs/ufshpb.h | 2 ++ 2 files changed, 46 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 0744feb4d484..0034fa03fdc6 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) if (rgn->reads == ACTIVATION_THRESHOLD) activate = true; spin_unlock_irqrestore(>rgn_lock, flags); - if (activate) { + if (activate || + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { spin_lock_irqsave(>rsp_list_lock, flags); ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); hpb->stats.rb_active_cnt++; @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case HPB_RSP_DEV_RESET: dev_warn(>sdev_ufs_lu->sdev_dev, "UFS device lost HPB information during PM.\n"); + + if (hpb->is_hcm) { + struct scsi_device *sdev; bool need_reset = false; + + __shost_for_each_device(sdev, hba->host) { + struct ufshpb_lu *h = sdev->hostdata; + + if (!h) + continue; + + need_reset = true; + } if (need_reset) schedule_work(>ufshpb_lun_reset_work); At last, scheduling only one reset work shall be enough, otherwise multiple reset work can be flying in parallel, so maybe above changes? Forget about this one, I misunderstood it - reset work is for each ufshpb_lu... Regards, Can Guo. 
+ } + break; default: dev_notice(>sdev_ufs_lu->sdev_dev, @@ -1594,6 +1609,25 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) spin_unlock_irqrestore(>rsp_list_lock, flags); } +static void ufshpb_reset_work_handler(struct work_struct *work) +{ + struct ufshpb_lu *hpb; struct ufshpb_lu *hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); + struct victim_select_info *lru_info; struct victim_select_info *lru_info = >lru_info; This can save some lines. Thanks, Can Guo. + struct ufshpb_region *rgn; + unsigned long flags; + + spin_lock_irqsave(>rgn_state_lock, flags); + + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) + set_bit(RGN_FLAG_UPDATE, >rgn_flags); + + spin_unlock_irqrestore(>rgn_state_lock, flags); +} + static void ufshpb_normalization_work_handler(struct work_struct *work) { struct ufshpb_lu *hpb; @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) } else { rgn->rgn_state = HPB_RGN_INACTIVE; } + + rgn->rgn_flags = 0; } return 0; @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_hpb_lu); INIT_WORK(>map_work, ufshpb_map_work_handler); - if (hpb->is_hcm) + if (hpb->is_hcm) { INIT_WORK(>ufshpb_normalization_work, ufshpb_normalization_work_handler); + INIT_WORK(>ufshpb_lun_reset_work, + ufshpb_reset_work_handler); + } hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", sizeof(struct ufshpb_req), 0, 0, NULL); @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb) static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) { - if (hpb->is_hcm) + if (hpb->is_hcm) { + cancel_work_sync(>ufshpb_lun_reset_work); cancel_work_sync(>ufshpb_normalization_work); + } cancel_work_sync(>map_work); } diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h index 84598a317897..37c1b0ea0c0a 100644 --- a/drivers/scsi/ufs/ufshpb.h +++ b/drivers/scsi/ufs/ufshpb.h @@ -121,6 +121,7 @@ 
struct ufshpb_region { struct list_head list_lru_rgn; unsigned long rgn_flags; #define RGN_FLAG_DIRTY 0 +#define RGN_FLAG_UPDATE 1 /* regio
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-15 09:31, Daejun Park wrote:
> This patch adds support for HPB 2.0.
>
> HPB 2.0 supports reads of varying sizes, from 4KB to 512KB.
> A read of up to 32KB is supported as a single HPB read.
> A read of 36KB to 512KB is supported as a combination of a write
> buffer command and an HPB read command, to deliver more PPNs.
> The write buffer commands may not be issued immediately due to busy
> tags. To use HPB read more aggressively, the driver can requeue the
> write buffer command. The requeue threshold is implemented as a
> timeout and can be modified with the requeue_timeout_ms entry in
> sysfs.
>
> Signed-off-by: Daejun Park
> ---
> +static int ufshpb_issue_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd,
> +				int *read_id)
> +{
> +	struct ufshpb_req *pre_req;
> +	struct request *req = NULL;
> +	struct bio *bio = NULL;
> +	unsigned long flags;
> +	int _read_id;
> +	int ret = 0;
> +
> +	req = blk_get_request(cmd->device->request_queue,

To keep symmetry with ufshpb_get_req(), can we use
hpb->sdev_ufs_lu->request_queue?

Thanks,
Can Guo.

> +			      REQ_OP_SCSI_OUT | REQ_SYNC, BLK_MQ_REQ_NOWAIT);
> +	if (IS_ERR(req))
> +		return -EAGAIN;
> +
> +	bio = bio_alloc(GFP_ATOMIC, 1);
> +	if (!bio) {
> +		blk_put_request(req);
> +		return -EAGAIN;
> +	}
> +
> +	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> +	pre_req = ufshpb_get_pre_req(hpb);
> +	if (!pre_req) {
> +		ret = -EAGAIN;
> +		goto unlock_out;
> +	}
> +	_read_id = ufshpb_get_read_id(hpb);
> +	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> +
> +	pre_req->req = req;
> +	pre_req->bio = bio;
> +
> +	ret = ufshpb_execute_pre_req(hpb, cmd, pre_req, _read_id);
> +	if (ret)
> +		goto free_pre_req;
> +
> +	*read_id = _read_id;
> +
> +	return ret;
> +free_pre_req:
> +	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> +	ufshpb_put_pre_req(hpb, pre_req);
> +unlock_out:
> +	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> +	bio_put(bio);
> +	blk_put_request(req);
> +	return ret;
> +}
> +
Re: [PATCH v29 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-15 09:31, Daejun Park wrote:
> This patch adds support for HPB 2.0.
>
> HPB 2.0 supports reads of varying sizes, from 4KB to 512KB.
> A read of up to 32KB is supported as a single HPB read.
> A read of 36KB to 512KB is supported as a combination of a write
> buffer command and an HPB read command, to deliver more PPNs.
> The write buffer commands may not be issued immediately due to busy
> tags. To use HPB read more aggressively, the driver can requeue the
> write buffer command. The requeue threshold is implemented as a
> timeout and can be modified with the requeue_timeout_ms entry in
> sysfs.
>
> Signed-off-by: Daejun Park
> ---
> +static struct attribute *hpb_dev_param_attrs[] = {
> +	&dev_attr_requeue_timeout_ms.attr,
> +	NULL,
> +};
> +
> +struct attribute_group ufs_sysfs_hpb_param_group = {
> +	.name = "hpb_param_sysfs",
> +	.attrs = hpb_dev_param_attrs,
> +};
> +
> +static int ufshpb_pre_req_mempool_init(struct ufshpb_lu *hpb)
> +{
> +	struct ufshpb_req *pre_req = NULL;
> +	int qd = hpb->sdev_ufs_lu->queue_depth / 2;
> +	int i, j;
> +
> +	INIT_LIST_HEAD(&hpb->lh_pre_req_free);
> +
> +	hpb->pre_req = kcalloc(qd, sizeof(struct ufshpb_req), GFP_KERNEL);
> +	hpb->throttle_pre_req = qd;
> +	hpb->num_inflight_pre_req = 0;
> +
> +	if (!hpb->pre_req)
> +		goto release_mem;
> +
> +	for (i = 0; i < qd; i++) {
> +		pre_req = hpb->pre_req + i;
> +		INIT_LIST_HEAD(&pre_req->list_req);
> +		pre_req->req = NULL;
> +		pre_req->bio = NULL;

Why not prepare the bio the same way as wb.m_page? Wouldn't that save
more time in ufshpb_issue_pre_req()?

Thanks,
Can Guo.

> +
> +		pre_req->wb.m_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
> +		if (!pre_req->wb.m_page) {
> +			for (j = 0; j < i; j++)
> +				__free_page(hpb->pre_req[j].wb.m_page);
> +
> +			goto release_mem;
> +		}
> +		list_add_tail(&pre_req->list_req, &hpb->lh_pre_req_free);
> +	}
> +
> +	return 0;
> +release_mem:
> +	kfree(hpb->pre_req);
> +	return -ENOMEM;
> +}
> +
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-02 21:24, Avri Altman wrote:
> In host mode, the host is expected to send HPB-WRITE-BUFFER with
> buffer-id = 0x1 when it inactivates a region.
>
> Use the map-requests pool as there is no point in assigning a
> designated cache for umap-requests.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 14 ++
>  drivers/scsi/ufs/ufshpb.h | 1 +
>  2 files changed, 15 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 6f4fd22eaf2f..0744feb4d484 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct ufshpb_lu *hpb,
>
>  	blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn);
>
> +	hpb->stats.umap_req_cnt++;
>  	return 0;
>  }
>
> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct ufshpb_lu *hpb,
>  	return -EAGAIN;
>  }
>
> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb,
> +					struct ufshpb_region *rgn)
> +{
> +	return ufshpb_issue_umap_req(hpb, rgn);
> +}
> +
>  static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb)
>  {
>  	return ufshpb_issue_umap_req(hpb, NULL);
> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>  	struct ufshpb_subregion *srgn;
>  	int srgn_idx;
>
> +
> +	if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn))

__ufshpb_evict_region() is called with rgn_state_lock held and IRQs
disabled. When ufshpb_issue_umap_single_req() invokes
blk_execute_rq_nowait(), the warning below will pop up every time.
Can you fix it?

void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
			   struct request *rq, int at_head,
			   rq_end_io_fn *done)
{
	WARN_ON(irqs_disabled());
...

Thanks.
Can Guo.

> +		return;
> +
>  	lru_info = &hpb->lru_info;
>
>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n", rgn->rgn_idx);
> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>  ufshpb_sysfs_attr_show_func(map_req_cnt);
> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>
>  static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_hit_cnt.attr,
> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_rb_active_cnt.attr,
>  	&dev_attr_rb_inactive_cnt.attr,
>  	&dev_attr_map_req_cnt.attr,
> +	&dev_attr_umap_req_cnt.attr,
>  	NULL,
>  };
>
> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>  	hpb->stats.rb_active_cnt = 0;
>  	hpb->stats.rb_inactive_cnt = 0;
>  	hpb->stats.map_req_cnt = 0;
> +	hpb->stats.umap_req_cnt = 0;
>  }
>
>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> index bd4308010466..84598a317897 100644
> --- a/drivers/scsi/ufs/ufshpb.h
> +++ b/drivers/scsi/ufs/ufshpb.h
> @@ -186,6 +186,7 @@ struct ufshpb_stats {
>  	u64 rb_inactive_cnt;
>  	u64 map_req_cnt;
>  	u64 pre_req_cnt;
> +	u64 umap_req_cnt;
>  };
>
>  struct ufshpb_lu {
Re: [PATCH v5 03/10] scsi: ufshpb: Add region's reads counter
Hi Avri,

On 2021-03-02 21:24, Avri Altman wrote:
> In host control mode, reads are the major source of activation
> trials. Keep track of those read counters, for both active and
> inactive regions.
>
> We reset the read counter upon write, as we are only interested in
> "clean" reads. Less intuitive, however, is that we also reset it upon
> a region's deactivation. Region deactivation is often due to the fact
> that eviction took place: a region becomes active at the expense of
> another. This happens when the max-active-regions limit has been
> crossed. If we don't reset the counter, we will trigger a lot of
> thrashing of the HPB database, since a few reads (or even one) to the
> region that was deactivated will trigger a re-activation trial.
>
> Keep those counters normalized, as we are using those reads as a
> comparative score to make various decisions.
> If during consecutive normalizations an active region has exhausted
> its reads, inactivate it.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 102 --
>  drivers/scsi/ufs/ufshpb.h | 5 ++
>  2 files changed, 92 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 044fec9854a0..a8f8d13af21a 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -16,6 +16,8 @@
>  #include "ufshpb.h"
>  #include "../sd.h"
>
> +#define ACTIVATION_THRESHOLD 4 /* 4 IOs */
> +
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
>  static mempool_t *ufshpb_mctx_pool;
> @@ -554,6 +556,21 @@ static int ufshpb_issue_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd,
>  	return ret;
>  }
>
> +static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx,
> +				      int srgn_idx)
> +{
> +	struct ufshpb_region *rgn;
> +	struct ufshpb_subregion *srgn;
> +
> +	rgn = hpb->rgn_tbl + rgn_idx;
> +	srgn = rgn->srgn_tbl + srgn_idx;
> +
> +	list_del_init(&rgn->list_inact_rgn);
> +
> +	if (list_empty(&srgn->list_act_srgn))
> +		list_add_tail(&srgn->list_act_srgn, &hpb->lh_act_srgn);
> +}
> +
>  /*
>   * This function will set up HPB read command using host-side L2P map data.
>   */
> @@ -600,12 +617,44 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>  		ufshpb_set_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset,
>  				     transfer_len);
>  		spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> +
> +		if (hpb->is_hcm) {
> +			spin_lock_irqsave(&rgn->rgn_lock, flags);

rgn_lock is never used in IRQ context, so there is no need for irqsave
and irqrestore everywhere, which can impact performance. Please correct
me if I am wrong.

Meanwhile, have you initialized rgn_lock before using it?

Thanks,
Can Guo.

> +			rgn->reads = 0;
> +			spin_unlock_irqrestore(&rgn->rgn_lock, flags);
> +		}
> +
>  		return 0;
>  	}
>
>  	if (!ufshpb_is_support_chunk(hpb, transfer_len))
>  		return 0;
>
> +	if (hpb->is_hcm) {
> +		bool activate = false;
> +		/*
> +		 * in host control mode, reads are the main source for
> +		 * activation trials.
> +		 */
> +		spin_lock_irqsave(&rgn->rgn_lock, flags);
> +		rgn->reads++;
> +		if (rgn->reads == ACTIVATION_THRESHOLD)
> +			activate = true;
> +		spin_unlock_irqrestore(&rgn->rgn_lock, flags);
> +		if (activate) {
> +			spin_lock_irqsave(&hpb->rsp_list_lock, flags);
> +			ufshpb_update_active_info(hpb, rgn_idx, srgn_idx);
> +			hpb->stats.rb_active_cnt++;
> +			spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
> +			dev_dbg(&hpb->sdev_ufs_lu->sdev_dev,
> +				"activate region %d-%d\n", rgn_idx, srgn_idx);
> +		}
> +
> +		/* keep those counters normalized */
> +		if (rgn->reads > hpb->entries_per_srgn)
> +			schedule_work(&hpb->ufshpb_normalization_work);
> +	}
> +
>  	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
>  	if (ufshpb_test_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset,
>  				  transfer_len)) {
> @@ -745,21 +794,6 @@ static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb,
>  	return 0;
>  }
>
> -static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx,
> -				      int srgn_idx)
> -{
> -	struct ufshpb_region *rgn;
> -	struct ufshpb_subregion *srgn;
> -
> -	rgn = hpb->rgn_tbl + rgn_idx;
> -	srgn = rgn->srgn_tbl + srgn_idx;
> -
> -	list_del_init(&rgn->list_inact_rgn);
> -
> -	if (list_empty(&srgn->list_act_srgn))
> -		list_add_tail(&srgn->list_act_srgn, &hpb->lh_act_srgn);
> -}
> -
>  static void uf
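The read-counter policy described in the commit message can be sketched in isolation. What follows is a minimal userspace illustration, not the driver's implementation: `read_score` and its helpers are hypothetical names, and the normalization step is assumed here to be a halving of the counter, which the quoted text does not spell out. Only `ACTIVATION_THRESHOLD` and the reset-on-write rule are taken from the patch.

```c
#include <stdbool.h>

#define ACTIVATION_THRESHOLD 4	/* value from the patch: 4 IOs */

/* Hypothetical per-region read score. */
struct read_score {
	unsigned int reads;
};

/*
 * Count one read. Returns true exactly once, on the read that makes
 * the counter reach the threshold, so a region is nominated for
 * activation a single time per counting cycle.
 */
static bool read_score_on_read(struct read_score *s)
{
	s->reads++;
	return s->reads == ACTIVATION_THRESHOLD;
}

/* A write (or a deactivation) resets the score: only "clean" reads count. */
static void read_score_reset(struct read_score *s)
{
	s->reads = 0;
}

/*
 * Assumed normalization: halve the score. Returns true when the score
 * has dropped to zero, i.e. the region has exhausted its reads.
 */
static bool read_score_normalize(struct read_score *s)
{
	s->reads >>= 1;
	return s->reads == 0;
}
```

Comparing with `==` rather than `>=` is what keeps the activation from being re-triggered on every subsequent read.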
Re: [PATCH v26 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-12 15:17, Daejun Park wrote:
> This is a patch for managing the L2P map in the HPB module.
>
> The HPB divides logical addresses into several regions. A region
> consists of several sub-regions. The sub-region is the basic unit in
> which L2P mapping is managed. The driver loads the L2P mapping data of
> each sub-region. A loaded sub-region is said to be in the active
> state. The HPB driver unloads L2P mapping data in units of regions.
> An unloaded region is said to be in the inactive state.
>
> Sub-region/region candidates to be loaded and unloaded are delivered
> from the UFS device. The UFS device delivers the recommended active
> sub-region and inactive region to the driver using sense data.
> The HPB module performs L2P mapping management on the host through the
> delivered information.
>
> A pinned region is a pre-set region on the UFS device that is always
> in the active state.
>
> The data structures for the map data request and the L2P map use the
> mempool API, minimizing allocation overhead while avoiding static
> allocation.
>
> The minimum size of the memory pool used in the HPB is implemented
> as a module parameter, so that it can be configured by the user.
>
> To guarantee a minimum memory pool size of 4MB:
> ufshpb_host_map_kbytes=4096
>
> The map_work manages active/inactive by 2 "to-do" lists.
> Each hpb lun maintains 2 "to-do" lists:
> hpb->lh_inact_rgn - regions to be inactivated, and
> hpb->lh_act_srgn - subregions to be activated
> Those lists are maintained on IO completion.
> > Reviewed-by: Bart Van Assche > Reviewed-by: Can Guo > Acked-by: Avri Altman > Tested-by: Bean Huo > Signed-off-by: Daejun Park > --- > drivers/scsi/ufs/ufs.h| 36 ++ > drivers/scsi/ufs/ufshcd.c |4 + > drivers/scsi/ufs/ufshpb.c | 1091 - > drivers/scsi/ufs/ufshpb.h | 65 +++ > 4 files changed, 1181 insertions(+), 15 deletions(-) > > diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h > index 65563635e20e..957763db1006 100644 > --- a/drivers/scsi/ufs/ufs.h > +++ b/drivers/scsi/ufs/ufs.h > @@ -472,6 +472,41 @@ struct utp_cmd_rsp { > u8 sense_data[UFS_SENSE_SIZE]; > }; > > +struct ufshpb_active_field { > +__be16 active_rgn; > +__be16 active_srgn; > +}; > +#define HPB_ACT_FIELD_SIZE 4 > + > +/** > + * struct utp_hpb_rsp - Response UPIU structure > + * @residual_transfer_count: Residual transfer count DW-3 > + * @reserved1: Reserved double words DW-4 to DW-7 > + * @sense_data_len: Sense data length DW-8 U16 > + * @desc_type: Descriptor type of sense data > + * @additional_len: Additional length of sense data > + * @hpb_op: HPB operation type > + * @lun: LUN of response UPIU > + * @active_rgn_cnt: Active region count > + * @inactive_rgn_cnt: Inactive region count > + * @hpb_active_field: Recommended to read HPB region and subregion > + * @hpb_inactive_field: To be inactivated HPB region and subregion > + */ > +struct utp_hpb_rsp { > +__be32 residual_transfer_count; > +__be32 reserved1[4]; > +__be16 sense_data_len; > +u8 desc_type; > +u8 additional_len; > +u8 hpb_op; > +u8 lun; > +u8 active_rgn_cnt; > +u8 inactive_rgn_cnt; > +struct ufshpb_active_field hpb_active_field[2]; > +__be16 hpb_inactive_field[2]; > +}; > +#define UTP_HPB_RSP_SIZE 40 > + > /** > * struct utp_upiu_rsp - general upiu response structure > * @header: UPIU header structure DW-0 to DW-2 > @@ -482,6 +517,7 @@ struct utp_upiu_rsp { > struct utp_upiu_header header; > union { > struct utp_cmd_rsp sr; > +struct utp_hpb_rsp hr; > struct utp_upiu_query qr; > }; > }; > diff --git 
a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 49b3d5d24fa6..5852ff44c3cc 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -5021,6 +5021,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, > struct ufshcd_lrb *lrbp) > */ > pm_runtime_get_noresume(hba->dev); > } > + > +if (scsi_status == SAM_STAT_GOOD) > +ufshpb_rsp_upiu(hba, lrbp); > break; > case UPIU_TRANSACTION_REJECT_UPIU: > /* TODO: handle Reject UPIU Response */ > @@ -9221,6 +9224,7 @@ EXPORT_SYMBOL(ufshcd_shutdown); > void ufshcd_remove(struct ufs_hba *hba) > { > ufs_bsg_remove(hba); > +ufshpb_remove(hba); >
Re: [PATCH v5 07/10] scsi: ufshpb: Add "Cold" regions timer
On 2021-03-02 21:25, Avri Altman wrote:
> In order not to hang on to "cold" regions, we shall inactivate a
> region that has no READ access for a predefined amount of time,
> READ_TO_MS. For that purpose we shall monitor the active regions list,
> polling it every POLLING_INTERVAL_MS. On timeout expiry we shall add
> the region to the "to-be-inactivated" list, unless it is clean and did
> not exhaust its READ_TO_EXPIRIES, another parameter.
>
> All this does not apply to pinned regions.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 65 +++
>  drivers/scsi/ufs/ufshpb.h | 6
>  2 files changed, 71 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 0034fa03fdc6..89a930e72cff 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -18,6 +18,9 @@
>
>  #define ACTIVATION_THRESHOLD 4 /* 4 IOs */
>  #define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6) /* 256 IOs */
> +#define READ_TO_MS 1000
> +#define READ_TO_EXPIRIES 100
> +#define POLLING_INTERVAL_MS 200
>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -1024,12 +1027,61 @@ static int ufshpb_check_srgns_issue_state(struct ufshpb_lu *hpb,
>  	return 0;
>  }
>
> +static void ufshpb_read_to_handler(struct work_struct *work)
> +{
> +	struct delayed_work *dwork = to_delayed_work(work);
> +	struct ufshpb_lu *hpb;

	struct ufshpb_lu *hpb = container_of(work, struct ufshpb_lu,
					     ufshpb_read_to_work.work);

Usually we use this to get the data of a delayed work.

> +	struct victim_select_info *lru_info;

	struct victim_select_info *lru_info = &hpb->lru_info;

This can save some lines.

Thanks,
Can Guo.

> +	struct ufshpb_region *rgn;
> +	unsigned long flags;
> +	LIST_HEAD(expired_list);
> +
> +	spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> +
> +	list_for_each_entry(rgn, &lru_info->lh_lru_rgn, list_lru_rgn) {
> +		bool timedout = ktime_after(ktime_get(), rgn->read_timeout);
> +
> +		if (timedout) {
> +			rgn->read_timeout_expiries--;
> +			if (is_rgn_dirty(rgn) ||
> +			    rgn->read_timeout_expiries == 0)
> +				list_add(&rgn->list_expired_rgn, &expired_list);
> +			else
> +				rgn->read_timeout = ktime_add_ms(ktime_get(),
> +								 READ_TO_MS);
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> +
> +	list_for_each_entry(rgn, &expired_list, list_expired_rgn) {
> +		list_del_init(&rgn->list_expired_rgn);
> +		spin_lock_irqsave(&hpb->rsp_list_lock, flags);
> +		ufshpb_update_inactive_info(hpb, rgn->rgn_idx);
> +		hpb->stats.rb_inactive_cnt++;
> +		spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
> +	}
> +
> +	ufshpb_kick_map_work(hpb);
> +
> +	schedule_delayed_work(&hpb->ufshpb_read_to_work,
> +			      msecs_to_jiffies(POLLING_INTERVAL_MS));
> +}
> +
>  static void ufshpb_add_lru_info(struct victim_select_info *lru_info,
>  				struct ufshpb_region *rgn)
>  {
>  	rgn->rgn_state = HPB_RGN_ACTIVE;
>  	list_add_tail(&rgn->list_lru_rgn, &lru_info->lh_lru_rgn);
>  	atomic_inc(&lru_info->active_cnt);
> +	if (rgn->hpb->is_hcm) {
> +		rgn->read_timeout = ktime_add_ms(ktime_get(), READ_TO_MS);
> +		rgn->read_timeout_expiries = READ_TO_EXPIRIES;
> +	}
>  }
>
>  static void ufshpb_hit_lru_info(struct victim_select_info *lru_info,
> @@ -1813,6 +1865,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>
>  		INIT_LIST_HEAD(&rgn->list_inact_rgn);
>  		INIT_LIST_HEAD(&rgn->list_lru_rgn);
> +		INIT_LIST_HEAD(&rgn->list_expired_rgn);
>
>  		if (rgn_idx == hpb->rgns_per_lu - 1) {
>  			srgn_cnt = ((hpb->srgns_per_lu - 1) %
> @@ -1834,6 +1887,7 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>  		}
>
>  		rgn->rgn_flags = 0;
> +		rgn->hpb = hpb;
>  	}
>
>  	return 0;
> @@ -2053,6 +2107,8 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>  			  ufshpb_normalization_work_handler);
>  		INIT_WORK(&hpb->ufshpb_lun_reset_work,
>  			  ufshpb_reset_work_handler);
> +		INIT_DELAYED_WORK(&hpb->ufshpb_read_to_work,
> +				  ufshpb_read_to_handler);
>  	}
>
>  	hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache",
> @@ -2087,6 +2143,10 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb)
>  	ufshpb_stat_init(hpb);
>  	ufshpb_param_init(hpb);
>
> +	if (hpb->is_hcm)
> +		schedule_delayed_work(&hpb->ufshpb_read_to_work,
> +
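The per-region decision made on each polling pass can be isolated into a small userspace sketch. It is only an illustration: `cold_region` and its helpers are hypothetical names, time is passed in as a plain millisecond counter rather than read from a kernel clock, and list handling and locking are left out. The constants and the decision rule (expire when dirty or when the expiry budget is used up, otherwise re-arm) follow the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define READ_TO_MS		1000	/* values from the patch */
#define READ_TO_EXPIRIES	100

/* Hypothetical per-region timer state. */
struct cold_region {
	uint64_t read_timeout_ms;	/* absolute deadline */
	int expiries_left;
	bool dirty;
};

/* Arm the timer when a region becomes active. */
static void cold_region_activate(struct cold_region *r, uint64_t now_ms)
{
	r->read_timeout_ms = now_ms + READ_TO_MS;
	r->expiries_left = READ_TO_EXPIRIES;
}

/*
 * One polling pass over a single region. Returns true when the region
 * should be queued for inactivation: its deadline has passed and it is
 * either dirty or out of expiries. A clean region with expiries left
 * gets its deadline re-armed instead.
 */
static bool cold_region_poll(struct cold_region *r, uint64_t now_ms)
{
	if (now_ms <= r->read_timeout_ms)
		return false;	/* not timed out yet */

	r->expiries_left--;
	if (r->dirty || r->expiries_left == 0)
		return true;

	r->read_timeout_ms = now_ms + READ_TO_MS;
	return false;
}
```

With these values a clean, unread region survives roughly READ_TO_MS times READ_TO_EXPIRIES milliseconds before it is dropped, while a dirty one goes at the first expiry.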
Re: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response
On 2021-03-02 21:24, Avri Altman wrote: The spec does not define what is the host's recommended response when the device send hpb dev reset response (oper 0x2). We will update all active hpb regions: mark them and do that on the next read. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 47 --- drivers/scsi/ufs/ufshpb.h | 2 ++ 2 files changed, 46 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 0744feb4d484..0034fa03fdc6 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) if (rgn->reads == ACTIVATION_THRESHOLD) activate = true; spin_unlock_irqrestore(>rgn_lock, flags); - if (activate) { + if (activate || + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { spin_lock_irqsave(>rsp_list_lock, flags); ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); hpb->stats.rb_active_cnt++; @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case HPB_RSP_DEV_RESET: dev_warn(>sdev_ufs_lu->sdev_dev, "UFS device lost HPB information during PM.\n"); + + if (hpb->is_hcm) { + struct scsi_device *sdev; bool need_reset = false; + + __shost_for_each_device(sdev, hba->host) { + struct ufshpb_lu *h = sdev->hostdata; + + if (!h) + continue; + + need_reset = true; + } if (need_reset) schedule_work(>ufshpb_lun_reset_work); At last, scheduling only one reset work shall be enough, otherwise multiple reset work can be flying in parallel, so maybe above changes? 
+ } + break; default: dev_notice(>sdev_ufs_lu->sdev_dev, @@ -1594,6 +1609,25 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) spin_unlock_irqrestore(>rsp_list_lock, flags); } +static void ufshpb_reset_work_handler(struct work_struct *work) +{ + struct ufshpb_lu *hpb; struct ufshpb_lu *hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); + struct victim_select_info *lru_info; struct victim_select_info *lru_info = >lru_info; This can save some lines. Thanks, Can Guo. + struct ufshpb_region *rgn; + unsigned long flags; + + spin_lock_irqsave(>rgn_state_lock, flags); + + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) + set_bit(RGN_FLAG_UPDATE, >rgn_flags); + + spin_unlock_irqrestore(>rgn_state_lock, flags); +} + static void ufshpb_normalization_work_handler(struct work_struct *work) { struct ufshpb_lu *hpb; @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) } else { rgn->rgn_state = HPB_RGN_INACTIVE; } + + rgn->rgn_flags = 0; } return 0; @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_hpb_lu); INIT_WORK(>map_work, ufshpb_map_work_handler); - if (hpb->is_hcm) + if (hpb->is_hcm) { INIT_WORK(>ufshpb_normalization_work, ufshpb_normalization_work_handler); + INIT_WORK(>ufshpb_lun_reset_work, + ufshpb_reset_work_handler); + } hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", sizeof(struct ufshpb_req), 0, 0, NULL); @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb) static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) { - if (hpb->is_hcm) + if (hpb->is_hcm) { + cancel_work_sync(>ufshpb_lun_reset_work); cancel_work_sync(>ufshpb_normalization_work); + } cancel_work_sync(>map_work); } diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h index 84598a317897..37c1b0ea0c0a 100644 --- a/drivers/scsi/ufs/ufshpb.h +++ b/drivers/scsi/ufs/ufshpb.h @@ -121,6 +121,7 @@ 
struct ufshpb_region { struct list_head list_lru_rgn; unsigned long rgn_flags; #define RGN_FLAG_DIRTY 0 +#define RGN_FLAG_UPDATE 1 /* region reads - for host mode */ spinlock_t rgn_lock; @@ -217,6 +218,7 @@ struct ufshpb_lu { /* for selecting victim */
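The "mark now, act on the next read" scheme from this patch relies on a test-and-clear of a per-region flag. The sketch below shows that idea in portable C11, as an assumption-laden stand-in rather than kernel code: `flagged_region`, `mark_all_for_update()` and `consume_update()` are hypothetical names, and C11 atomics replace the kernel's `set_bit()` and `test_and_clear_bit()`. Only the bit position of the update flag is taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_UPDATE (1UL << 1)	/* bit 1, like RGN_FLAG_UPDATE in the patch */

/* Hypothetical region carrying only its flag word. */
struct flagged_region {
	atomic_ulong flags;
};

/* Reset-work side: mark every active region as needing an update. */
static void mark_all_for_update(struct flagged_region *rgns, size_t n)
{
	for (size_t i = 0; i < n; i++)
		atomic_fetch_or(&rgns[i].flags, FLAG_UPDATE);
}

/*
 * Read-path side: atomically clear the flag and report whether it was
 * set. Only one reader sees true per mark, so the update is queued once.
 */
static bool consume_update(struct flagged_region *rgn)
{
	unsigned long old = atomic_fetch_and(&rgn->flags, ~FLAG_UPDATE);

	return (old & FLAG_UPDATE) != 0;
}
```

Because the clear and the test happen in one atomic step, two concurrent readers of the same region cannot both observe the flag as set.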
Re: [PATCH v26 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-12 09:48, Daejun Park wrote:
> This is a patch for managing the L2P map in the HPB module.
>
> The HPB divides logical addresses into several regions. A region
> consists of several sub-regions. The sub-region is the basic unit in
> which L2P mapping is managed. The driver loads the L2P mapping data of
> each sub-region. A loaded sub-region is said to be in the active
> state. The HPB driver unloads L2P mapping data in units of regions.
> An unloaded region is said to be in the inactive state.
>
> Sub-region/region candidates to be loaded and unloaded are delivered
> from the UFS device. The UFS device delivers the recommended active
> sub-region and inactive region to the driver using sense data.
> The HPB module performs L2P mapping management on the host through the
> delivered information.
>
> A pinned region is a pre-set region on the UFS device that is always
> in the active state.
>
> The data structures for the map data request and the L2P map use the
> mempool API, minimizing allocation overhead while avoiding static
> allocation.
>
> The minimum size of the memory pool used in the HPB is implemented
> as a module parameter, so that it can be configured by the user.
>
> To guarantee a minimum memory pool size of 4MB:
> ufshpb_host_map_kbytes=4096
>
> The map_work manages active/inactive by 2 "to-do" lists.
> Each hpb lun maintains 2 "to-do" lists:
> hpb->lh_inact_rgn - regions to be inactivated, and
> hpb->lh_act_srgn - subregions to be activated
> Those lists are maintained on IO completion.
> > Reviewed-by: Bart Van Assche > Reviewed-by: Can Guo > Acked-by: Avri Altman > Tested-by: Bean Huo > Signed-off-by: Daejun Park > --- > drivers/scsi/ufs/ufs.h| 36 ++ > drivers/scsi/ufs/ufshcd.c |4 + > drivers/scsi/ufs/ufshpb.c | 1091 - > drivers/scsi/ufs/ufshpb.h | 65 +++ > 4 files changed, 1181 insertions(+), 15 deletions(-) > > diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h > index 65563635e20e..957763db1006 100644 > --- a/drivers/scsi/ufs/ufs.h > +++ b/drivers/scsi/ufs/ufs.h > @@ -472,6 +472,41 @@ struct utp_cmd_rsp { > u8 sense_data[UFS_SENSE_SIZE]; > }; > > +struct ufshpb_active_field { > +__be16 active_rgn; > +__be16 active_srgn; > +}; > +#define HPB_ACT_FIELD_SIZE 4 > + > +/** > + * struct utp_hpb_rsp - Response UPIU structure > + * @residual_transfer_count: Residual transfer count DW-3 > + * @reserved1: Reserved double words DW-4 to DW-7 > + * @sense_data_len: Sense data length DW-8 U16 > + * @desc_type: Descriptor type of sense data > + * @additional_len: Additional length of sense data > + * @hpb_op: HPB operation type > + * @lun: LUN of response UPIU > + * @active_rgn_cnt: Active region count > + * @inactive_rgn_cnt: Inactive region count > + * @hpb_active_field: Recommended to read HPB region and subregion > + * @hpb_inactive_field: To be inactivated HPB region and subregion > + */ > +struct utp_hpb_rsp { > +__be32 residual_transfer_count; > +__be32 reserved1[4]; > +__be16 sense_data_len; > +u8 desc_type; > +u8 additional_len; > +u8 hpb_op; > +u8 lun; > +u8 active_rgn_cnt; > +u8 inactive_rgn_cnt; > +struct ufshpb_active_field hpb_active_field[2]; > +__be16 hpb_inactive_field[2]; > +}; > +#define UTP_HPB_RSP_SIZE 40 > + > /** > * struct utp_upiu_rsp - general upiu response structure > * @header: UPIU header structure DW-0 to DW-2 > @@ -482,6 +517,7 @@ struct utp_upiu_rsp { > struct utp_upiu_header header; > union { > struct utp_cmd_rsp sr; > +struct utp_hpb_rsp hr; > struct utp_upiu_query qr; > }; > }; > diff --git 
a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 49b3d5d24fa6..5852ff44c3cc 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -5021,6 +5021,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, > struct ufshcd_lrb *lrbp) > */ > pm_runtime_get_noresume(hba->dev); > } > + > +if (scsi_status == SAM_STAT_GOOD) > +ufshpb_rsp_upiu(hba, lrbp); > break; > case UPIU_TRANSACTION_REJECT_UPIU: > /* TODO: handle Reject UPIU Response */ > @@ -9221,6 +9224,7 @@ EXPORT_SYMBOL(ufshcd_shutdown); > void ufshcd_remove(struct ufs_hba *hba) > { > ufs_bsg_remove(hba); > +ufshpb_remove(hba); >
Re: [PATCH v11 2/2] ufs: sysfs: Resume the proper scsi device
On 2021-03-12 06:19, Asutosh Das wrote: Resumes the actual scsi device the unit descriptor of which is being accessed instead of the hba alone. Signed-off-by: Asutosh Das You lost my reviewed-by: reviewed-by: Can Guo --- drivers/scsi/ufs/ufs-sysfs.c | 30 +- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index acc54f5..3fc182b 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -245,9 +245,9 @@ static ssize_t wb_on_store(struct device *dev, struct device_attribute *attr, goto out; } - pm_runtime_get_sync(hba->dev); + scsi_autopm_get_device(hba->sdev_ufs_device); res = ufshcd_wb_ctrl(hba, wb_enable); - pm_runtime_put_sync(hba->dev); + scsi_autopm_put_device(hba->sdev_ufs_device); out: up(>host_sem); return res < 0 ? res : count; @@ -297,10 +297,10 @@ static ssize_t ufs_sysfs_read_desc_param(struct ufs_hba *hba, goto out; } - pm_runtime_get_sync(hba->dev); + scsi_autopm_get_device(hba->sdev_ufs_device); ret = ufshcd_read_desc_param(hba, desc_id, desc_index, param_offset, desc_buf, param_size); - pm_runtime_put_sync(hba->dev); + scsi_autopm_put_device(hba->sdev_ufs_device); if (ret) { ret = -EINVAL; goto out; @@ -678,7 +678,7 @@ static ssize_t _name##_show(struct device *dev,\ up(>host_sem); \ return -ENOMEM; \ } \ - pm_runtime_get_sync(hba->dev); \ + scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_descriptor_retry(hba,\ UPIU_QUERY_OPCODE_READ_DESC, QUERY_DESC_IDN_DEVICE, \ 0, 0, desc_buf, _len); \ @@ -695,7 +695,7 @@ static ssize_t _name##_show(struct device *dev,\ goto out; \ ret = sysfs_emit(buf, "%s\n", desc_buf); \ out: \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ kfree(desc_buf);\ up(>host_sem); \ return ret; \ @@ -744,10 +744,10 @@ static ssize_t _name##_show(struct device *dev,\ } \ if (ufshcd_is_wb_flags(QUERY_FLAG_IDN##_uname)) \ index = ufshcd_wb_get_query_index(hba); \ - 
pm_runtime_get_sync(hba->dev); \ + scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG, \ QUERY_FLAG_IDN##_uname, index, ); \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ if (ret) { \ ret = -EINVAL; \ goto out; \ @@ -813,10 +813,10 @@ static ssize_t _name##_show(struct device *dev,\ } \ if (ufshcd_is_wb_attrs(QUERY_ATTR_IDN##_uname)) \ index = ufshcd_wb_get_query_index(hba); \ - pm_runtime_get_sync(hba->dev); \ + scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, \ QUERY_ATTR_IDN##_uname, index, 0, ); \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ if (ret) { \ ret = -EINVAL; \ goto out; \ @@ -899,11 +899,15 @@ static ssize_t _pname##_show(struct device *dev, \ struct scsi_device *sdev = to_scsi_
Re: [PATCH v26 3/4] scsi: ufs: Prepare HPB read for cached sub-region
On 2021-03-03 14:28, Daejun Park wrote: This patch changes the read I/O to the HPB read I/O. If the logical address of the read I/O belongs to active sub-region, the HPB driver modifies the read I/O command to HPB read. It modifies the UPIU command of UFS instead of modifying the existing SCSI command. In the HPB version 1.0, the maximum read I/O size that can be converted to HPB read is 4KB. The dirty map of the active sub-region prevents an incorrect HPB read that has stale physical page number which is updated by previous write I/O. Reviewed-by: Can Guo Reviewed-by: Bart Van Assche Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufshcd.c | 2 + drivers/scsi/ufs/ufshpb.c | 253 +- drivers/scsi/ufs/ufshpb.h | 2 + 3 files changed, 254 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 5852ff44c3cc..851c01a26207 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2656,6 +2656,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) lrbp->req_abort_skip = false; + ufshpb_prep(hba, lrbp); + ufshcd_comp_scsi_upiu(hba, lrbp); err = ufshcd_map_sg(hba, lrbp); diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 8abadb0e010a..c75a6816a03f 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state) atomic_set(>hpb_state, state); } +static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn, + struct ufshpb_subregion *srgn) +{ + return rgn->rgn_state != HPB_RGN_INACTIVE && + srgn->srgn_state == HPB_SRGN_VALID; +} + +static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd) +{ + return req_op(cmd->request) == REQ_OP_READ; +} + +static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd) +{ + return op_is_write(req_op(cmd->request)) || + op_is_discard(req_op(cmd->request)); +} + +static bool ufshpb_is_support_chunk(int 
transfer_len) +{ + return transfer_len <= HPB_MULTI_CHUNK_HIGH; +} + static bool ufshpb_is_general_lun(int lun) { return lun < UFS_UPIU_MAX_UNIT_NUM_ID; @@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb) } static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba, -struct ufshcd_lrb *lrbp, -struct utp_hpb_rsp *rsp_field) + struct ufshcd_lrb *lrbp, + struct utp_hpb_rsp *rsp_field) { /* Check HPB_UPDATE_ALERT */ if (!(lrbp->ucd_rsp_ptr->header.dword_2 & @@ -107,6 +130,230 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba, return true; } +static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx, +int srgn_idx, int srgn_offset, int cnt) +{ + struct ufshpb_region *rgn; + struct ufshpb_subregion *srgn; + int set_bit_len; + int bitmap_len; + +next_srgn: + rgn = hpb->rgn_tbl + rgn_idx; + srgn = rgn->srgn_tbl + srgn_idx; + + if (likely(!srgn->is_last)) + bitmap_len = hpb->entries_per_srgn; + else + bitmap_len = hpb->last_srgn_entries; + + if ((srgn_offset + cnt) > bitmap_len) + set_bit_len = bitmap_len - srgn_offset; + else + set_bit_len = cnt; + + if (rgn->rgn_state != HPB_RGN_INACTIVE && + srgn->srgn_state == HPB_SRGN_VALID) + bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len); + + srgn_offset = 0; + if (++srgn_idx == hpb->srgns_per_rgn) { + srgn_idx = 0; + rgn_idx++; + } + + cnt -= set_bit_len; + if (cnt > 0) + goto next_srgn; +} + +static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx, + int srgn_idx, int srgn_offset, int cnt) +{ + struct ufshpb_region *rgn; + struct ufshpb_subregion *srgn; + int bitmap_len; + int bit_len; + +next_srgn: + rgn = hpb->rgn_tbl + rgn_idx; + srgn = rgn->srgn_tbl + srgn_idx; + + if (likely(!srgn->is_last)) + bitmap_len = hpb->entries_per_srgn; + else + bitmap_len = hpb->last_srgn_entries; + + if (!ufshpb_is_valid_srgn(rgn, srgn)) + return true; + + /* +* If the region state is active, mctx must be allocated. 
+* In this case, check whether the region is evicted or +* mctx allcation fail. +*/ + if (unlikely(!srgn->mctx)) { + dev_err(>sdev_ufs_lu->sdev
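For readers following the dirty-bitmap helpers quoted above: the walk in ufshpb_set_ppn_dirty() splits one (offset, count) range into per-sub-region chunks and rolls over to the next region when the sub-region index wraps. The following is only a user-space sketch of that splitting, not the driver code. `struct span` and `split_span()` are hypothetical names, and the sketch assumes every sub-region holds `entries_per_srgn` entries (the patch additionally handles a shorter last sub-region via `last_srgn_entries`).

```c
#include <stddef.h>

/* Hypothetical record of one chunk that falls inside a single sub-region. */
struct span {
	int rgn_idx;
	int srgn_idx;
	int offset;
	int len;
};

/*
 * Split the range [srgn_offset, srgn_offset + cnt) into per-sub-region
 * chunks, mirroring the loop structure of the quoted helper.
 * Assumption: all sub-regions have entries_per_srgn entries.
 * Returns the number of chunks written to out (at most max_out).
 */
static size_t split_span(int rgn_idx, int srgn_idx, int srgn_offset, int cnt,
			 int entries_per_srgn, int srgns_per_rgn,
			 struct span *out, size_t max_out)
{
	size_t n = 0;

	while (cnt > 0 && n < max_out) {
		int len = cnt;

		/* Clamp the chunk at the end of the current sub-region. */
		if (srgn_offset + cnt > entries_per_srgn)
			len = entries_per_srgn - srgn_offset;

		out[n].rgn_idx = rgn_idx;
		out[n].srgn_idx = srgn_idx;
		out[n].offset = srgn_offset;
		out[n].len = len;
		n++;

		/* Continue at the start of the next sub-region. */
		srgn_offset = 0;
		if (++srgn_idx == srgns_per_rgn) {
			srgn_idx = 0;
			rgn_idx++;
		}
		cnt -= len;
	}
	return n;
}
```

In the driver each chunk would correspond to one bitmap_set() call on the sub-region's ppn_dirty map; here the chunks are only recorded so the arithmetic can be inspected.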
Re: [PATCH v26 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-03 14:28, Daejun Park wrote: This is a patch for managing L2P map in HPB module. The HPB divides logical addresses into several regions. A region consists of several sub-regions. The sub-region is a basic unit where L2P mapping is managed. The driver loads L2P mapping data of each sub-region. The loaded sub-region is called active-state. The HPB driver unloads L2P mapping data as region unit. The unloaded region is called inactive-state. Sub-region/region candidates to be loaded and unloaded are delivered from the UFS device. The UFS device delivers the recommended active sub-region and inactivate region to the driver using sensedata. The HPB module performs L2P mapping management on the host through the delivered information. A pinned region is a pre-set regions on the UFS device that is always activate-state. The data structure for map data request and L2P map uses mempool API, minimizing allocation overhead while avoiding static allocation. The mininum size of the memory pool used in the HPB is implemented as a module parameter, so that it can be configurable by the user. To gurantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096 The map_work manages active/inactive by 2 "to-do" lists. Each hpb lun maintains 2 "to-do" lists: hpb->lh_inact_rgn - regions to be inactivated, and hpb->lh_act_srgn - subregions to be activated Those lists are maintained on IO completion. Reviewed-by: Bart Van Assche Reviewed-by: Can Guo Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h| 36 ++ drivers/scsi/ufs/ufshcd.c |4 + drivers/scsi/ufs/ufshpb.c | 1091 - drivers/scsi/ufs/ufshpb.h | 65 +++ 4 files changed, 1181 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 65563635e20e..957763db1006 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -472,6 +472,41 @@ struct utp_cmd_rsp { u8 sense_data[UFS_SENSE_SIZE]; }; ... 
+/*
+ * This function will parse recommended active subregion information in sense
+ * data field of response UPIU with SAM_STAT_GOOD state.
+ */
+void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
+{
+	struct ufshpb_lu *hpb = ufshpb_get_hpb_data(lrbp->cmd->device);
+	struct utp_hpb_rsp *rsp_field = &lrbp->ucd_rsp_ptr->hr;
+	int data_seg_len;
+
+	if (unlikely(lrbp->lun != rsp_field->lun)) {
+		struct scsi_device *sdev;
+		bool found = false;
+
+		__shost_for_each_device(sdev, hba->host) {
+			hpb = ufshpb_get_hpb_data(sdev);
+
+			if (!hpb)
+				continue;
+
+			if (rsp_field->lun == hpb->lun) {
+				found = true;
+				break;
+			}
+		}
+
+		if (!found)
+			return;
+	}
+
+	if (!hpb)
+		return;
+
+	if ((ufshpb_get_state(hpb) != HPB_PRESENT) &&
+	    (ufshpb_get_state(hpb) != HPB_SUSPEND)) {
+		dev_notice(&hpb->sdev_ufs_lu->sdev_dev,
+			   "%s: ufshpb state is not PRESENT/SUSPEND\n",
+			   __func__);

Please mute these prints until hpb is fully initialized; otherwise there
can be tons of these prints during bootup. For example, set a flag in
ufshpb_hpb_lu_prepared() and check that flag. This is just a rough idea.

Thanks,
Can Guo.

+		return;
+	}
+
Re: [PATCH v26 2/4] scsi: ufs: L2P map management for HPB read
On 2021-03-03 14:28, Daejun Park wrote: This is a patch for managing L2P map in HPB module. The HPB divides logical addresses into several regions. A region consists of several sub-regions. The sub-region is a basic unit where L2P mapping is managed. The driver loads L2P mapping data of each sub-region. The loaded sub-region is called active-state. The HPB driver unloads L2P mapping data as region unit. The unloaded region is called inactive-state. Sub-region/region candidates to be loaded and unloaded are delivered from the UFS device. The UFS device delivers the recommended active sub-region and inactivate region to the driver using sensedata. The HPB module performs L2P mapping management on the host through the delivered information. A pinned region is a pre-set regions on the UFS device that is always activate-state. The data structure for map data request and L2P map uses mempool API, minimizing allocation overhead while avoiding static allocation. The mininum size of the memory pool used in the HPB is implemented as a module parameter, so that it can be configurable by the user. To gurantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096 The map_work manages active/inactive by 2 "to-do" lists. Each hpb lun maintains 2 "to-do" lists: hpb->lh_inact_rgn - regions to be inactivated, and hpb->lh_act_srgn - subregions to be activated Those lists are maintained on IO completion. 
Reviewed-by: Bart Van Assche Reviewed-by: Can Guo Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h| 36 ++ drivers/scsi/ufs/ufshcd.c |4 + drivers/scsi/ufs/ufshpb.c | 1091 - drivers/scsi/ufs/ufshpb.h | 65 +++ 4 files changed, 1181 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 65563635e20e..957763db1006 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -472,6 +472,41 @@ struct utp_cmd_rsp { u8 sense_data[UFS_SENSE_SIZE]; }; +struct ufshpb_active_field { + __be16 active_rgn; + __be16 active_srgn; +}; +#define HPB_ACT_FIELD_SIZE 4 + +/** + * struct utp_hpb_rsp - Response UPIU structure + * @residual_transfer_count: Residual transfer count DW-3 + * @reserved1: Reserved double words DW-4 to DW-7 + * @sense_data_len: Sense data length DW-8 U16 + * @desc_type: Descriptor type of sense data + * @additional_len: Additional length of sense data + * @hpb_op: HPB operation type + * @lun: LUN of response UPIU + * @active_rgn_cnt: Active region count + * @inactive_rgn_cnt: Inactive region count + * @hpb_active_field: Recommended to read HPB region and subregion + * @hpb_inactive_field: To be inactivated HPB region and subregion + */ +struct utp_hpb_rsp { + __be32 residual_transfer_count; + __be32 reserved1[4]; + __be16 sense_data_len; + u8 desc_type; + u8 additional_len; + u8 hpb_op; + u8 lun; + u8 active_rgn_cnt; + u8 inactive_rgn_cnt; + struct ufshpb_active_field hpb_active_field[2]; + __be16 hpb_inactive_field[2]; +}; +#define UTP_HPB_RSP_SIZE 40 + /** * struct utp_upiu_rsp - general upiu response structure * @header: UPIU header structure DW-0 to DW-2 @@ -482,6 +517,7 @@ struct utp_upiu_rsp { struct utp_upiu_header header; union { struct utp_cmd_rsp sr; + struct utp_hpb_rsp hr; struct utp_upiu_query qr; }; }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 49b3d5d24fa6..5852ff44c3cc 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ 
b/drivers/scsi/ufs/ufshcd.c @@ -5021,6 +5021,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) */ pm_runtime_get_noresume(hba->dev); } + + if (scsi_status == SAM_STAT_GOOD) + ufshpb_rsp_upiu(hba, lrbp); break; case UPIU_TRANSACTION_REJECT_UPIU: /* TODO: handle Reject UPIU Response */ @@ -9221,6 +9224,7 @@ EXPORT_SYMBOL(ufshcd_shutdown); void ufshcd_remove(struct ufs_hba *hba) { ufs_bsg_remove(hba); + ufshpb_remove(hba); ufs_sysfs_remove_nodes(hba->dev); blk_cleanup_queue(hba->tmf_queue); blk_mq_free_tag_set(>tmf_tag_set); diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 1a72f6541510..8abadb0e010a 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -16,6 +16,16 @@ #include "ufshpb.h" #include "../sd.h" +/* memory management */ +static struct kmem_cache *ufshpb_mctx_cache; +static mempool_t *ufshpb_mctx_pool; +static mempool_t *ufshpb_page_pool; +/* A cache size of 2MB can cache ppn in the 1GB range. */ +static unsigned int ufshpb_host_map_kbytes =
Re: [PATCH v5 05/10] scsi: ufshpb: Region inactivation in host mode
On 2021-03-02 21:24, Avri Altman wrote:
> I host mode, the host is expected to send HPB-WRITE-BUFFER with

In host mode,

> buffer-id = 0x1 when it inactivates a region.
>
> Use the map-requests pool as there is no point in assigning a
> designated cache for umap-requests.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 14 ++
>  drivers/scsi/ufs/ufshpb.h | 1 +
>  2 files changed, 15 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 6f4fd22eaf2f..0744feb4d484 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -907,6 +907,7 @@ static int ufshpb_execute_umap_req(struct ufshpb_lu *hpb,
>
>  	blk_execute_rq_nowait(q, NULL, req, 1, ufshpb_umap_req_compl_fn);
>
> +	hpb->stats.umap_req_cnt++;
>  	return 0;
>  }
>
> @@ -1103,6 +1104,12 @@ static int ufshpb_issue_umap_req(struct ufshpb_lu *hpb,
>  	return -EAGAIN;
>  }
>
> +static int ufshpb_issue_umap_single_req(struct ufshpb_lu *hpb,
> +					struct ufshpb_region *rgn)
> +{
> +	return ufshpb_issue_umap_req(hpb, rgn);
> +}
> +
>  static int ufshpb_issue_umap_all_req(struct ufshpb_lu *hpb)
>  {
>  	return ufshpb_issue_umap_req(hpb, NULL);
> @@ -1115,6 +1122,10 @@ static void __ufshpb_evict_region(struct ufshpb_lu *hpb,
>  	struct ufshpb_subregion *srgn;
>  	int srgn_idx;
>
> +

No need for this blank line.

Regards,
Can Guo.

> +	if (hpb->is_hcm && ufshpb_issue_umap_single_req(hpb, rgn))
> +		return;
> +
>  	lru_info = &hpb->lru_info;
>
>  	dev_dbg(&hpb->sdev_ufs_lu->sdev_dev, "evict region %d\n", rgn->rgn_idx);
> @@ -1855,6 +1866,7 @@ ufshpb_sysfs_attr_show_func(rb_noti_cnt);
>  ufshpb_sysfs_attr_show_func(rb_active_cnt);
>  ufshpb_sysfs_attr_show_func(rb_inactive_cnt);
>  ufshpb_sysfs_attr_show_func(map_req_cnt);
> +ufshpb_sysfs_attr_show_func(umap_req_cnt);
>
>  static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_hit_cnt.attr,
> @@ -1863,6 +1875,7 @@ static struct attribute *hpb_dev_stat_attrs[] = {
>  	&dev_attr_rb_active_cnt.attr,
>  	&dev_attr_rb_inactive_cnt.attr,
>  	&dev_attr_map_req_cnt.attr,
> +	&dev_attr_umap_req_cnt.attr,
>  	NULL,
>  };
>
> @@ -1978,6 +1991,7 @@ static void ufshpb_stat_init(struct ufshpb_lu *hpb)
>  	hpb->stats.rb_active_cnt = 0;
>  	hpb->stats.rb_inactive_cnt = 0;
>  	hpb->stats.map_req_cnt = 0;
> +	hpb->stats.umap_req_cnt = 0;
>  }
>
>  static void ufshpb_param_init(struct ufshpb_lu *hpb)
> diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> index bd4308010466..84598a317897 100644
> --- a/drivers/scsi/ufs/ufshpb.h
> +++ b/drivers/scsi/ufs/ufshpb.h
> @@ -186,6 +186,7 @@ struct ufshpb_stats {
>  	u64 rb_inactive_cnt;
>  	u64 map_req_cnt;
>  	u64 pre_req_cnt;
> +	u64 umap_req_cnt;
>  };
>
>  struct ufshpb_lu {
Re: [PATCH v5 03/10] scsi: ufshpb: Add region's reads counter
On 2021-03-11 16:04, Avri Altman wrote:
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 044fec9854a0..a8f8d13af21a 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -16,6 +16,8 @@
> #include "ufshpb.h"
> #include "../sd.h"
>
> +#define ACTIVATION_THRESHOLD 4 /* 4 IOs */

Can this param be added as a sysfs entry?

Yes. Daejun asked me that as well, so the last patch makes all logic
parameters configurable.

Thanks,
Avri

Ok, thanks. I haven't reached the last one yet; I am absorbing them one by one.

Can Guo.

Thanks,
Can Guo
Re: [PATCH v5 04/10] scsi: ufshpb: Make eviction depends on region's reads
Hi Avri,

On 2021-03-02 21:24, Avri Altman wrote:
> In host mode, eviction is considered an extreme measure.
> Verify that the entering region has enough reads, and that the exiting
> region has far fewer reads.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index a8f8d13af21a..6f4fd22eaf2f 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -17,6 +17,7 @@
>  #include "../sd.h"
>
>  #define ACTIVATION_THRESHOLD 4 /* 4 IOs */
> +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6) /* 256 IOs */

Same here, can this be added as a sysfs entry?

Thanks,
Can Guo.

>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -1050,6 +1051,13 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
>  		if (ufshpb_check_srgns_issue_state(hpb, rgn))
>  			continue;
>
> +		/*
> +		 * in host control mode, verify that the exiting region
> +		 * has less reads
> +		 */
> +		if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> +			continue;
> +
>  		victim_rgn = rgn;
>  		break;
>  	}
> @@ -1235,7 +1243,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu *hpb,
>
>  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  {
> -	struct ufshpb_region *victim_rgn;
> +	struct ufshpb_region *victim_rgn = NULL;
>  	struct victim_select_info *lru_info = &hpb->lru_info;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -1263,6 +1271,16 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  			 * because the device could detect this region
>  			 * by not issuing HPB_READ
>  			 */
> +
> +			/*
> +			 * in host control mode, verify that the entering
> +			 * region has enough reads
> +			 */
> +			if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> +				ret = -EACCES;
> +				goto out;
> +			}
> +
>  			victim_rgn = ufshpb_victim_lru_info(hpb);
>  			if (!victim_rgn) {
>  				dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
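To make the two thresholds in this patch easier to reason about, here is a small user-space sketch of the eviction policy as described in the commit message and the quoted hunks. It is a simplification under stated assumptions, not the driver code: `struct region_model`, `may_enter()` and `pick_victim()` are hypothetical names, the LRU list is modelled as a plain array ordered from least to most recently used, host control mode is assumed to be on, and the ufshpb_check_srgns_issue_state() check is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define ACTIVATION_THRESHOLD 4                         /* 4 IOs */
#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 6) /* 256 IOs */

/* Hypothetical, simplified stand-in for struct ufshpb_region. */
struct region_model {
	int idx;
	unsigned int reads;
};

/* An entering region must have at least EVICTION_THRESHOLD reads. */
static bool may_enter(const struct region_model *rgn)
{
	return rgn->reads >= EVICTION_THRESHOLD;
}

/*
 * Walk the LRU (index 0 is the oldest entry) and return the first region
 * whose read count is not above half of EVICTION_THRESHOLD, or NULL if
 * every candidate is still too hot to evict.
 */
static const struct region_model *pick_victim(const struct region_model *lru,
					      size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (lru[i].reads > (EVICTION_THRESHOLD >> 1))
			continue;
		return &lru[i];
	}
	return NULL;
}
```

With the values from the patch, a region needs 256 reads before it may displace another one, and only regions with 128 reads or fewer are eligible victims. This gap is what makes the question about exposing the thresholds through sysfs relevant for tuning.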
Re: [PATCH v5 03/10] scsi: ufshpb: Add region's reads counter
Hi Avri, On 2021-03-02 21:24, Avri Altman wrote: In host control mode, reads are the major source of activation trials. Keep track of those reads counters, for both active as well inactive regions. We reset the read counter upon write - we are only interested in "clean" reads. less intuitive however, is that we also reset it upon region's deactivation. Region deactivation is often due to the fact that eviction took place: a region become active on the expense of another. This is happening when the max-active-regions limit has crossed. If we don’t reset the counter, we will trigger a lot of trashing of the HPB database, since few reads (or even one) to the region that was deactivated, will trigger a re-activation trial. Keep those counters normalized, as we are using those reads as a comparative score, to make various decisions. If during consecutive normalizations an active region has exhaust its reads - inactivate it. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 102 -- drivers/scsi/ufs/ufshpb.h | 5 ++ 2 files changed, 92 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 044fec9854a0..a8f8d13af21a 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -16,6 +16,8 @@ #include "ufshpb.h" #include "../sd.h" +#define ACTIVATION_THRESHOLD 4 /* 4 IOs */ Can this param be added as a sysfs entry? 
Thanks, Can Guo + /* memory management */ static struct kmem_cache *ufshpb_mctx_cache; static mempool_t *ufshpb_mctx_pool; @@ -554,6 +556,21 @@ static int ufshpb_issue_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd, return ret; } +static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx, + int srgn_idx) +{ + struct ufshpb_region *rgn; + struct ufshpb_subregion *srgn; + + rgn = hpb->rgn_tbl + rgn_idx; + srgn = rgn->srgn_tbl + srgn_idx; + + list_del_init(>list_inact_rgn); + + if (list_empty(>list_act_srgn)) + list_add_tail(>list_act_srgn, >lh_act_srgn); +} + /* * This function will set up HPB read command using host-side L2P map data. */ @@ -600,12 +617,44 @@ int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) ufshpb_set_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset, transfer_len); spin_unlock_irqrestore(>rgn_state_lock, flags); + + if (hpb->is_hcm) { + spin_lock_irqsave(>rgn_lock, flags); + rgn->reads = 0; + spin_unlock_irqrestore(>rgn_lock, flags); + } + return 0; } if (!ufshpb_is_support_chunk(hpb, transfer_len)) return 0; + if (hpb->is_hcm) { + bool activate = false; + /* +* in host control mode, reads are the main source for +* activation trials. 
+*/ + spin_lock_irqsave(>rgn_lock, flags); + rgn->reads++; + if (rgn->reads == ACTIVATION_THRESHOLD) + activate = true; + spin_unlock_irqrestore(>rgn_lock, flags); + if (activate) { + spin_lock_irqsave(>rsp_list_lock, flags); + ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); + hpb->stats.rb_active_cnt++; + spin_unlock_irqrestore(>rsp_list_lock, flags); + dev_dbg(>sdev_ufs_lu->sdev_dev, + "activate region %d-%d\n", rgn_idx, srgn_idx); + } + + /* keep those counters normalized */ + if (rgn->reads > hpb->entries_per_srgn) + schedule_work(>ufshpb_normalization_work); + } + spin_lock_irqsave(>rgn_state_lock, flags); if (ufshpb_test_ppn_dirty(hpb, rgn_idx, srgn_idx, srgn_offset, transfer_len)) { @@ -745,21 +794,6 @@ static int ufshpb_clear_dirty_bitmap(struct ufshpb_lu *hpb, return 0; } -static void ufshpb_update_active_info(struct ufshpb_lu *hpb, int rgn_idx, - int srgn_idx) -{ - struct ufshpb_region *rgn; - struct ufshpb_subregion *srgn; - - rgn = hpb->rgn_tbl + rgn_idx; - srgn = rgn->srgn_tbl + srgn_idx; - - list_del_init(>list_inact_rgn); - - if (list_empty(>list_act_srgn)) - list_add_tail(>list_act_srgn, >lh_act_srgn); -} - static void ufshpb_update_inactive_info(struct ufshpb_lu *hpb, int rgn_idx) { struct ufshpb_region *rgn; @@ -1079,6 +1113,14 @@ static void __ufshpb_evict_region(s
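The per-region read counter described in this patch can be summarised with a short user-space sketch. It only models the two rules visible in the quoted hunks: a write resets the counter, and a read triggers an activation trial exactly when the counter reaches ACTIVATION_THRESHOLD. `struct region_reads`, `on_read()` and `on_write()` are hypothetical names. Locking, the reset on deactivation and the normalization work are intentionally left out, since their details are not fully visible in the quoted text.

```c
#include <stdbool.h>

#define ACTIVATION_THRESHOLD 4 /* 4 IOs */

/* Hypothetical stand-in for the reads field of struct ufshpb_region. */
struct region_reads {
	unsigned int reads;
};

/* A write makes earlier reads uninteresting: only "clean" reads count. */
static void on_write(struct region_reads *r)
{
	r->reads = 0;
}

/*
 * Count one read. Returns true only on the read that brings the counter
 * to ACTIVATION_THRESHOLD, so a region is queued for activation once and
 * not again on every later read.
 */
static bool on_read(struct region_reads *r)
{
	r->reads++;
	return r->reads == ACTIVATION_THRESHOLD;
}
```

Using an equality test rather than `>=` matches the quoted code and keeps a hot region from being added to the activation list repeatedly.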
Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support
Hi Daejun, On 2021-03-09 09:38, Can Guo wrote: Hi Daejun, If you are about to push Ver.27, please hold on. I run into OCP issues on VCCQ every time after apply this patch. The issue can be work around by disabling runtime PM. Before you or we figure out where the BUG is, it is pointless to push next version. Regards, Can Guo. Somehow my setup cannot replicate the issue anymore today, weird... I won't spend much time testing device control mode since now, please go ahead if you have the next version ready. Can Guo. On 2021-03-03 14:29, Daejun Park wrote: This patch supports the HPB 2.0. The HPB 2.0 supports read of varying sizes from 4KB to 512KB. In the case of Read (<= 32KB) is supported as single HPB read. In the case of Read (36KB ~ 512KB) is supported by as a combination of write buffer command and HPB read command to deliver more PPN. The write buffer commands may not be issued immediately due to busy tags. To use HPB read more aggressively, the driver can requeue the write buffer command. The requeue threshold is implemented as timeout and can be modified with requeue_timeout_ms entry in sysfs. Signed-off-by: Daejun Park --- Documentation/ABI/testing/sysfs-driver-ufs | 35 +- drivers/scsi/ufs/ufs-sysfs.c | 2 + drivers/scsi/ufs/ufs.h | 3 +- drivers/scsi/ufs/ufshcd.c | 22 +- drivers/scsi/ufs/ufshcd.h | 7 + drivers/scsi/ufs/ufshpb.c | 622 +++-- drivers/scsi/ufs/ufshpb.h | 67 ++- 7 files changed, 679 insertions(+), 79 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs index bf5cb8846de1..0017eaf89cbe 100644 --- a/Documentation/ABI/testing/sysfs-driver-ufs +++ b/Documentation/ABI/testing/sysfs-driver-ufs @@ -1253,14 +1253,14 @@ Description:This entry shows the number of HPB pinned regions assigned to The file is read only. 
-What: /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of reads that changed to HPB read. The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of reads that cannot be changed to @@ -1268,7 +1268,7 @@ Description: This entry shows the number of reads that cannot be changed to The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of response UPIUs that has @@ -1276,7 +1276,7 @@ Description: This entry shows the number of response UPIUs that has The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of active sub-regions recommended by @@ -1284,7 +1284,7 @@ Description: This entry shows the number of active sub-regions recommended by The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of inactive regions recommended by @@ -1292,10 +1292,33 @@ Description:This entry shows the number of inactive regions recommended by The file is read only. 
-What: /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of read buffer commands for activating sub-regions recommended by response UPIUs. The file is read only. + +What: /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms +Date: February 2021 +Contact: Daejun Park +Description: This entry shows the requeue timeout threshold for write buffer + command in ms. This value can be changed by writing proper integer to + this entry. + +What: /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd +Date: February 2021 +Contact: Daejun Park +Description: This entry shows the maximum HPB data size for using single HPB + comm
Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support
Hi Daejun,

If you are about to push Ver.27, please hold on. I run into OCP issues on
VCCQ every time after applying this patch. The issue can be worked around by
disabling runtime PM. Until you or we figure out where the bug is, it is
pointless to push the next version.

Regards,
Can Guo.

On 2021-03-03 14:29, Daejun Park wrote:
> This patch supports the HPB 2.0.
>
> The HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
> A read of up to 32KB is supported as a single HPB read.
> A read of 36KB to 512KB is supported as a combination of a write buffer
> command and an HPB read command, to deliver more PPNs.
> The write buffer commands may not be issued immediately due to busy tags.
> To use HPB read more aggressively, the driver can requeue the write buffer
> command. The requeue threshold is implemented as a timeout and can be
> modified with the requeue_timeout_ms entry in sysfs.
>
> Signed-off-by: Daejun Park
> ---
>  Documentation/ABI/testing/sysfs-driver-ufs | 35 +-
>  drivers/scsi/ufs/ufs-sysfs.c | 2 +
>  drivers/scsi/ufs/ufs.h | 3 +-
>  drivers/scsi/ufs/ufshcd.c | 22 +-
>  drivers/scsi/ufs/ufshcd.h | 7 +
>  drivers/scsi/ufs/ufshpb.c | 622 +++--
>  drivers/scsi/ufs/ufshpb.h | 67 ++-
>  7 files changed, 679 insertions(+), 79 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
> index bf5cb8846de1..0017eaf89cbe 100644
> --- a/Documentation/ABI/testing/sysfs-driver-ufs
> +++ b/Documentation/ABI/testing/sysfs-driver-ufs
> @@ -1253,14 +1253,14 @@ Description: This entry shows the number of HPB pinned regions assigned to
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of reads that changed to HPB read.
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of reads that cannot be changed to
> @@ -1268,7 +1268,7 @@ Description: This entry shows the number of reads that cannot be changed to
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of response UPIUs that has
> @@ -1276,7 +1276,7 @@ Description: This entry shows the number of response UPIUs that has
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of active sub-regions recommended by
> @@ -1284,7 +1284,7 @@ Description: This entry shows the number of active sub-regions recommended by
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of inactive regions recommended by
> @@ -1292,10 +1292,33 @@ Description: This entry shows the number of inactive regions recommended by
>
>                  The file is read only.
>
> -What:           /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
> +What:           /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
>  Date:           February 2021
>  Contact:        Daejun Park
>  Description:    This entry shows the number of read buffer commands for
>                  activating sub-regions recommended by response UPIUs.
>
>                  The file is read only.
> +
> +What:           /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
> +Date:           February 2021
> +Contact:        Daejun Park
> +Description:    This entry shows the requeue timeout threshold for write buffer
> +                command in ms. This value can be changed by writing proper integer to
> +                this entry.
> +
> +What:           /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
> +Date:           February 2021
> +Contact:        Daejun Park
> +Description:    This entry shows the maximum HPB data size for using single HPB
> +                command.
> +
> +                ===
> +                00h 4KB
> +                01h 8KB
> +                02h 12KB
> +                ...
> +                FFh 1024KB
> +                ===
> +
> +                The file is read only.
> diff --git a/drivers/scsi/ufs/
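The encoding table for max_data_size_hpb_single_cmd quoted above follows a simple pattern: each step of the attribute value adds 4KB, starting from 4KB at 00h. Below is a minimal sketch of that decoding, assuming the four listed rows define the whole mapping. The helper name hpb_sketch_single_cmd_max_kb() is hypothetical and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch: convert the raw attribute value
 * (00h..FFh) into a size in KB, following the documented rows
 * 00h -> 4KB, 01h -> 8KB, 02h -> 12KB, FFh -> 1024KB.
 */
static unsigned int hpb_sketch_single_cmd_max_kb(uint32_t attr_val)
{
	/* The table only covers one byte; larger values are not defined. */
	assert(attr_val <= 0xFF);

	return (attr_val + 1) * 4;
}
```

Under this reading, the 32KB single-read limit mentioned in the commit message would correspond to an attribute value of 07h.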
Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-03-03 22:50, Bean Huo wrote:
> On Wed, 2021-03-03 at 15:29 +0900, Daejun Park wrote:
>> +
>> +static inline void ufshpb_put_pre_req(struct ufshpb_lu *hpb,
>> +                                      struct ufshpb_req *pre_req)
>> +{
>> +        pre_req->req = NULL;
>> +        pre_req->bio = NULL;
>> +        list_add_tail(&pre_req->list_req, &hpb->lh_pre_req_free);
>> +        hpb->num_inflight_pre_req--;
>> +}
>> +
>> +static void ufshpb_pre_req_compl_fn(struct request *req, blk_status_t error)
>> +{
>> +        struct ufshpb_req *pre_req = (struct ufshpb_req *)req->end_io_data;
>> +        struct ufshpb_lu *hpb = pre_req->hpb;
>> +        unsigned long flags;
>> +        struct scsi_sense_hdr sshdr;
>> +
>> +        if (error) {
>> +                dev_err(&hpb->sdev_ufs_lu->sdev_dev, "block status %d", error);
>> +                scsi_normalize_sense(pre_req->sense, SCSI_SENSE_BUFFERSIZE,
>> +                                     &sshdr);
>> +                dev_err(&hpb->sdev_ufs_lu->sdev_dev,
>> +                        "code %x sense_key %x asc %x ascq %x",
>> +                        sshdr.response_code,
>> +                        sshdr.sense_key, sshdr.asc, sshdr.ascq);
>> +                dev_err(&hpb->sdev_ufs_lu->sdev_dev,
>> +                        "byte4 %x byte5 %x byte6 %x additional_len %x",
>> +                        sshdr.byte4, sshdr.byte5,
>> +                        sshdr.byte6, sshdr.additional_length);
>> +        }
>
> How can you print out the sense_key and sense code here? The sense code
> will not be copied to pre_req->sense. You should use scsi_request->sense
> directly, or let pre_req->sense point to scsi_request->sense.
>
> You updated the patch to a new version so quickly. In other words, I am
> wondering whether you tested your patch before submitting it?
>
> Bean

Bean is right about the sense buffer...
Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support
ame) UFS_ATTRIBUTE(boot_lun_enabled, _BOOT_LU_EN); +UFS_ATTRIBUTE(max_data_size_hpb_single_cmd, _MAX_HPB_SINGLE_CMD); UFS_ATTRIBUTE(current_power_mode, _POWER_MODE); UFS_ATTRIBUTE(active_icc_level, _ACTIVE_ICC_LVL); UFS_ATTRIBUTE(ooo_data_enabled, _OOO_DATA_EN); @@ -864,6 +865,7 @@ UFS_ATTRIBUTE(wb_cur_buf, _CURR_WB_BUFF_SIZE); static struct attribute *ufs_sysfs_attributes[] = { _attr_boot_lun_enabled.attr, + _attr_max_data_size_hpb_single_cmd.attr, _attr_current_power_mode.attr, _attr_active_icc_level.attr, _attr_ooo_data_enabled.attr, diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 957763db1006..e0b748777a1b 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -123,12 +123,13 @@ enum flag_idn { QUERY_FLAG_IDN_WB_BUFF_FLUSH_EN = 0x0F, QUERY_FLAG_IDN_WB_BUFF_FLUSH_DURING_HIBERN8 = 0x10, QUERY_FLAG_IDN_HPB_RESET= 0x11, + QUERY_FLAG_IDN_HPB_EN = 0x12, Also add this flag to sysfs? Thanks, Can Guo. }; /* Attribute idn for Query requests */ enum attr_idn { QUERY_ATTR_IDN_BOOT_LU_EN = 0x00, - QUERY_ATTR_IDN_RESERVED = 0x01, + QUERY_ATTR_IDN_MAX_HPB_SINGLE_CMD = 0x01, QUERY_ATTR_IDN_POWER_MODE = 0x02, QUERY_ATTR_IDN_ACTIVE_ICC_LVL = 0x03, QUERY_ATTR_IDN_OOO_DATA_EN = 0x04, diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 851c01a26207..f7e491ad4fa8 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2656,7 +2656,12 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) lrbp->req_abort_skip = false; - ufshpb_prep(hba, lrbp); + err = ufshpb_prep(hba, lrbp); + if (err == -EAGAIN) { + lrbp->cmd = NULL; + ufshcd_release(hba); + goto out; + } ufshcd_comp_scsi_upiu(hba, lrbp); @@ -3110,7 +3115,7 @@ int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode, * * Returns 0 for success, non-zero in case of failure */ -static int ufshcd_query_attr_retry(struct ufs_hba *hba, +int ufshcd_query_attr_retry(struct ufs_hba *hba, enum query_opcode opcode, enum attr_idn idn, u8 
index, u8 selector, u32 *attr_val) { @@ -7447,8 +7452,18 @@ static int ufs_get_device_desc(struct ufs_hba *hba) if (dev_info->wspecversion >= UFS_DEV_HPB_SUPPORT_VERSION && (b_ufs_feature_sup & UFS_DEV_HPB_SUPPORT)) { - dev_info->hpb_enabled = true; + bool hpb_en = false; + ufshpb_get_dev_info(hba, desc_buf); + + if (!ufshpb_is_legacy(hba)) + err = ufshcd_query_flag_retry(hba, + UPIU_QUERY_OPCODE_READ_FLAG, + QUERY_FLAG_IDN_HPB_EN, 0, + _en); + + if (ufshpb_is_legacy(hba) || (!err && hpb_en)) + dev_info->hpb_enabled = true; } err = ufshcd_read_string_desc(hba, model_index, @@ -8019,6 +8034,7 @@ static const struct attribute_group *ufshcd_driver_groups[] = { _sysfs_lun_attributes_group, #ifdef CONFIG_SCSI_UFS_HPB _sysfs_hpb_stat_group, + _sysfs_hpb_param_group, #endif NULL, }; diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 961fc5b77943..7d85517464ef 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -654,6 +654,8 @@ struct ufs_hba_variant_params { * @srgn_size: device reported HPB sub-region size * @slave_conf_cnt: counter to check all lu finished initialization * @hpb_disabled: flag to check if HPB is disabled + * @max_hpb_single_cmd: maximum size of single HPB command + * @is_legacy: flag to check HPB 1.0 */ struct ufshpb_dev_info { int num_lu; @@ -661,6 +663,8 @@ struct ufshpb_dev_info { int srgn_size; atomic_t slave_conf_cnt; bool hpb_disabled; + int max_hpb_single_cmd; + bool is_legacy; }; #endif @@ -1091,6 +1095,9 @@ int ufshcd_read_desc_param(struct ufs_hba *hba, u8 param_offset, u8 *param_read_buf, u8 param_size); +int ufshcd_query_attr_retry(struct ufs_hba *hba, enum query_opcode opcode, + enum attr_idn idn, u8 index, u8 selector, + u32 *attr_val); int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode, enum attr_idn idn, u8 index, u8 selector, u32 *attr_val); int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode, diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c 
index c75a6816a03f..ae092a4420
Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support
> -        if (!ufshpb_is_support_chunk(transfer_len))
> -                return;
> +        if (!ufshpb_is_support_chunk(hpb, transfer_len) &&
> +            (ufshpb_is_legacy(hba) && (transfer_len != HPB_LEGACY_CHUNK_HIGH)))
> +                return 0;

This looks awkward. Can we put the checks in ufshpb_is_support_chunk()?

Thanks,
Can Guo.
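One way to read the suggestion above is to let a single helper decide whether a transfer length is eligible, so that the caller needs only one test. The sketch below is an illustration under stated assumptions, not the code that was eventually posted: struct hpb_sketch, its fields and hpb_sketch_is_supported_chunk() are hypothetical stand-ins, and the value 1 for HPB_LEGACY_CHUNK_HIGH is assumed (an HPB 1.0 read covering a single 4KB entry).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value: HPB 1.0 (legacy) reads exactly one 4KB entry. */
#define HPB_LEGACY_CHUNK_HIGH 1

/* Hypothetical stand-in for the per-LU state used by the driver. */
struct hpb_sketch {
	bool is_legacy;          /* true for an HPB 1.0 device */
	int pre_req_max_tr_len;  /* upper bound in 4KB units for HPB 2.0 */
};

/*
 * Single place for the length check: legacy devices accept only the
 * legacy chunk size, everything else is bounded by the configured maximum.
 */
static bool hpb_sketch_is_supported_chunk(const struct hpb_sketch *hpb,
					  int transfer_len)
{
	assert(hpb != NULL);

	if (hpb->is_legacy)
		return transfer_len == HPB_LEGACY_CHUNK_HIGH;

	return transfer_len <= hpb->pre_req_max_tr_len;
}
```

With a helper of this shape the call site would shrink to a single negated call, and the legacy restriction could no longer be bypassed by a length that happens to pass the generic bound.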
Re: [PATCH] scsi: ufs: Fix incorrect ufshcd_state after ufshcd_reset_and_restore()
On 2021-03-02 03:19, Adrian Hunter wrote: If ufshcd_probe_hba() fails it sets ufshcd_state to UFSHCD_STATE_ERROR, however, if it is called again, as it is within a loop in ufshcd_reset_and_restore(), and succeeds, then it will not set the state back to UFSHCD_STATE_OPERATIONAL unless the state was UFSHCD_STATE_RESET. That can result in the state being UFSHCD_STATE_ERROR even though ufshcd_reset_and_restore() is successful and returns zero. Fix by initializing the state to UFSHCD_STATE_RESET in the start of each loop in ufshcd_reset_and_restore(). If there is an error, ufshcd_reset_and_restore() will change the state to UFSHCD_STATE_ERROR, otherwise ufshcd_probe_hba() will have set the state appropriately. Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths") Signed-off-by: Adrian Hunter --- drivers/scsi/ufs/ufshcd.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 77161750c9fb..91a403afe038 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -7031,6 +7031,8 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba) spin_unlock_irqrestore(hba->host->host_lock, flags); do { + hba->ufshcd_state = UFSHCD_STATE_RESET; + /* Reset the attached device */ ufshcd_device_reset(hba); Hi Adrian, I've proposed a fix to get it addressed - https://lore.kernel.org/patchwork/patch/1383817/ Thanks, Can Guo.
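The state problem described in the commit message quoted above can be reproduced with a small user-space model. This is only a sketch: the sketch_* names and the three-value enum are hypothetical, and sketch_probe() merely imitates the behaviour that the description attributes to ufshcd_probe_hba() (set the error state on failure; on success, promote only the reset state to operational).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state model of hba->ufshcd_state. */
enum sketch_state {
	SKETCH_STATE_RESET,
	SKETCH_STATE_OPERATIONAL,
	SKETCH_STATE_ERROR,
};

struct sketch_hba {
	enum sketch_state state;
	int probe_failures_left; /* how many probe attempts will still fail */
};

/* Imitates the probe behaviour described in the commit message. */
static int sketch_probe(struct sketch_hba *hba)
{
	if (hba->probe_failures_left > 0) {
		hba->probe_failures_left--;
		hba->state = SKETCH_STATE_ERROR;
		return -1;
	}
	/* Success only promotes RESET; a stale ERROR is left untouched. */
	if (hba->state == SKETCH_STATE_RESET)
		hba->state = SKETCH_STATE_OPERATIONAL;
	return 0;
}

/*
 * Retry loop modelled on the description of ufshcd_reset_and_restore().
 * reset_state_each_pass selects the behaviour with the fix (true) or
 * without it (false).
 */
static int sketch_reset_and_restore(struct sketch_hba *hba, int retries,
				    bool reset_state_each_pass)
{
	int err;

	assert(retries > 0);

	do {
		if (reset_state_each_pass)
			hba->state = SKETCH_STATE_RESET;
		err = sketch_probe(hba);
	} while (err && --retries);

	if (err)
		hba->state = SKETCH_STATE_ERROR;
	return err;
}
```

In this model, one failed attempt followed by a successful one returns zero in both variants, but only the variant that resets the state on every pass ends in the operational state.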
Re: [PATCH] scsi: ufs: Fix incorrect ufshcd_state after ufshcd_reset_and_restore()
On 2021-03-02 16:14, Adrian Hunter wrote:
> On 2/03/21 9:01 am, Avri Altman wrote:
>>> If ufshcd_probe_hba() fails it sets ufshcd_state to UFSHCD_STATE_ERROR,
>>> however, if it is called again, as it is within a loop in
>>> ufshcd_reset_and_restore(), and succeeds, then it will not set the state
>>> back to UFSHCD_STATE_OPERATIONAL unless the state was
>>> UFSHCD_STATE_RESET.
>>>
>>> That can result in the state being UFSHCD_STATE_ERROR even though
>>> ufshcd_reset_and_restore() is successful and returns zero.
>>>
>>> Fix by initializing the state to UFSHCD_STATE_RESET in the start of each
>>> loop in ufshcd_reset_and_restore(). If there is an error,
>>> ufshcd_reset_and_restore() will change the state to UFSHCD_STATE_ERROR,
>>> otherwise ufshcd_probe_hba() will have set the state appropriately.
>>>
>>> Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
>>> Signed-off-by: Adrian Hunter
>>
>> I think that Can Guo's recent series addressed that issue as well; can
>> you take a look?
>> https://lore.kernel.org/lkml/1614145010-36079-2-git-send-email-c...@codeaurora.org/
>
> Yes, there it is mixed in with other changes. However it is probably
> better as a separate patch. Can Guo, what do you think?

Oh, I missed this one... Sure, I will split it out as a separate change in
the next version.

Thanks,
Can Guo.

>>> ---
>>>  drivers/scsi/ufs/ufshcd.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>> index 77161750c9fb..91a403afe038 100644
>>> --- a/drivers/scsi/ufs/ufshcd.c
>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>> @@ -7031,6 +7031,8 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba)
>>>          spin_unlock_irqrestore(hba->host->host_lock, flags);
>>>
>>>          do {
>>> +                hba->ufshcd_state = UFSHCD_STATE_RESET;
>>> +
>>>                  /* Reset the attached device */
>>>                  ufshcd_device_reset(hba);
>>>
>>> --
>>> 2.17.1
Re: [PATCH v2 1/3] scsi: ufs: Minor adjustments to error handling
On 2021-03-03 18:03, Can Guo wrote: Hi Avri, On 2021-03-03 15:22, Avri Altman wrote: In error handling prepare stage, after SCSI requests are blocked, do a down/up_write(clk_scaling_lock) to clean up the queuecommand() path. Meanwhile, stop eeh_work in case it disturbs error recovery. Moreover, reset ufshcd_state at the entrance of ufshcd_probe_hba(), since it may be called multiple times during error recovery. Signed-off-by: Can Guo I noticed that you tagged Adrian's patch - https://lore.kernel.org/lkml/20210301191940.15247-1-adrian.hun...@intel.com/ So this patch needs to be adjusted accordingly? Thanks for pointing me to that one, I will rebase mine. Regards, Can Guo. Just noticed that Adrian's change comes later than mine, so I may not need to adjust mine. Thanks, Can Guo. Thanks, Avri --- drivers/scsi/ufs/ufshcd.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 80620c8..013eb73 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4987,6 +4987,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) * UFS device needs urgent BKOPs. 
*/ if (!hba->pm_op_in_progress && + !ufshcd_eh_in_progress(hba) && ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) && schedule_work(>eeh_work)) { /* @@ -5784,13 +5785,20 @@ static void ufshcd_err_handling_prepare(struct ufs_hba *hba) ufshcd_suspend_clkscaling(hba); ufshcd_clk_scaling_allow(hba, false); } + ufshcd_scsi_block_requests(hba); + /* Drain ufshcd_queuecommand() */ + down_write(>clk_scaling_lock); + up_write(>clk_scaling_lock); + cancel_work_sync(>eeh_work); } static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) { + ufshcd_scsi_unblock_requests(hba); ufshcd_release(hba); if (ufshcd_is_clkscaling_supported(hba)) ufshcd_clk_scaling_suspend(hba, false); + ufshcd_clear_ua_wluns(hba); pm_runtime_put(hba->dev); } @@ -5882,8 +5890,8 @@ static void ufshcd_err_handler(struct work_struct *work) spin_unlock_irqrestore(hba->host->host_lock, flags); ufshcd_err_handling_prepare(hba); spin_lock_irqsave(hba->host->host_lock, flags); - ufshcd_scsi_block_requests(hba); - hba->ufshcd_state = UFSHCD_STATE_RESET; + if (hba->ufshcd_state != UFSHCD_STATE_ERROR) + hba->ufshcd_state = UFSHCD_STATE_RESET; /* Complete requests that have door-bell cleared by h/w */ ufshcd_complete_requests(hba); @@ -6042,12 +6050,8 @@ static void ufshcd_err_handler(struct work_struct *work) } ufshcd_clear_eh_in_progress(hba); spin_unlock_irqrestore(hba->host->host_lock, flags); - ufshcd_scsi_unblock_requests(hba); ufshcd_err_handling_unprepare(hba); up(>host_sem); - - if (!err && needs_reset) - ufshcd_clear_ua_wluns(hba); } /** @@ -7858,6 +7862,8 @@ static int ufshcd_probe_hba(struct ufs_hba *hba, bool async) unsigned long flags; ktime_t start = ktime_get(); + hba->ufshcd_state = UFSHCD_STATE_RESET; + ret = ufshcd_link_startup(hba); if (ret) goto out; -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v2 1/3] scsi: ufs: Minor adjustments to error handling
Hi Avri, On 2021-03-03 15:22, Avri Altman wrote: In error handling prepare stage, after SCSI requests are blocked, do a down/up_write(clk_scaling_lock) to clean up the queuecommand() path. Meanwhile, stop eeh_work in case it disturbs error recovery. Moreover, reset ufshcd_state at the entrance of ufshcd_probe_hba(), since it may be called multiple times during error recovery. Signed-off-by: Can Guo I noticed that you tagged Adrian's patch - https://lore.kernel.org/lkml/20210301191940.15247-1-adrian.hun...@intel.com/ So this patch needs to be adjusted accordingly? Thanks for pointing me to that one, I will rebase mine. Regards, Can Guo. Thanks, Avri --- drivers/scsi/ufs/ufshcd.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 80620c8..013eb73 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4987,6 +4987,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) * UFS device needs urgent BKOPs. 
*/ if (!hba->pm_op_in_progress && + !ufshcd_eh_in_progress(hba) && ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) && schedule_work(>eeh_work)) { /* @@ -5784,13 +5785,20 @@ static void ufshcd_err_handling_prepare(struct ufs_hba *hba) ufshcd_suspend_clkscaling(hba); ufshcd_clk_scaling_allow(hba, false); } + ufshcd_scsi_block_requests(hba); + /* Drain ufshcd_queuecommand() */ + down_write(>clk_scaling_lock); + up_write(>clk_scaling_lock); + cancel_work_sync(>eeh_work); } static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) { + ufshcd_scsi_unblock_requests(hba); ufshcd_release(hba); if (ufshcd_is_clkscaling_supported(hba)) ufshcd_clk_scaling_suspend(hba, false); + ufshcd_clear_ua_wluns(hba); pm_runtime_put(hba->dev); } @@ -5882,8 +5890,8 @@ static void ufshcd_err_handler(struct work_struct *work) spin_unlock_irqrestore(hba->host->host_lock, flags); ufshcd_err_handling_prepare(hba); spin_lock_irqsave(hba->host->host_lock, flags); - ufshcd_scsi_block_requests(hba); - hba->ufshcd_state = UFSHCD_STATE_RESET; + if (hba->ufshcd_state != UFSHCD_STATE_ERROR) + hba->ufshcd_state = UFSHCD_STATE_RESET; /* Complete requests that have door-bell cleared by h/w */ ufshcd_complete_requests(hba); @@ -6042,12 +6050,8 @@ static void ufshcd_err_handler(struct work_struct *work) } ufshcd_clear_eh_in_progress(hba); spin_unlock_irqrestore(hba->host->host_lock, flags); - ufshcd_scsi_unblock_requests(hba); ufshcd_err_handling_unprepare(hba); up(>host_sem); - - if (!err && needs_reset) - ufshcd_clear_ua_wluns(hba); } /** @@ -7858,6 +7862,8 @@ static int ufshcd_probe_hba(struct ufs_hba *hba, bool async) unsigned long flags; ktime_t start = ktime_get(); + hba->ufshcd_state = UFSHCD_STATE_RESET; + ret = ufshcd_link_startup(hba); if (ret) goto out; -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v10 2/2] ufs: sysfs: Resume the proper scsi device
On 2021-03-03 06:52, Asutosh Das wrote: Resumes the actual scsi device the unit descriptor of which is being accessed instead of the hba alone. Signed-off-by: Asutosh Das Reviewed-by: Can Guo --- drivers/scsi/ufs/ufs-sysfs.c | 30 +- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index acc54f5..3fc182b 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -245,9 +245,9 @@ static ssize_t wb_on_store(struct device *dev, struct device_attribute *attr, goto out; } - pm_runtime_get_sync(hba->dev); + scsi_autopm_get_device(hba->sdev_ufs_device); res = ufshcd_wb_ctrl(hba, wb_enable); - pm_runtime_put_sync(hba->dev); + scsi_autopm_put_device(hba->sdev_ufs_device); out: up(>host_sem); return res < 0 ? res : count; @@ -297,10 +297,10 @@ static ssize_t ufs_sysfs_read_desc_param(struct ufs_hba *hba, goto out; } - pm_runtime_get_sync(hba->dev); + scsi_autopm_get_device(hba->sdev_ufs_device); ret = ufshcd_read_desc_param(hba, desc_id, desc_index, param_offset, desc_buf, param_size); - pm_runtime_put_sync(hba->dev); + scsi_autopm_put_device(hba->sdev_ufs_device); if (ret) { ret = -EINVAL; goto out; @@ -678,7 +678,7 @@ static ssize_t _name##_show(struct device *dev,\ up(>host_sem); \ return -ENOMEM; \ } \ - pm_runtime_get_sync(hba->dev); \ + scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_descriptor_retry(hba,\ UPIU_QUERY_OPCODE_READ_DESC, QUERY_DESC_IDN_DEVICE, \ 0, 0, desc_buf, _len); \ @@ -695,7 +695,7 @@ static ssize_t _name##_show(struct device *dev,\ goto out; \ ret = sysfs_emit(buf, "%s\n", desc_buf); \ out: \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ kfree(desc_buf);\ up(>host_sem); \ return ret; \ @@ -744,10 +744,10 @@ static ssize_t _name##_show(struct device *dev,\ } \ if (ufshcd_is_wb_flags(QUERY_FLAG_IDN##_uname)) \ index = ufshcd_wb_get_query_index(hba); \ - pm_runtime_get_sync(hba->dev); \ + 
scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG, \ QUERY_FLAG_IDN##_uname, index, ); \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ if (ret) { \ ret = -EINVAL; \ goto out; \ @@ -813,10 +813,10 @@ static ssize_t _name##_show(struct device *dev,\ } \ if (ufshcd_is_wb_attrs(QUERY_ATTR_IDN##_uname)) \ index = ufshcd_wb_get_query_index(hba); \ - pm_runtime_get_sync(hba->dev); \ + scsi_autopm_get_device(hba->sdev_ufs_device);\ ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR, \ QUERY_ATTR_IDN##_uname, index, 0, ); \ - pm_runtime_put_sync(hba->dev); \ + scsi_autopm_put_device(hba->sdev_ufs_device);\ if (ret) { \ ret = -EINVAL; \ goto out; \ @@ -899,11 +899,15 @@ static ssize_t _pname##_show(struct device *dev, \ struct scsi_device *sdev = to_scsi_device(dev); \ str
Re: [PATCH v25 4/4] scsi: ufs: Add HPB 2.0 support
On 2021-02-26 15:35, Daejun Park wrote: This patch supports the HPB 2.0. The HPB 2.0 supports read of varying sizes from 4KB to 512KB. In the case of Read (<= 32KB) is supported as single HPB read. In the case of Read (36KB ~ 512KB) is supported by as a combination of write buffer command and HPB read command to deliver more PPN. The write buffer commands may not be issued immediately due to busy tags. To use HPB read more aggressively, the driver can requeue the write buffer command. The requeue threshold is implemented as timeout and can be modified with requeue_timeout_ms entry in sysfs. Signed-off-by: Daejun Park --- Documentation/ABI/testing/sysfs-driver-ufs | 35 +- drivers/scsi/ufs/ufs-sysfs.c | 2 + drivers/scsi/ufs/ufs.h | 3 +- drivers/scsi/ufs/ufshcd.c | 22 +- drivers/scsi/ufs/ufshcd.h | 7 + drivers/scsi/ufs/ufshpb.c | 624 +++-- drivers/scsi/ufs/ufshpb.h | 67 ++- 7 files changed, 681 insertions(+), 79 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs index bf5cb8846de1..0017eaf89cbe 100644 --- a/Documentation/ABI/testing/sysfs-driver-ufs +++ b/Documentation/ABI/testing/sysfs-driver-ufs @@ -1253,14 +1253,14 @@ Description:This entry shows the number of HPB pinned regions assigned to The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of reads that changed to HPB read. The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of reads that cannot be changed to @@ -1268,7 +1268,7 @@ Description: This entry shows the number of reads that cannot be changed to The file is read only. 
-What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of response UPIUs that has @@ -1276,7 +1276,7 @@ Description: This entry shows the number of response UPIUs that has The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of active sub-regions recommended by @@ -1284,7 +1284,7 @@ Description: This entry shows the number of active sub-regions recommended by The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of inactive regions recommended by @@ -1292,10 +1292,33 @@ Description:This entry shows the number of inactive regions recommended by The file is read only. -What: /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt +What: /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt Date: February 2021 Contact: Daejun Park Description: This entry shows the number of read buffer commands for activating sub-regions recommended by response UPIUs. The file is read only. + +What: /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms +Date: February 2021 +Contact: Daejun Park +Description: This entry shows the requeue timeout threshold for write buffer + command in ms. This value can be changed by writing proper integer to + this entry. + +What: /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd +Date: February 2021 +Contact: Daejun Park +Description: This entry shows the maximum HPB data size for using single HPB + command. + + === + 00h 4KB + 01h 8KB + 02h 12KB + ... + FFh 1024KB + === + + The file is read only. 
diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c index 2546e7a1ac4f..00fb519406cf 100644 --- a/drivers/scsi/ufs/ufs-sysfs.c +++ b/drivers/scsi/ufs/ufs-sysfs.c @@ -841,6 +841,7 @@ out: \ static DEVICE_ATTR_RO(_name)
Re: [PATCH v2 2/3] scsi: ufs-qcom: Disable interrupt in reset path
On 2021-02-28 22:23, Avri Altman wrote:
>> From: Nitin Rawat
>>
>> Disable interrupt in reset path to flush pending IRQ handler in order to
>> avoid possible NoC issues.
>>
>> Signed-off-by: Nitin Rawat
>> Signed-off-by: Can Guo
>> ---
>>  drivers/scsi/ufs/ufs-qcom.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
>> index f97d7b0..a9dc8d7 100644
>> --- a/drivers/scsi/ufs/ufs-qcom.c
>> +++ b/drivers/scsi/ufs/ufs-qcom.c
>> @@ -253,12 +253,17 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba)
>>  {
>>          int ret = 0;
>>          struct ufs_qcom_host *host = ufshcd_get_variant(hba);
>> +        bool reenable_intr = false;
>>
>>          if (!host->core_reset) {
>>                  dev_warn(hba->dev, "%s: reset control not set\n", __func__);
>>                  goto out;
>>          }
>>
>> +        reenable_intr = hba->is_irq_enabled;
>> +        disable_irq(hba->irq);
>> +        hba->is_irq_enabled = false;
>> +
>>          ret = reset_control_assert(host->core_reset);
>>          if (ret) {
>>                  dev_err(hba->dev, "%s: core_reset assert failed, err = %d\n",
>> @@ -280,6 +285,11 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba)
>>
>>          usleep_range(1000, 1100);
>>
>> +        if (reenable_intr) {
>> +                enable_irq(hba->irq);
>> +                hba->is_irq_enabled = true;
>> +        }
>> +
>
> If you enable UFSHCI_QUIRK_BROKEN_HCE on your platform in the future
> (it is currently used only for Exynos), will this code still work?

Yes, it still works.

Thanks,
Can Guo.
Re: [PATCH v25 4/4] scsi: ufs: Add HPB 2.0 support
drivers/scsi/ufs/ufshpb.c @@ -31,6 +31,11 @@ bool ufshpb_is_allowed(struct ufs_hba *hba) return !(hba->ufshpb_dev.hpb_disabled); } +bool ufshpb_is_legacy(struct ufs_hba *hba) +{ + return hba->ufshpb_dev.is_legacy; +} + static struct ufshpb_lu *ufshpb_get_hpb_data(struct scsi_device *sdev) { return sdev->hostdata; @@ -64,9 +69,19 @@ static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd) op_is_discard(req_op(cmd->request)); } -static bool ufshpb_is_support_chunk(int transfer_len) +static bool ufshpb_is_support_chunk(struct ufshpb_lu *hpb, int transfer_len) { - return transfer_len <= HPB_MULTI_CHUNK_HIGH; + return transfer_len <= hpb->pre_req_max_tr_len; In the case of HPB1.0, this is wrong - you are allowing transfer_len > 1 for HPB1.0 devices. Can Guo. +} + +/* + * In this driver, WRITE_BUFFER CMD support 36KB (len=9) ~ 512KB (len=128) as + * default. It is possible to change range of transfer_len through sysfs. + */ +static inline bool ufshpb_is_required_wb(struct ufshpb_lu *hpb, int len) +{ + return (len >= hpb->pre_req_min_tr_len && + len <= hpb->pre_req_max_tr_len); } static bool ufshpb_is_general_lun(int lun) @@ -74,8 +89,7 @@ static bool ufshpb_is_general_lun(int lun) return lun < UFS_UPIU_MAX_UNIT_NUM_ID; } -static bool -ufshpb_is_pinned_region(struct ufshpb_lu *hpb, int rgn_idx) +static bool ufshpb_is_pinned_region(struct ufshpb_lu *hpb, int rgn_idx) { if (hpb->lu_pinned_end != PINNED_NOT_SET && rgn_idx >= hpb->lu_pinned_start && @@ -264,7 +278,8 @@ ufshpb_get_pos_from_lpn(struct ufshpb_lu *hpb, unsigned long lpn, int *rgn_idx, static void ufshpb_set_hpb_read_to_upiu(struct ufshpb_lu *hpb, struct ufshcd_lrb *lrbp, - u32 lpn, u64 ppn, unsigned int transfer_len) + u32 lpn, u64 ppn, unsigned int transfer_len, + int read_id) { unsigned char *cdb = lrbp->cmd->cmnd; @@ -273,15 +288,269 @@ ufshpb_set_hpb_read_to_upiu(struct ufshpb_lu *hpb, struct ufshcd_lrb *lrbp, /* ppn value is stored as big-endian in the host memory */ memcpy([6], , sizeof(u64)); 
cdb[14] = transfer_len; + cdb[15] = read_id; lrbp->cmd->cmd_len = UFS_CDB_SIZE; } +static inline void ufshpb_set_write_buf_cmd(unsigned char *cdb, + unsigned long lpn, unsigned int len, + int read_id) +{ + cdb[0] = UFSHPB_WRITE_BUFFER; + cdb[1] = UFSHPB_WRITE_BUFFER_PREFETCH_ID; + + put_unaligned_be32(lpn, [2]); + cdb[6] = read_id; + put_unaligned_be16(len * HPB_ENTRY_SIZE, [7]); + + cdb[9] = 0x00; /* Control = 0x00 */ +} + +static struct ufshpb_req *ufshpb_get_pre_req(struct ufshpb_lu *hpb) +{ + struct ufshpb_req *pre_req; + + if (hpb->num_inflight_pre_req >= hpb->throttle_pre_req) { + dev_info(>sdev_ufs_lu->sdev_dev, +"pre_req throttle. inflight %d throttle %d", +hpb->num_inflight_pre_req, hpb->throttle_pre_req); + return NULL; + } + + pre_req = list_first_entry_or_null(>lh_pre_req_free, + struct ufshpb_req, list_req); + if (!pre_req) { + dev_info(>sdev_ufs_lu->sdev_dev, "There is no pre_req"); + return NULL; + } + + list_del_init(_req->list_req); + hpb->num_inflight_pre_req++; + + return pre_req; +} + +static inline void ufshpb_put_pre_req(struct ufshpb_lu *hpb, + struct ufshpb_req *pre_req) +{ + pre_req->req = NULL; + pre_req->bio = NULL; + list_add_tail(_req->list_req, >lh_pre_req_free); + hpb->num_inflight_pre_req--; +} + +static void ufshpb_pre_req_compl_fn(struct request *req, blk_status_t error) +{ + struct ufshpb_req *pre_req = (struct ufshpb_req *)req->end_io_data; + struct ufshpb_lu *hpb = pre_req->hpb; + unsigned long flags; + struct scsi_sense_hdr sshdr; + + if (error) { + dev_err(>sdev_ufs_lu->sdev_dev, "block status %d", error); + scsi_normalize_sense(pre_req->sense, SCSI_SENSE_BUFFERSIZE, +); + dev_err(>sdev_ufs_lu->sdev_dev, + "code %x sense_key %x asc %x ascq %x", + sshdr.response_code, + sshdr.sense_key, sshdr.asc, sshdr.ascq); + dev_err(>sdev_ufs_lu->sdev_dev, + "byte4 %x byte5 %x byte6 %x additional_len %x", + sshdr.byte4, sshdr.byte5, + sshdr.byte6
[PATCH v2 2/3] scsi: ufs-qcom: Disable interrupt in reset path
From: Nitin Rawat Disable interrupt in reset path to flush pending IRQ handler in order to avoid possible NoC issues. Signed-off-by: Nitin Rawat Signed-off-by: Can Guo --- drivers/scsi/ufs/ufs-qcom.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c index f97d7b0..a9dc8d7 100644 --- a/drivers/scsi/ufs/ufs-qcom.c +++ b/drivers/scsi/ufs/ufs-qcom.c @@ -253,12 +253,17 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba) { int ret = 0; struct ufs_qcom_host *host = ufshcd_get_variant(hba); + bool reenable_intr = false; if (!host->core_reset) { dev_warn(hba->dev, "%s: reset control not set\n", __func__); goto out; } + reenable_intr = hba->is_irq_enabled; + disable_irq(hba->irq); + hba->is_irq_enabled = false; + ret = reset_control_assert(host->core_reset); if (ret) { dev_err(hba->dev, "%s: core_reset assert failed, err = %d\n", @@ -280,6 +285,11 @@ static int ufs_qcom_host_reset(struct ufs_hba *hba) usleep_range(1000, 1100); + if (reenable_intr) { + enable_irq(hba->irq); + hba->is_irq_enabled = true; + } + out: return ret; } -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
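The patch above follows a save, disable, restore pattern: it records whether the interrupt was enabled on entry, masks it for the duration of the core reset, and unmasks it afterwards only if it had been enabled. The user-space model below is a sketch of that ordering only. All sketch_* names are hypothetical, a plain boolean stands in for the real interrupt line, and the model cannot show that the kernel's disable_irq() also waits for a handler that is already running, which is the flushing the commit message refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host state touched by the patch. */
struct sketch_host {
	bool irq_enabled;       /* plays the role of hba->is_irq_enabled */
	bool reset_ran_irq_off; /* set by the fake reset, for inspection */
};

/* Fake core reset: records whether the interrupt was masked while it ran. */
static void sketch_core_reset(struct sketch_host *host)
{
	host->reset_ran_irq_off = !host->irq_enabled;
}

static void sketch_host_reset(struct sketch_host *host)
{
	bool reenable_intr;

	assert(host != NULL);

	/* Remember the state on entry, then mask the interrupt. */
	reenable_intr = host->irq_enabled;
	host->irq_enabled = false; /* the driver calls disable_irq() here */

	sketch_core_reset(host);

	/* Unmask only if the interrupt was enabled on entry. */
	if (reenable_intr)
		host->irq_enabled = true; /* the driver calls enable_irq() here */
}
```

Keeping the entry state in a local variable means a caller that arrives with the interrupt already disabled leaves with it still disabled, so calls to disable and enable stay balanced.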
[PATCH v2 1/3] scsi: ufs: Minor adjustments to error handling
In error handling prepare stage, after SCSI requests are blocked, do a down/up_write(clk_scaling_lock) to clean up the queuecommand() path. Meanwhile, stop eeh_work in case it disturbs error recovery. Moreover, reset ufshcd_state at the entrance of ufshcd_probe_hba(), since it may be called multiple times during error recovery. Signed-off-by: Can Guo --- drivers/scsi/ufs/ufshcd.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 80620c8..013eb73 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -4987,6 +4987,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) * UFS device needs urgent BKOPs. */ if (!hba->pm_op_in_progress && + !ufshcd_eh_in_progress(hba) && ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) && schedule_work(>eeh_work)) { /* @@ -5784,13 +5785,20 @@ static void ufshcd_err_handling_prepare(struct ufs_hba *hba) ufshcd_suspend_clkscaling(hba); ufshcd_clk_scaling_allow(hba, false); } + ufshcd_scsi_block_requests(hba); + /* Drain ufshcd_queuecommand() */ + down_write(>clk_scaling_lock); + up_write(>clk_scaling_lock); + cancel_work_sync(>eeh_work); } static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) { + ufshcd_scsi_unblock_requests(hba); ufshcd_release(hba); if (ufshcd_is_clkscaling_supported(hba)) ufshcd_clk_scaling_suspend(hba, false); + ufshcd_clear_ua_wluns(hba); pm_runtime_put(hba->dev); } @@ -5882,8 +5890,8 @@ static void ufshcd_err_handler(struct work_struct *work) spin_unlock_irqrestore(hba->host->host_lock, flags); ufshcd_err_handling_prepare(hba); spin_lock_irqsave(hba->host->host_lock, flags); - ufshcd_scsi_block_requests(hba); - hba->ufshcd_state = UFSHCD_STATE_RESET; + if (hba->ufshcd_state != UFSHCD_STATE_ERROR) + hba->ufshcd_state = UFSHCD_STATE_RESET; /* Complete requests that have door-bell cleared by h/w */ ufshcd_complete_requests(hba); @@ -6042,12 +6050,8 @@ static void ufshcd_err_handler(struct 
work_struct *work) } ufshcd_clear_eh_in_progress(hba); spin_unlock_irqrestore(hba->host->host_lock, flags); - ufshcd_scsi_unblock_requests(hba); ufshcd_err_handling_unprepare(hba); up(>host_sem); - - if (!err && needs_reset) - ufshcd_clear_ua_wluns(hba); } /** @@ -7858,6 +7862,8 @@ static int ufshcd_probe_hba(struct ufs_hba *hba, bool async) unsigned long flags; ktime_t start = ktime_get(); + hba->ufshcd_state = UFSHCD_STATE_RESET; + ret = ufshcd_link_startup(hba); if (ret) goto out; -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v2 3/3] scsi: ufs: Remove redundant checks of !hba in suspend/resume callbacks
Runtime and system suspend/resume can only come after hba probe invokes platform_set_drvdata(pdev, hba), meaning hba cannot be NULL in these PM callbacks, so remove the checks of !hba.

Signed-off-by: Can Guo
---
 drivers/scsi/ufs/ufshcd.c | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 013eb73..2517ef1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -95,8 +95,6 @@
 		       16, 4, buf, __len, false);\
 } while (0)

-static bool early_suspend;
-
 int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len,
 		     const char *prefix)
 {
@@ -8978,11 +8976,6 @@ int ufshcd_system_suspend(struct ufs_hba *hba)
 	int ret = 0;
 	ktime_t start = ktime_get();

-	if (!hba) {
-		early_suspend = true;
-		return 0;
-	}
-
 	down(&hba->host_sem);

 	if (!hba->is_powered)
@@ -9034,14 +9027,6 @@ int ufshcd_system_resume(struct ufs_hba *hba)
 	int ret = 0;
 	ktime_t start = ktime_get();

-	if (!hba)
-		return -EINVAL;
-
-	if (unlikely(early_suspend)) {
-		early_suspend = false;
-		down(&hba->host_sem);
-	}
-
 	if (!hba->is_powered || pm_runtime_suspended(hba->dev))
 		/*
 		 * Let the runtime resume take care of resuming
@@ -9074,9 +9059,6 @@ int ufshcd_runtime_suspend(struct ufs_hba *hba)
 	int ret = 0;
 	ktime_t start = ktime_get();

-	if (!hba)
-		return -EINVAL;
-
 	if (!hba->is_powered)
 		goto out;
 	else
@@ -9115,9 +9097,6 @@ int ufshcd_runtime_resume(struct ufs_hba *hba)
 	int ret = 0;
 	ktime_t start = ktime_get();

-	if (!hba)
-		return -EINVAL;
-
 	if (!hba->is_powered)
 		goto out;
 	else
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v19 3/3] scsi: ufs: Prepare HPB read for cached sub-region
On 2021-02-09 22:21, Bean Huo wrote: On Tue, 2021-02-09 at 13:25 +, Avri Altman wrote: > > > > > > + put_unaligned_be64(ppn, [6]); > > > > > > You are assuming the HPB entries read out by "HPB Read Buffer" > > > cmd > > > are > > > in Little > > > Endian, which is why you are using put_unaligned_be64 here. > > > However, > > > this assumption > > > is not right for all the other flash vendors - HPB entries read > > > out > > > by > > > "HPB Read Buffer" > > > cmd may come in Big Endian, if so, their random read > > > performance are > > > screwed. > > > > For this question, it is very hard to make a correct format since > > the > > Spec doesn't give a clear definition. Should we have a default > > format, > > if there is conflict, and then add quirk or add a vendor-specific > > table? > > > > Hi Avri > > Do you have a good idea? > > I don't know. Better let Daejun answer this. > This was working for me for both Galaxy S20 (Exynos) as well as > Xiaomi Mi10 > (8250). As for the endianity issue - I don't think that any fix is needed in the hpb driver. It is readily seen that the ppn from get_ppn, and the one in the upiu cdb (upiu trace) are identical. Therefore, if an issue exist, it is IMHO a device issue. kworker/u16:10-315 [001] d..262.283264: ufshpb_get_ppn: Avri ppn 480d2f8244c21abd kworker/u16:10-315 [001] d..262.283336: ufshcd_upiu: v:1.10 send: T:62283314922, HDR:0140, CDB:882ddaac480d2f8244c21abd0100, D: Again, verified on both gs20 (exynos) and mi10 (8250). Thanks, Avri Hi Avri, Your testing method is no problem, the current problem is in function ufshpb_get_ppn(). +static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb, + struct ufshpb_map_ctx *mctx, int pos, int *error) +{ + u64 *ppn_table; + struct page *page; + int index, offset; + + index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE); + offset = pos % (PAGE_SIZE / HPB_ENTRY_SIZE); + + page = mctx->m_page[index]; + if (unlikely(!page)) { + *error = -ENOMEM; + dev_err(>sdev_ufs_lu->sdev_dev, + "error. 
cannot find page in mctx\n");
+		return 0;
+	}
+
+	ppn_table = page_address(page);
+	if (unlikely(!ppn_table)) {
+		*error = -ENOMEM;
+		dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+			"error. cannot get ppn_table\n");
+		return 0;
+	}
+
+	return ppn_table[offset];
+}

Say the UFS device outputs the L2P entry in big-endian, which means the most significant byte of an L2P entry is output first, followed by the less significant bytes. Let's take one L2P entry as an example:

0x 12 34 56 78 90 12 34 56

0x12 is the most significant byte and will be stored at the lowest address in the L2P cache, e.g.:

F008: 1234 5678 9012 3456

On an ARM-based (little-endian) system, if we use "return ppn_table[offset]", the original L2P entry 0x1234 5678 9012 3456 is converted to 0x5634 1290 7856 3412. After put_unaligned_be64(), the UFS device then receives an unexpected L2P entry (L2P entry miss). So if the UFS device outputs L2P entries in big-endian, this is a problem.

If the UFS device outputs L2P entries in little-endian, there is no problem, because the L2P entry in memory is:

F008: 5634 1290 7856 3412

After "return ppn_table[offset]", the L2P entry is the correct one, 0x1234567890123456. After put_unaligned_be64(), the UFS device receives the expected L2P entry (L2P entry hit).

We need to figure out which L2P entry output endianness JEDEC recommends. Otherwise the two methods will co-exist in the HPB driver, which will confuse customers. If you have a look at JEDEC HPB 2.0, it seems that big-endian is the correct way. You and Daejun need to double-check this inside your company.

Bean is right, finally you know what I was saying... We need to fix it before moving on - all the UFS3.1 HPB parts which I tested over the last few weeks are screwed due to this... I don't care where/how you want to get it fixed in the next version. In my case, which may not be a valid fix, I simply hacked the code as below and it works for me.

-	put_unaligned_be64(ppn, &cdb[6]);
+	memcpy(&cdb[6], &ppn, sizeof(u64));

Thanks,
Can Guo.

thanks,
Bean
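The byte-order argument above can be replayed in userspace. The following is only a sketch, not driver code: load_le64(), store_be64() and store_le64() are hypothetical helpers that model a native u64 load on a little-endian CPU, put_unaligned_be64(), and memcpy() on a little-endian CPU respectively. It also assumes, as the thread does, that the device expects to get back exactly the bytes it emitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models "return ppn_table[offset]" on a little-endian CPU (assumption). */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Userspace stand-in for put_unaligned_be64(): most significant byte first. */
static void store_be64(uint64_t v, uint8_t *p)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Models memcpy(&cdb[6], &ppn, 8) on a little-endian CPU (assumption). */
static void store_le64(uint64_t v, uint8_t *p)
{
	for (int i = 0; i < 8; i++) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/*
 * Returns 1 if the bytes placed in the CDB equal the bytes the device
 * originally wrote into the L2P cache, 0 otherwise.
 */
static int cdb_matches_entry(const uint8_t entry[8], int use_be_store)
{
	uint8_t cdb_ppn[8];
	uint64_t ppn = load_le64(entry);

	if (use_be_store)
		store_be64(ppn, cdb_ppn);
	else
		store_le64(ppn, cdb_ppn);

	return memcmp(cdb_ppn, entry, sizeof(cdb_ppn)) == 0;
}
```

With the example entry 12 34 56 78 90 12 34 56, the big-endian store sends the bytes reversed, while the memcpy-style store sends them unchanged, which is consistent with the hack reported to work above. Which of the two a given device actually expects is exactly the open question in this thread.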
Re: [PATCH v19 2/3] scsi: ufs: L2P map management for HPB read
On 2021-02-09 09:27, Daejun Park wrote:
> @@ -342,13 +1208,14 @@ void ufshpb_suspend(struct ufs_hba *hba)
>  	struct scsi_device *sdev;
>
>  	shost_for_each_device(sdev, hba->host) {
> -		hpb = sdev->hostdata;
> +		hpb = ufshpb_get_hpb_data(sdev);
>  		if (!hpb)
>  			continue;
>
>  		if (ufshpb_get_state(hpb) != HPB_PRESENT)
>  			continue;
>  		ufshpb_set_state(hpb, HPB_SUSPEND);
> +		ufshpb_cancel_jobs(hpb);

There may be a deadlock problem here - in the case of runtime suspend, when ufshpb_suspend() is invoked, all of hba's child scsi devices are in RPM_SUSPENDED state. When this line tries to cancel a running map work, i.e. when ufshpb_get_map_req() calls the line below, it will be stuck at blk_queue_enter().

req = blk_get_request(hpb->sdev_ufs_lu->request_queue, REQ_OP_SCSI_IN, 0);

Please check block layer power management, and see also commit d55d15a33 ("scsi: block: Do not accept any requests while suspended").

I agree with your comment. How about adding the BLK_MQ_REQ_NOWAIT flag to blk_get_request() to avoid the hang?

That won't work - BLK_MQ_REQ_NOWAIT allows one to fast fail from blk_mq_get_tag(), but blk_queue_enter() comes before __blk_mq_alloc_request();

In blk_queue_enter(), the BLK_MQ_REQ_NOWAIT flag makes it return an error rather than wait for rpm resume. Please refer to the following code.

Oops, sorry, my memory needs to be refreshed on that part. But will the BLK_MQ_REQ_NOWAIT flag break your original purpose? When runtime suspend is out of the picture, if traffic is heavy on the request queue, map_work() will be stopped frequently whenever it is not able to get a request from the queue - that will pull down the efficiency of one map_work(), which may hurt random performance...

I think deadlock prevention is the most important. So I want to add the BLK_MQ_REQ_NOWAIT flag. Starvation of map requests can be distinguished by the return value of blk_get_request(). -EWOULDBLOCK means there are no available tags for this request. -EBUSY means blk_queue_enter() failed.
To overcome starvation of map requests, we can retry N times in heavy traffic situations (maybe N=3?).

LGTM. You make the call.

Regards,
Can Guo.

Thanks,
Daejun
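The retry idea proposed above could look roughly like the sketch below. It is a userspace model only: fake_get_request() and struct fake_queue are hypothetical stand-ins for blk_get_request() with BLK_MQ_REQ_NOWAIT, and the retry count of 3 is simply the "maybe N=3?" from the thread. It retries on -EWOULDBLOCK (no free tag) and gives up immediately on -EBUSY (queue not enterable, e.g. runtime suspended).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAP_REQ_RETRIES 3	/* assumption: the "maybe N=3?" suggested above */

/* Hypothetical request queue whose allocation results are scripted. */
struct fake_queue {
	const int *results;	/* return values for successive attempts */
	size_t nr_results;
	size_t calls;		/* number of allocation attempts made so far */
};

/* Stand-in for a non-blocking request allocation: 0 or a negative errno. */
static int fake_get_request(struct fake_queue *q)
{
	int ret = q->calls < q->nr_results ? q->results[q->calls] : 0;

	q->calls++;
	return ret;
}

/*
 * Retry only when the failure means "no tag available right now".
 * -EBUSY means the queue could not be entered, so retrying in a
 * tight loop would not help and the caller should back off.
 */
static int get_map_req_with_retry(struct fake_queue *q)
{
	int ret = -EWOULDBLOCK;

	for (int i = 0; i < MAP_REQ_RETRIES; i++) {
		ret = fake_get_request(q);
		if (ret != -EWOULDBLOCK)
			break;
	}
	return ret;
}
```

A caller would treat a final -EWOULDBLOCK as "leave the sub-region on the to-do list and try again on the next map_work run"; that policy is an assumption of this sketch, not something the thread settles.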
Re: [PATCH v19 3/3] scsi: ufs: Prepare HPB read for cached sub-region
On 2021-02-08 16:16, Bean Huo wrote:
On Fri, 2021-02-05 at 11:29 +0800, Can Guo wrote:
> +	return ppn_table[offset];
> +}
> +
> +static void
> +ufshpb_get_pos_from_lpn(struct ufshpb_lu *hpb, unsigned long lpn, int *rgn_idx,
> +			int *srgn_idx, int *offset)
> +{
> +	int rgn_offset;
> +
> +	*rgn_idx = lpn >> hpb->entries_per_rgn_shift;
> +	rgn_offset = lpn & hpb->entries_per_rgn_mask;
> +	*srgn_idx = rgn_offset >> hpb->entries_per_srgn_shift;
> +	*offset = rgn_offset & hpb->entries_per_srgn_mask;
> +}
> +
> +static void
> +ufshpb_set_hpb_read_to_upiu(struct ufshpb_lu *hpb, struct ufshcd_lrb *lrbp,
> +			    u32 lpn, u64 ppn, unsigned int transfer_len)
> +{
> +	unsigned char *cdb = lrbp->cmd->cmnd;
> +
> +	cdb[0] = UFSHPB_READ;
> +
> +	put_unaligned_be64(ppn, &cdb[6]);

You are assuming the HPB entries read out by the "HPB Read Buffer" cmd are in Little Endian, which is why you are using put_unaligned_be64 here.

Actually, using put_unaligned_be64 here is no problem. A SCSI command should be filled in big-endian. I think the problem is in getting the ppn from the HPB cache in ufshpb_get_ppn().

whatever...
...
e001f: 12 34 56 78 90 fa de ef
...

+
+static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb,
+			  struct ufshpb_map_ctx *mctx, int pos, int *error)
+{
+	u64 *ppn_table;		// It's a 64-bit pointer
+	struct page *page;
+	int index, offset;
+
+	index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE);
+	offset = pos % (PAGE_SIZE / HPB_ENTRY_SIZE);
+
+	page = mctx->m_page[index];
+	if (unlikely(!page)) {
+		*error = -ENOMEM;
+		dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+			"error. cannot find page in mctx\n");
+		return 0;
+	}
+
+	ppn_table = page_address(page);
+	if (unlikely(!ppn_table)) {
+		*error = -ENOMEM;
+		dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+			"error. cannot get ppn_table\n");
+		return 0;
+	}
+
+	return ppn_table[offset];
+}

this assumption is not right for all the other flash vendors - HPB entries read out by the "HPB Read Buffer" cmd may come in Big Endian; if so, their random read performance is screwed.

Actually, I have seen at least two flash vendors behaving this way. I had to modify this line to get the code to work properly on my setups. Meanwhile, in your cover letter, you mentioned that the performance data was collected on a UFS2.1 device. Please re-collect the data on a real UFS3.1 device and let me know the part number. Otherwise, the data is not quite convincing to us.

Regards,
Can Guo.
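For readers following the quoted ufshpb_get_pos_from_lpn(), the shift/mask decomposition can be tried out in isolation. This is a standalone sketch: struct hpb_geom is a hypothetical cut-down of struct ufshpb_lu, and the geometry used in the note below (4096 entries per region, 1024 entries per sub-region) is an assumed example, not a value taken from the patch.

```c
#include <assert.h>

/* Hypothetical subset of struct ufshpb_lu holding only the geometry fields. */
struct hpb_geom {
	unsigned int entries_per_rgn_shift;
	unsigned long entries_per_rgn_mask;
	unsigned int entries_per_srgn_shift;
	unsigned long entries_per_srgn_mask;
};

/*
 * Same arithmetic as the quoted helper: split a logical page number into
 * region index, sub-region index within the region, and entry offset
 * within the sub-region. Works because both sizes are powers of two.
 */
static void get_pos_from_lpn(const struct hpb_geom *g, unsigned long lpn,
			     int *rgn_idx, int *srgn_idx, int *offset)
{
	unsigned long rgn_offset;

	*rgn_idx = (int)(lpn >> g->entries_per_rgn_shift);
	rgn_offset = lpn & g->entries_per_rgn_mask;
	*srgn_idx = (int)(rgn_offset >> g->entries_per_srgn_shift);
	*offset = (int)(rgn_offset & g->entries_per_srgn_mask);
}
```

With shifts of 12 and 10 and masks of 4095 and 1023, lpn 14343 (3 * 4096 + 2 * 1024 + 7) lands in region 3, sub-region 2, offset 7.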
Re: [PATCH v19 2/3] scsi: ufs: L2P map management for HPB read
On 2021-02-08 16:53, Daejun Park wrote:
> @@ -342,13 +1208,14 @@ void ufshpb_suspend(struct ufs_hba *hba)
>  	struct scsi_device *sdev;
>
>  	shost_for_each_device(sdev, hba->host) {
> -		hpb = sdev->hostdata;
> +		hpb = ufshpb_get_hpb_data(sdev);
>  		if (!hpb)
>  			continue;
>
>  		if (ufshpb_get_state(hpb) != HPB_PRESENT)
>  			continue;
>  		ufshpb_set_state(hpb, HPB_SUSPEND);
> +		ufshpb_cancel_jobs(hpb);

There may be a deadlock problem here - in the case of runtime suspend, when ufshpb_suspend() is invoked, all of hba's child scsi devices are in RPM_SUSPENDED state. When this line tries to cancel a running map work, i.e. when ufshpb_get_map_req() calls the line below, it will be stuck at blk_queue_enter().

req = blk_get_request(hpb->sdev_ufs_lu->request_queue, REQ_OP_SCSI_IN, 0);

Please check block layer power management, and see also commit d55d15a33 ("scsi: block: Do not accept any requests while suspended").

I agree with your comment. How about adding the BLK_MQ_REQ_NOWAIT flag to blk_get_request() to avoid the hang?

That won't work - BLK_MQ_REQ_NOWAIT allows one to fast fail from blk_mq_get_tag(), but blk_queue_enter() comes before __blk_mq_alloc_request();

In blk_queue_enter(), the BLK_MQ_REQ_NOWAIT flag makes it return an error rather than wait for rpm resume. Please refer to the following code.

Oops, sorry, my memory needs to be refreshed on that part. But will the BLK_MQ_REQ_NOWAIT flag break your original purpose? When runtime suspend is out of the picture, if traffic is heavy on the request queue, map_work() will be stopped frequently whenever it is not able to get a request from the queue - that will pull down the efficiency of one map_work(), which may hurt random performance...

Can Guo.

int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
{
	const bool pm = flags & BLK_MQ_REQ_PM;

	while (true) {
		bool success = false;

		rcu_read_lock();
		if (percpu_ref_tryget_live(&q->q_usage_counter)) {
			/*
			 * The code that increments the pm_only counter is
			 * responsible for ensuring that that counter is
			 * globally visible before the queue is unfrozen.
			 */
			if ((pm && queue_rpm_status(q) != RPM_SUSPENDED) ||
			    !blk_queue_pm_only(q)) {
				success = true;
			} else {
				percpu_ref_put(&q->q_usage_counter);
			}
		}
		rcu_read_unlock();

		if (success)
			return 0;

		if (flags & BLK_MQ_REQ_NOWAIT)
			return -EBUSY;	<-- out from the function.

Thanks,
Daejun
Re: [PATCH v19 2/3] scsi: ufs: L2P map management for HPB read
On 2021-02-08 16:03, Daejun Park wrote:
> @@ -342,13 +1208,14 @@ void ufshpb_suspend(struct ufs_hba *hba)
>  	struct scsi_device *sdev;
>
>  	shost_for_each_device(sdev, hba->host) {
> -		hpb = sdev->hostdata;
> +		hpb = ufshpb_get_hpb_data(sdev);
>  		if (!hpb)
>  			continue;
>
>  		if (ufshpb_get_state(hpb) != HPB_PRESENT)
>  			continue;
>  		ufshpb_set_state(hpb, HPB_SUSPEND);
> +		ufshpb_cancel_jobs(hpb);

There may be a deadlock problem here - in the case of runtime suspend, when ufshpb_suspend() is invoked, all of hba's child scsi devices are in RPM_SUSPENDED state. When this line tries to cancel a running map work, i.e. when ufshpb_get_map_req() calls the line below, it will be stuck at blk_queue_enter().

req = blk_get_request(hpb->sdev_ufs_lu->request_queue, REQ_OP_SCSI_IN, 0);

Please check block layer power management, and see also commit d55d15a33 ("scsi: block: Do not accept any requests while suspended").

I agree with your comment. How about adding the BLK_MQ_REQ_NOWAIT flag to blk_get_request() to avoid the hang?

That won't work - BLK_MQ_REQ_NOWAIT allows one to fast fail from blk_mq_get_tag(), but blk_queue_enter() comes before __blk_mq_alloc_request();

Regards,
Can Guo.

Thanks,
Daejun
Re: [PATCH v19 2/3] scsi: ufs: L2P map management for HPB read
On 2021-01-29 13:30, Daejun Park wrote:

This is a patch for managing the L2P map in the HPB module.

The HPB divides logical addresses into several regions. A region consists of several sub-regions. The sub-region is the basic unit in which L2P mapping is managed. The driver loads the L2P mapping data of each sub-region. A loaded sub-region is said to be in active-state. The HPB driver unloads L2P mapping data per region. An unloaded region is said to be in inactive-state.

Sub-region/region candidates to be loaded and unloaded are delivered from the UFS device. The UFS device delivers the recommended active sub-regions and inactive regions to the driver using sense data. The HPB module performs L2P mapping management on the host through the delivered information.

A pinned region is a pre-set region on the UFS device that is always in active-state.

The data structures for map data requests and the L2P map use the mempool API, minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented as a module parameter, so that it is configurable by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096

The map_work manages active/inactive by 2 "to-do" lists. Each hpb lun maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.
Reviewed-by: Bart Van Assche Reviewed-by: Can Guo Acked-by: Avri Altman Tested-by: Bean Huo Signed-off-by: Daejun Park --- drivers/scsi/ufs/ufs.h| 36 ++ drivers/scsi/ufs/ufshcd.c | 4 + drivers/scsi/ufs/ufshpb.c | 993 +- drivers/scsi/ufs/ufshpb.h | 65 +++ 4 files changed, 1083 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h index 65563635e20e..075c12e7de7e 100644 --- a/drivers/scsi/ufs/ufs.h +++ b/drivers/scsi/ufs/ufs.h @@ -472,6 +472,41 @@ struct utp_cmd_rsp { u8 sense_data[UFS_SENSE_SIZE]; }; +struct ufshpb_active_field { + __be16 active_rgn; + __be16 active_srgn; +}; +#define HPB_ACT_FIELD_SIZE 4 + +/** + * struct utp_hpb_rsp - Response UPIU structure + * @residual_transfer_count: Residual transfer count DW-3 + * @reserved1: Reserved double words DW-4 to DW-7 + * @sense_data_len: Sense data length DW-8 U16 + * @desc_type: Descriptor type of sense data + * @additional_len: Additional length of sense data + * @hpb_op: HPB operation type + * @reserved2: Reserved field + * @active_rgn_cnt: Active region count + * @inactive_rgn_cnt: Inactive region count + * @hpb_active_field: Recommended to read HPB region and subregion + * @hpb_inactive_field: To be inactivated HPB region and subregion + */ +struct utp_hpb_rsp { + __be32 residual_transfer_count; + __be32 reserved1[4]; + __be16 sense_data_len; + u8 desc_type; + u8 additional_len; + u8 hpb_op; + u8 reserved2; + u8 active_rgn_cnt; + u8 inactive_rgn_cnt; + struct ufshpb_active_field hpb_active_field[2]; + __be16 hpb_inactive_field[2]; +}; +#define UTP_HPB_RSP_SIZE 40 + /** * struct utp_upiu_rsp - general upiu response structure * @header: UPIU header structure DW-0 to DW-2 @@ -482,6 +517,7 @@ struct utp_upiu_rsp { struct utp_upiu_header header; union { struct utp_cmd_rsp sr; + struct utp_hpb_rsp hr; struct utp_upiu_query qr; }; }; diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index b8d6a52f5603..52e48de8d27c 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ 
b/drivers/scsi/ufs/ufshcd.c @@ -5018,6 +5018,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) */ pm_runtime_get_noresume(hba->dev); } + + if (scsi_status == SAM_STAT_GOOD) + ufshpb_rsp_upiu(hba, lrbp); break; case UPIU_TRANSACTION_REJECT_UPIU: /* TODO: handle Reject UPIU Response */ @@ -9228,6 +9231,7 @@ EXPORT_SYMBOL(ufshcd_shutdown); void ufshcd_remove(struct ufs_hba *hba) { ufs_bsg_remove(hba); + ufshpb_remove(hba); ufs_sysfs_remove_nodes(hba->dev); blk_cleanup_queue(hba->tmf_queue); blk_mq_free_tag_set(>tmf_tag_set); diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 1f84141ed384..48edfdd0f606 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -16,11 +16,73 @@ #include "ufshpb.h" #include "../sd.h" +/* memory management */ +static struct kmem_cache *ufshpb_mctx_cache; +static mempool_t *ufshpb_mctx_pool; +static mempool_t *ufshpb_page_pool; +/* A cache size of 2MB can cache ppn in the 1GB range. */ +static unsigned int ufshpb_host_map_kbytes =
Re: [PATCH v2 6/9] scsi: ufshpb: Add hpb dev reset response
On 2021-02-02 16:30, Avri Altman wrote: The spec does not define what is the host's recommended response when the device send hpb dev reset response (oper 0x2). We will update all active hpb regions: mark them and do that on the next read. Signed-off-by: Avri Altman --- drivers/scsi/ufs/ufshpb.c | 54 --- drivers/scsi/ufs/ufshpb.h | 1 + 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c index 49c74de539b7..28e0025507a1 100644 --- a/drivers/scsi/ufs/ufshpb.c +++ b/drivers/scsi/ufs/ufshpb.c @@ -17,6 +17,7 @@ #include "../sd.h" #define WORK_PENDING 0 +#define RESET_PENDING 1 #define ACTIVATION_THRSHLD 4 /* 4 IOs */ #define EVICTION_THRSHLD (ACTIVATION_THRSHLD << 6) /* 256 IOs */ @@ -349,7 +350,8 @@ void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) if (rgn->reads == ACTIVATION_THRSHLD) activate = true; spin_unlock_irqrestore(>rgn_lock, flags); - if (activate) { + if (activate || + test_and_clear_bit(RGN_FLAG_UPDATE, >rgn_flags)) { spin_lock_irqsave(>rsp_list_lock, flags); ufshpb_update_active_info(hpb, rgn_idx, srgn_idx); hpb->stats.rb_active_cnt++; @@ -1068,6 +1070,24 @@ void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) case HPB_RSP_DEV_RESET: dev_warn(>sdev_ufs_lu->sdev_dev, "UFS device lost HPB information during PM.\n"); + + if (hpb->is_hcm) { + struct ufshpb_lu *h; + struct scsi_device *sdev; + + shost_for_each_device(sdev, hba->host) { I haven't test it yet, but this line shall cause recursive spin lock - in current code base, ufshpb_rsp_upiu() is called with host_lock held. Regards, Can Guo. 
+ h = sdev->hostdata; + if (!h) + continue; + + if (test_and_set_bit(RESET_PENDING, +>work_data_bits)) + continue; + + schedule_work(>ufshpb_lun_reset_work); + } + } + break; default: dev_notice(>sdev_ufs_lu->sdev_dev, @@ -1200,6 +1220,27 @@ static void ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb) spin_unlock_irqrestore(>rsp_list_lock, flags); } +static void ufshpb_reset_work_handler(struct work_struct *work) +{ + struct ufshpb_lu *hpb; + struct victim_select_info *lru_info; + struct ufshpb_region *rgn; + unsigned long flags; + + hpb = container_of(work, struct ufshpb_lu, ufshpb_lun_reset_work); + + lru_info = >lru_info; + + spin_lock_irqsave(>rgn_state_lock, flags); + + list_for_each_entry(rgn, _info->lh_lru_rgn, list_lru_rgn) + set_bit(RGN_FLAG_UPDATE, >rgn_flags); + + spin_unlock_irqrestore(>rgn_state_lock, flags); + + clear_bit(RESET_PENDING, >work_data_bits); +} + static void ufshpb_normalization_work_handler(struct work_struct *work) { struct ufshpb_lu *hpb; @@ -1392,6 +1433,8 @@ static int ufshpb_alloc_region_tbl(struct ufs_hba *hba, struct ufshpb_lu *hpb) } else { rgn->rgn_state = HPB_RGN_INACTIVE; } + + rgn->rgn_flags = 0; } return 0; @@ -1502,9 +1545,12 @@ static int ufshpb_lu_hpb_init(struct ufs_hba *hba, struct ufshpb_lu *hpb) INIT_LIST_HEAD(>list_hpb_lu); INIT_WORK(>map_work, ufshpb_map_work_handler); - if (hpb->is_hcm) + if (hpb->is_hcm) { INIT_WORK(>ufshpb_normalization_work, ufshpb_normalization_work_handler); + INIT_WORK(>ufshpb_lun_reset_work, + ufshpb_reset_work_handler); + } hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache", sizeof(struct ufshpb_req), 0, 0, NULL); @@ -1591,8 +1637,10 @@ static void ufshpb_discard_rsp_lists(struct ufshpb_lu *hpb) static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb) { - if (hpb->is_hcm) + if (hpb->is_hcm) { + cancel_work_sync(>ufshpb_lun_reset_work); cancel_work_sync(>ufshpb_normalization_work); + } cancel_work_sync(>map_work); } diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h 
index 71b082ee7876..e55892ceb3fc 100644 --- a/drivers/scsi/ufs/ufshpb.h +++ b/drivers/scsi/ufs/ufshpb.h @@ -184,6 +184,7 @@ struct ufshpb_lu { /* for sel
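
The recursive-lock concern raised in the review of [PATCH v2 6/9] above can be modelled in a few lines. This is a toy userspace sketch, not kernel code: struct toy_lock reports -EDEADLK where a real non-recursive spinlock would simply hang, iterate_devices() stands in for shost_for_each_device() (which, to my understanding, takes the host lock inside the iterator), and the deferred variant is just one possible way out, not necessarily what the author chose.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second acquire fails instead of spinning forever. */
struct toy_lock {
	bool held;
};

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Models the device iterator: it takes host_lock itself for every step. */
static int iterate_devices(struct toy_lock *host_lock, int nr_devs, int *visited)
{
	for (int i = 0; i < nr_devs; i++) {
		int ret = toy_lock_acquire(host_lock);

		if (ret)
			return ret;
		(*visited)++;
		toy_lock_release(host_lock);
	}
	return 0;
}

/* Shape of the quoted hunk: iterate while the caller already holds host_lock. */
static int rsp_upiu_inline(struct toy_lock *host_lock, int nr_devs, int *visited)
{
	int ret;

	toy_lock_acquire(host_lock);	/* completion path runs with host_lock held */
	ret = iterate_devices(host_lock, nr_devs, visited);
	toy_lock_release(host_lock);
	return ret;
}

/* Alternative: only record the event under the lock ... */
static void rsp_upiu_deferred(struct toy_lock *host_lock, bool *reset_pending)
{
	toy_lock_acquire(host_lock);
	*reset_pending = true;
	toy_lock_release(host_lock);
}

/* ... and walk the devices later from a worker that does not hold the lock. */
static int reset_worker(struct toy_lock *host_lock, bool *reset_pending,
			int nr_devs, int *visited)
{
	if (!*reset_pending)
		return 0;
	*reset_pending = false;
	return iterate_devices(host_lock, nr_devs, visited);
}
```

In this model the inline variant fails on the first device, while the deferred variant visits all of them once the worker runs.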
Re: [PATCH v19 3/3] scsi: ufs: Prepare HPB read for cached sub-region
On 2021-02-05 23:08, Bean Huo wrote: On Fri, 2021-02-05 at 14:06 +, Avri Altman wrote: > > > + put_unaligned_be64(ppn, [6]); > > > > You are assuming the HPB entries read out by "HPB Read Buffer" > > cmd > > are > > in Little > > Endian, which is why you are using put_unaligned_be64 here. > > However, > > this assumption > > is not right for all the other flash vendors - HPB entries read > > out > > by > > "HPB Read Buffer" > > cmd may come in Big Endian, if so, their random read performance > > are > > screwed. > > For this question, it is very hard to make a correct format since > the > Spec doesn't give a clear definition. Should we have a default > format, > if there is conflict, and then add quirk or add a vendor-specific > table? > > Hi Avri > Do you have a good idea? I don't know. Better let Daejun answer this. This was working for me for both Galaxy S20 (Exynos) as well as Xiaomi Mi10 (8250). Thanks, I tested Daejun's patchset before, it is also ok (I don't know which version patchset). maybe we can keep current implementation as default, then if there is conflict, and submit the quirk. Yeah, you've tested it, are you sure that Micron's UFS devices are OK with this specific code line? Micron UFS FW team has confirmed that Micron's HPB entries read out by "HPB Buffer Read" cmd are in big-endian byte ordering. If Micron FW team is right, I am pretty sure that you would have seen random read performance regression on Micron UFS devices caused by invalid HPB entry format in HPB Read cmd UPIU (which leads to L2P cache miss in device side all the time) during your test. Can Guo. Thanks, Bean Thanks, Avri
Re: [PATCH v3 3/3] scsi: ufs: Fix wrong Task Tag used in task management request UPIUs
On 2021-02-01 10:39, Bart Van Assche wrote:
> On 1/28/21 9:57 PM, Can Guo wrote:
>> On 2021-01-29 11:15, Bart Van Assche wrote:
>>> On 1/27/21 8:16 PM, Can Guo wrote:
>>>> In __ufshcd_issue_tm_cmd(), it is not right to use hba->nutrs + req->tag
>>>> as the Task Tag in one TMR UPIU. Directly use req->tag as the Task Tag.
>>>
>>> Why is the current code wrong and why is this patch the proper fix?
>>> Please explain this in the patch description.
>>
>> req->tag is the tag allocated for one TMR, no?
>
> Hi Can,
>
> Commit e293313262d3 ("scsi: ufs: Fix broken task management command
> implementation") includes the following changes:
>
> +	task_tag = hba->nutrs + free_slot;
>  	task_req_upiup->header.dword_0 =
>  		UPIU_HEADER_DWORD(UPIU_TRANSACTION_TASK_REQ, 0,
> -				lrbp->lun, lrbp->task_tag);
> +				lun_id, task_tag);
>  	task_req_upiup->header.dword_1 =
>  		UPIU_HEADER_DWORD(0, tm_function, 0, 0);
>
> As one can see the value written in dword_0 starts at hba->nutrs. Was
> that code correct? If that code was correct, does your patch perhaps
> break task management support?

That code is wrong. The Task Tag in Dword_0 should be the real tag we allocated for the TMR. The Task Tag of the transfer request which we are trying to abort is given in Dword_5, which is Input Parameter 2 of the TMR UPIU. I am not sure why the author gave hba->nutrs + req->tag as the Task Tag of one TMR; the commit message about this part is not quite informative.

Table 10.22 - Task Management Request UPIU

TASK MANAGEMENT REQUEST UPIU
------------------------------------------------
| 0          | 1       | 2       | 3           |
------------------------------------------------
| xx00 0100b | Flags   | LUN     | Task Tag    |
------------------------------------------------
...
------------------------------------------------
| 16 (MSB)   | 17      | 18      | 19 (LSB)    |
------------------------------------------------
|              Input Parameter 2               |
------------------------------------------------

Table 10.24 - Task Management Input Parameters

Field              Description
Input Parameter 2  LSB: Task Tag of the task/command operated by the
                   task management function.

Thanks,
Can Guo.

> Thanks,
> Bart.
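
The distinction drawn above, between the TMR's own Task Tag in byte 3 and the tag of the command being acted on in Input Parameter 2, can be written down byte by byte. The sketch below is an illustration only: fill_tmr_upiu() is a hypothetical helper working on a raw byte buffer rather than the driver's struct utp_upiu_req, the 32-byte size is an assumption, and only the fields mentioned in the quoted diff and tables are filled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TMR_UPIU_SIZE			32	/* assumed size of the request UPIU */
#define UPIU_TRANSACTION_TASK_REQ	0x04	/* "xx00 0100b" in the table above */

/*
 * Fill a Task Management Request UPIU as a raw big-endian byte buffer.
 * tmr_tag:    the tag allocated for this TMR itself (byte 3)
 * target_tag: the tag of the transfer request being operated on,
 *             carried in the LSB of Input Parameter 2 (byte 19)
 */
static void fill_tmr_upiu(uint8_t upiu[TMR_UPIU_SIZE], uint8_t lun,
			  uint8_t tm_function, uint8_t tmr_tag,
			  uint8_t target_tag)
{
	memset(upiu, 0, TMR_UPIU_SIZE);

	upiu[0] = UPIU_TRANSACTION_TASK_REQ;	/* dword_0: transaction type */
	upiu[2] = lun;				/* dword_0: LUN */
	upiu[3] = tmr_tag;			/* dword_0: Task Tag of the TMR */
	upiu[5] = tm_function;			/* dword_1: TM function */
	upiu[19] = target_tag;			/* Input Parameter 2, LSB */
}
```

Under this layout, adding hba->nutrs to the value placed in byte 3 would change the TMR's own tag, while the tag of the command to abort is untouched in byte 19.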
Re: [PATCH v19 3/3] scsi: ufs: Prepare HPB read for cached sub-region
On 2021-01-29 13:30, Daejun Park wrote:
> This patch changes the read I/O to the HPB read I/O.
>
> If the logical address of the read I/O belongs to active sub-region, the
> HPB driver modifies the read I/O command to HPB read. It modifies the
> UPIU command of UFS instead of modifying the existing SCSI command.
>
> In the HPB version 1.0, the maximum read I/O size that can be converted
> to HPB read is 4KB.
>
> The dirty map of the active sub-region prevents an incorrect HPB read
> that has stale physical page number which is updated by previous write
> I/O.
>
> Reviewed-by: Can Guo
> Reviewed-by: Bart Van Assche
> Acked-by: Avri Altman
> Tested-by: Bean Huo
> Signed-off-by: Daejun Park
> ---
>  drivers/scsi/ufs/ufshcd.c |   2 +
>  drivers/scsi/ufs/ufshpb.c | 234 ++
>  drivers/scsi/ufs/ufshpb.h |   2 +
>  3 files changed, 238 insertions(+)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 52e48de8d27c..37cb343e9ec1 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -2653,6 +2653,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
>
>  	lrbp->req_abort_skip = false;
>
> +	ufshpb_prep(hba, lrbp);
> +
>  	ufshcd_comp_scsi_upiu(hba, lrbp);
>
>  	err = ufshcd_map_sg(hba, lrbp);
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index 48edfdd0f606..73e7b3ed04a4 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -31,6 +31,29 @@ bool ufshpb_is_allowed(struct ufs_hba *hba)
>  	return !(hba->ufshpb_dev.hpb_disabled);
>  }
>
> +static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
> +				struct ufshpb_subregion *srgn)
> +{
> +	return rgn->rgn_state != HPB_RGN_INACTIVE &&
> +		srgn->srgn_state == HPB_SRGN_VALID;
> +}
> +
> +static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
> +{
> +	return req_op(cmd->request) == REQ_OP_READ;
> +}
> +
> +static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
> +{
> +	return op_is_write(req_op(cmd->request)) ||
> +		op_is_discard(req_op(cmd->request));
> +}
> +
> +static bool ufshpb_is_support_chunk(int transfer_len)
> +{
> +	return transfer_len <= HPB_MULTI_CHUNK_HIGH;
> +}
> +
>  static bool ufshpb_is_general_lun(int lun)
>  {
>  	return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
> @@ -98,6 +121,217 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state)
>  	atomic_set(&hpb->hpb_state, state);
>  }
>
> +static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
> +				 int srgn_idx, int srgn_offset, int cnt)
> +{
> +	struct ufshpb_region *rgn;
> +	struct ufshpb_subregion *srgn;
> +	int set_bit_len;
> +	int bitmap_len = hpb->entries_per_srgn;
> +
> +next_srgn:
> +	rgn = hpb->rgn_tbl + rgn_idx;
> +	srgn = rgn->srgn_tbl + srgn_idx;
> +
> +	if ((srgn_offset + cnt) > bitmap_len)
> +		set_bit_len = bitmap_len - srgn_offset;
> +	else
> +		set_bit_len = cnt;
> +
> +	if (rgn->rgn_state != HPB_RGN_INACTIVE &&
> +	    srgn->srgn_state == HPB_SRGN_VALID)
> +		bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
> +
> +	srgn_offset = 0;
> +	if (++srgn_idx == hpb->srgns_per_rgn) {
> +		srgn_idx = 0;
> +		rgn_idx++;
> +	}
> +
> +	cnt -= set_bit_len;
> +	if (cnt > 0)
> +		goto next_srgn;
> +
> +	WARN_ON(cnt < 0);
> +}
> +
> +static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
> +				  int srgn_idx, int srgn_offset, int cnt)
> +{
> +	struct ufshpb_region *rgn;
> +	struct ufshpb_subregion *srgn;
> +	int bitmap_len = hpb->entries_per_srgn;
> +	int bit_len;
> +
> +next_srgn:
> +	rgn = hpb->rgn_tbl + rgn_idx;
> +	srgn = rgn->srgn_tbl + srgn_idx;
> +
> +	if (!ufshpb_is_valid_srgn(rgn, srgn))
> +		return true;
> +
> +	/*
> +	 * If the region state is active, mctx must be allocated.
> +	 * In this case, check whether the region is evicted or
> +	 * mctx allocation fail.
> +	 */
> +	WARN_ON(!srgn->mctx);
> +
> +	if ((srgn_offset + cnt) > bitmap_len)
> +		bit_len = bitmap_len - srgn_offset;
> +	else
> +		bit_len = cnt;
> +
> +	if (find_next_bit(srgn->mctx->ppn_dirty,
> +			  bit_len, srgn_offset) >= srgn_offset)
> +		return true;
> +
> +	srgn_offset = 0;
> +	if (++srgn_idx == hpb->srgns_per_rgn) {
> +		srgn_idx = 0;
> +		rgn_idx++;
> +	}
> +
> +	cnt -= bit_len;
> +	if (cnt > 0)
> +		goto next_srgn;
> +
> +	return false;
> +}
> +
> +static u64 ufshpb_get_ppn(struct ufshpb_lu *hpb,
> +			  struct ufshpb_map_ctx *mctx, int pos, int *error)
> +{
> +	u64 *ppn_table;
> +	struct page *page;
> +	int index, offset;
> +
> +	index = pos / (PAGE_SIZE / HPB_ENTRY_SIZE);
> +
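
For readers following the dirty-map logic in the quoted patch, here is a minimal userspace sketch of the same idea: a write marks a run of map entries dirty, the run may spill across sub-region boundaries, and a later read is only eligible for conversion if no entry in its range is dirty. This is not the driver code. All `toy_*` names and the tiny geometry are hypothetical, the region and sub-region indices are flattened into a single sub-region index, and one byte per entry is used instead of a kernel bitmap. The range test here deliberately inspects only entries in `[offset, offset + len)` of each sub-region.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy geometry, far smaller than a real HPB layout. */
#define ENTRIES_PER_SRGN 8	/* map entries per sub-region */
#define NR_SRGNS         4	/* sub-regions in the toy LU (flattened index) */

struct toy_srgn {
	bool valid;				/* stands in for the rgn/srgn state checks */
	unsigned char dirty[ENTRIES_PER_SRGN];	/* one byte per entry, for clarity */
};

static struct toy_srgn srgn_tbl[NR_SRGNS];

/* Mark every sub-region valid and clean. */
static void toy_reset(void)
{
	memset(srgn_tbl, 0, sizeof(srgn_tbl));
	for (int i = 0; i < NR_SRGNS; i++)
		srgn_tbl[i].valid = true;
}

/* Mark cnt entries dirty starting at (srgn, offset), spilling into later sub-regions. */
static void toy_set_dirty(int srgn, int offset, int cnt)
{
	assert(offset >= 0 && offset < ENTRIES_PER_SRGN && cnt > 0);

	while (cnt > 0) {
		int len = ENTRIES_PER_SRGN - offset;

		if (len > cnt)
			len = cnt;
		assert(srgn < NR_SRGNS);
		/* Only valid sub-regions carry a dirty map worth updating. */
		if (srgn_tbl[srgn].valid)
			memset(&srgn_tbl[srgn].dirty[offset], 1, (size_t)len);

		cnt -= len;
		offset = 0;	/* later sub-regions are entered at their start */
		srgn++;
	}
}

/*
 * Return true if the range must NOT be served from the cached map:
 * either it touches an invalid sub-region or any entry in it is dirty.
 */
static bool toy_test_dirty(int srgn, int offset, int cnt)
{
	assert(offset >= 0 && offset < ENTRIES_PER_SRGN && cnt > 0);

	while (cnt > 0) {
		int len = ENTRIES_PER_SRGN - offset;

		if (len > cnt)
			len = cnt;
		assert(srgn < NR_SRGNS);
		if (!srgn_tbl[srgn].valid)
			return true;
		for (int i = offset; i < offset + len; i++)
			if (srgn_tbl[srgn].dirty[i])
				return true;

		cnt -= len;
		offset = 0;
		srgn++;
	}
	return false;
}
```

In this toy model, a write covering four entries from offset 6 of sub-region 0 dirties entries 6 and 7 there and entries 0 and 1 of sub-region 1, so a later read of entries 2 to 7 of sub-region 1 is still clean while a read of entry 7 of sub-region 0 is not.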
Re: [PATCH v3 2/3] scsi: ufs: Fix a race condition btw task management request send and compl
On 2021-01-29 14:06, Can Guo wrote:
> On 2021-01-29 11:20, Bart Van Assche wrote:
>> On 1/27/21 8:16 PM, Can Guo wrote:
>>> ufshcd_compl_tm() looks for all 0 bits in the
>>> REG_UTP_TASK_REQ_DOOR_BELL and calls complete() for each req which
>>> has req->end_io_data set. There can be a race condition between tmc
>>> send/compl, because req->end_io_data is set, in
>>> __ufshcd_issue_tm_cmd(), without host lock protection, so it is
>>> possible that when ufshcd_compl_tm() checks req->end_io_data, it is
>>> set but the corresponding tag has not been set in
>>> REG_UTP_TASK_REQ_DOOR_BELL. Thus, ufshcd_tmc_handler() may wrongly
>>> complete TMRs which have not been sent out. Fix it by protecting
>>> req->end_io_data with host lock, and let ufshcd_compl_tm() only
>>> handle those tm cmds which have been completed instead of looking
>>> for 0 bits in the REG_UTP_TASK_REQ_DOOR_BELL.
>>
>> I don't know any other block driver that needs locking to protect
>> races between submission and completion context. Can the block layer
>> timeout mechanism be used instead of the mechanism introduced by this
>> patch, e.g. by using blk_execute_rq_nowait() to submit requests? That
>> would allow reusing the existing mechanism in the block layer core to
>> handle races between request completion and timeout handling.
>>
>> Thanks,
>>
>> Bart.
>
> This patch is not introducing any new mechanism, it is fixing the
> usage of completion (req->end_io_data = c) introduced by commit
> 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and
> free TMFs"). If you have a better idea to get it fixed once and for
> all, we are glad to take your change to get it fixed asap.
>
> Regards,
> Can Guo.

On second thought, actually the 1st fix alone is enough to eliminate the
race condition. Because blk_mq_tagset_busy_iter() only iterates over
requests which are not in IDLE state, if blk_mq_start_request() is
called within the protection of the host spin lock, ufshcd_compl_tm()
shall not run into the scenario where req->end_io_data is set but
REG_UTP_TASK_REQ_DOOR_BELL has not been set.

What do you think?

Thanks,
Can Guo.
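
The argument above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the ufshcd code: every `toy_*` name is hypothetical, an `atomic_flag` spin lock stands in for the host lock, a `started` flag stands in for "request is not IDLE", and a plain integer stands in for REG_UTP_TASK_REQ_DOOR_BELL. The point it shows is that when the request is marked started and its doorbell bit is set inside one critical section, a completion pass that only looks at started requests cannot observe a request whose completion cookie is set but whose doorbell bit was never rung.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_TM_SLOTS 8

struct toy_host {
	atomic_flag lock;			/* stand-in for the host spin lock */
	uint32_t doorbell;			/* bit set = request outstanding */
	void *end_io_data[NR_TM_SLOTS];		/* completion cookie per tag */
	bool started[NR_TM_SLOTS];		/* stand-in for "not IDLE" state */
};

static void toy_host_init(struct toy_host *h)
{
	atomic_flag_clear(&h->lock);
	h->doorbell = 0;
	for (int tag = 0; tag < NR_TM_SLOTS; tag++) {
		h->end_io_data[tag] = NULL;
		h->started[tag] = false;
	}
}

static void toy_lock(struct toy_host *h)
{
	while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void toy_unlock(struct toy_host *h)
{
	atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

/* Submission: cookie, started state and doorbell bit change under one lock hold. */
static void toy_issue_tm(struct toy_host *h, int tag, void *cookie)
{
	assert(tag >= 0 && tag < NR_TM_SLOTS);

	toy_lock(h);
	h->end_io_data[tag] = cookie;
	h->started[tag] = true;
	h->doorbell |= UINT32_C(1) << tag;
	toy_unlock(h);
}

/* Simulated hardware: clearing the doorbell bit means the request finished. */
static void toy_hw_finish(struct toy_host *h, int tag)
{
	assert(tag >= 0 && tag < NR_TM_SLOTS);

	toy_lock(h);
	h->doorbell &= ~(UINT32_C(1) << tag);
	toy_unlock(h);
}

/*
 * Completion pass: only started requests whose doorbell bit is clear are
 * completed. Returns a bitmask of the tags completed in this pass.
 */
static uint32_t toy_compl_tm(struct toy_host *h)
{
	uint32_t done = 0;

	toy_lock(h);
	for (int tag = 0; tag < NR_TM_SLOTS; tag++) {
		uint32_t bit = UINT32_C(1) << tag;

		if (!h->started[tag] || (h->doorbell & bit))
			continue;
		h->started[tag] = false;
		h->end_io_data[tag] = NULL;
		done |= bit;
	}
	toy_unlock(h);
	return done;
}
```

In this model, writing a cookie into `end_io_data` without going through `toy_issue_tm()` does not cause a completion, because the pass keys off `started` rather than off the cookie.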
Re: [PATCH v3 2/3] scsi: ufs: Fix a race condition btw task management request send and compl
On 2021-01-29 11:20, Bart Van Assche wrote:
> On 1/27/21 8:16 PM, Can Guo wrote:
>> ufshcd_compl_tm() looks for all 0 bits in the
>> REG_UTP_TASK_REQ_DOOR_BELL and calls complete() for each req which
>> has req->end_io_data set. There can be a race condition between tmc
>> send/compl, because req->end_io_data is set, in
>> __ufshcd_issue_tm_cmd(), without host lock protection, so it is
>> possible that when ufshcd_compl_tm() checks req->end_io_data, it is
>> set but the corresponding tag has not been set in
>> REG_UTP_TASK_REQ_DOOR_BELL. Thus, ufshcd_tmc_handler() may wrongly
>> complete TMRs which have not been sent out. Fix it by protecting
>> req->end_io_data with host lock, and let ufshcd_compl_tm() only
>> handle those tm cmds which have been completed instead of looking
>> for 0 bits in the REG_UTP_TASK_REQ_DOOR_BELL.
>
> I don't know any other block driver that needs locking to protect
> races between submission and completion context. Can the block layer
> timeout mechanism be used instead of the mechanism introduced by this
> patch, e.g. by using blk_execute_rq_nowait() to submit requests? That
> would allow reusing the existing mechanism in the block layer core to
> handle races between request completion and timeout handling.
>
> Thanks,
>
> Bart.

This patch is not introducing any new mechanism, it is fixing the usage
of completion (req->end_io_data = c) introduced by commit 69a6c269c097
("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs"). If
you have a better idea to get it fixed once and for all, we are glad to
take your change to get it fixed asap.

Regards,
Can Guo.