Re: Another approach of UFSHPB
On Wed, 2020-05-27 at 18:11 +0900, Daejun Park wrote:
> > > > I do not agree that the bus model is the best choice for freeing
> > > > cache memory if it is no longer needed. A shrinker is probably a
> > > > much better choice because the callback functions in a shrinker
> > > > get invoked when a system is under memory pressure. See also
> > > > register_shrinker(), unregister_shrinker() and struct shrinker
> > > > in include/linux/shrinker.h.
> > >
> > > Since this discussion is closely related to cache allocation,
> > > what is your opinion about allocating the pages dynamically as
> > > the regions are being activated/deactivated, as opposed to how it
> > > is done today - statically on init for the entire
> > > max-active-subregions?
> >
> > Memory that is statically allocated cannot be used for any other
> > purpose (e.g. page cache) without triggering the associated
> > shrinker. As far as I know shrinkers are only triggered when
> > (close to) out of memory. So dynamically allocating memory as
> > needed is probably a better strategy than statically allocating
> > the entire region at initialization time.
>
> To improve UFS device performance using the HPB driver, keeping the
> number of active subregions above a certain threshold is essential.
> If the number of active subregions is lower than the threshold, the
> performance improvement from using HPB will be reduced.

Right, but an embedded system differs from the mobile use case: there
should be a maximum limit in the HPB driver, and exactly how many MB of
HPB cache to use should be left for the user to choose.

> Also, due to the protocol overhead of frequent activate/inactivate
> requests, performance may be worse than when the HPB feature is not
> used.
>
> Therefore, it is better to unbind/unload the HPB driver than to
> reduce the number of active subregions below the threshold. We
> designed the HPB driver so that the UFS device keeps working even
> when the module is unloaded.

Actually, while HPB is running, from the point of view of HPB cache
consumption there is no big difference between dynamic and static HPB
page allocation. On the contrary, dynamic HPB page allocation will
increase the latency in the HPB entry loading path.

//Bean
Re: Another approach of UFSHPB
On 2020-05-26 Bart Van Assche wrote:
> On 2020-05-25 23:15, Avri Altman wrote:
> >> On 2020-05-24 22:40, Daejun Park wrote:
> >>> The HPB driver is close to the UFS core function, but it is not
> >>> essential for operating the UFS device. With reference to this
> >>> article (https://lwn.net/Articles/645810/), we implemented the
> >>> extended UFS-feature as a bus model. Because the HPB driver
> >>> consumes the user's main memory, it should support bind/unbind
> >>> functionality as needed. We implemented the HPB driver so that
> >>> it can be unbound/unloaded at runtime.
> >>
> >> I do not agree that the bus model is the best choice for freeing
> >> cache memory if it is no longer needed. A shrinker is probably a
> >> much better choice because the callback functions in a shrinker
> >> get invoked when a system is under memory pressure. See also
> >> register_shrinker(), unregister_shrinker() and struct shrinker in
> >> include/linux/shrinker.h.
> >
> > Since this discussion is closely related to cache allocation,
> > what is your opinion about allocating the pages dynamically as the
> > regions are being activated/deactivated, as opposed to how it is
> > done today - statically on init for the entire
> > max-active-subregions?
>
> Memory that is statically allocated cannot be used for any other
> purpose (e.g. page cache) without triggering the associated shrinker.
> As far as I know shrinkers are only triggered when (close to) out of
> memory. So dynamically allocating memory as needed is probably a
> better strategy than statically allocating the entire region at
> initialization time.

To improve UFS device performance using the HPB driver, keeping the
number of active subregions above a certain threshold is essential. If
the number of active subregions is lower than the threshold, the
performance improvement from using HPB will be reduced. Also, due to
the protocol overhead of frequent activate/inactivate requests,
performance may be worse than when the HPB feature is not used.

Therefore, it is better to unbind/unload the HPB driver than to reduce
the number of active subregions below the threshold. We designed the
HPB driver so that the UFS device keeps working even when the module is
unloaded.

Thanks,
Daejun
Re: Another approach of UFSHPB
On 2020-05-25 23:15, Avri Altman wrote:
>> On 2020-05-24 22:40, Daejun Park wrote:
>>> The HPB driver is close to the UFS core function, but it is not
>>> essential for operating the UFS device. With reference to this
>>> article (https://lwn.net/Articles/645810/), we implemented the
>>> extended UFS-feature as a bus model. Because the HPB driver
>>> consumes the user's main memory, it should support bind/unbind
>>> functionality as needed. We implemented the HPB driver so that it
>>> can be unbound/unloaded at runtime.
>>
>> I do not agree that the bus model is the best choice for freeing
>> cache memory if it is no longer needed. A shrinker is probably a
>> much better choice because the callback functions in a shrinker get
>> invoked when a system is under memory pressure. See also
>> register_shrinker(), unregister_shrinker() and struct shrinker in
>> include/linux/shrinker.h.
>
> Since this discussion is closely related to cache allocation,
> what is your opinion about allocating the pages dynamically as the
> regions are being activated/deactivated, as opposed to how it is
> done today - statically on init for the entire max-active-subregions?

Memory that is statically allocated cannot be used for any other
purpose (e.g. page cache) without triggering the associated shrinker.
As far as I know shrinkers are only triggered when (close to) out of
memory. So dynamically allocating memory as needed is probably a
better strategy than statically allocating the entire region at
initialization time.

Thanks,
Bart.
RE: Another approach of UFSHPB
> On 2020-05-24 22:40, Daejun Park wrote:
> > The HPB driver is close to the UFS core function, but it is not
> > essential for operating the UFS device. With reference to this
> > article (https://lwn.net/Articles/645810/), we implemented the
> > extended UFS-feature as a bus model. Because the HPB driver
> > consumes the user's main memory, it should support bind/unbind
> > functionality as needed. We implemented the HPB driver so that it
> > can be unbound/unloaded at runtime.
>
> I do not agree that the bus model is the best choice for freeing
> cache memory if it is no longer needed. A shrinker is probably a much
> better choice because the callback functions in a shrinker get
> invoked when a system is under memory pressure. See also
> register_shrinker(), unregister_shrinker() and struct shrinker in
> include/linux/shrinker.h.

Since this discussion is closely related to cache allocation, what is
your opinion about allocating the pages dynamically as the regions are
being activated/deactivated, as opposed to how it is done today -
statically on init for the entire max-active-subregions?
Re: Another approach of UFSHPB
On 2020-05-24 22:40, Daejun Park wrote:
> The HPB driver is close to the UFS core function, but it is not
> essential for operating the UFS device. With reference to this
> article (https://lwn.net/Articles/645810/), we implemented the
> extended UFS-feature as a bus model. Because the HPB driver consumes
> the user's main memory, it should support bind/unbind functionality
> as needed. We implemented the HPB driver so that it can be
> unbound/unloaded at runtime.

I do not agree that the bus model is the best choice for freeing cache
memory if it is no longer needed. A shrinker is probably a much better
choice because the callback functions in a shrinker get invoked when a
system is under memory pressure. See also register_shrinker(),
unregister_shrinker() and struct shrinker in include/linux/shrinker.h.

>> Should these parameters be per module or per UFS device?
>
> I think it is necessary to take parameters for each module. This is
> because each extended UFS-feature module has different functions and
> may require different parameters.

My question was a rhetorical question. Please choose per-device
parameters when appropriate instead of module parameters.

Thanks,
Bart.
Re: Another approach of UFSHPB
I am Daejun Park, who is working on the HPB driver patches. Thank you
for your comments; our answers follow.

> Splitting the UFS driver into multiple modules would be great if the
> interface between these modules can be kept small and elegant.
> However, I'm not sure that this approach should be based on the
> Linux device driver bus concept. Devices can be unbound at any time
> from their driver by writing into the "unbind" sysfs attribute. I
> don't think we want the UFS core functionality ever to be unbound
> while any other UFS functionality is still active. Has it been
> considered to implement each feature as a loadable module without
> relying on the bus model? The existing kernel module infrastructure
> already prevents unloading modules (e.g. the UFS core) that are in
> use by a kernel module that depends on them (e.g. UFS HPB).

The HPB driver is close to the UFS core function, but it is not
essential for operating the UFS device. With reference to this article
(https://lwn.net/Articles/645810/), we implemented the extended
UFS-feature as a bus model. Because the HPB driver consumes the user's
main memory, it should support bind/unbind functionality as needed. We
implemented the HPB driver so that it can be unbound/unloaded at
runtime.

> What will happen if a feature module is unloaded (e.g. HPB) while
> I/O is ongoing that relies on HPB?

The HPB driver checks whether it can be unloaded/unbound through a
reference counter. Therefore, there is no problem with the driver
being unloaded/unbound while there is I/O related to HPB.

> Should these parameters be per module or per UFS device?

I think it is necessary to take parameters for each module, because
each extended UFS-feature module has different functions and may
require different parameters.

Thanks,
Daejun Park.
Re: Another approach of UFSHPB
On 2020-05-19 15:31, yongmyung lee wrote:
> Currently, the UFS driver (usually ufshcd.c) has become bulky and
> complex. So, I would like to split this code into layers like the
> works of Bean Huo and Avri Altman. In particular, I suggest a
> UFS-Feature driver model based on the Linux bus-driver model, which
> is suitable for managing all Extended UFS-Feature drivers, as in the
> figure below:
>
>   UFS Driver data structure (struct ufs_hba)
>   |
>   |--| ufs-ext feature layer |-- ufshpb driver  <- attach ufshpb
>   |  |                       |                      device driver
>   |  |                       |                      (loadable)
>   |  |                       |-- ufs-wb driver  <- attach ufswb
>   |  |                       |                      device driver
>   |  |                       |-- ...            <- ...
>   |
>   |----- next ufs feature driver <- attach ufs-next feature driver
>
> * wb : write-booster

Splitting the UFS driver into multiple modules would be great if the
interface between these modules can be kept small and elegant.
However, I'm not sure that this approach should be based on the Linux
device driver bus concept. Devices can be unbound at any time from
their driver by writing into the "unbind" sysfs attribute. I don't
think we want the UFS core functionality ever to be unbound while any
other UFS functionality is still active. Has it been considered to
implement each feature as a loadable module without relying on the bus
model? The existing kernel module infrastructure already prevents
unloading modules (e.g. the UFS core) that are in use by a kernel
module that depends on them (e.g. UFS HPB).

> Furthermore, each ufs-ext feature driver will be written as a
> loadable kernel module. Vendors (e.g., Android phone manufacturers)
> could optionally load and remove each module.

What will happen if a feature module is unloaded (e.g. HPB) while I/O
is ongoing that relies on HPB?

> Also they can customize the parameters of ufs-ext feature drivers
> while each module is being loaded. (For example, a vendor would set
> the maximum memory size that can be reclaimed in the Host Control
> mode in HPB.)

Should these parameters be per module or per UFS device?

> In addition, we plan to provide QEMU with a UFS simulator as a test
> environment for UFS driver development.

A UFS simulator for QEMU support would definitely be welcome.

Thanks,
Bart.
Re: Another approach of UFSHPB
On 2020-05-20 14:19, Bart Van Assche wrote:
> On 2020-05-20 10:55, Christoph Hellwig wrote:
>> HPB is a completely fucked up concept and we should not merge it at
>> all. Especially not with a crazy bullshit vendor extension layer
>> that makes it even easier for vendors to implement even worse
>> things than the already horrible spec says. Just stop this crap and
>> implement sane interfaces for the next generation hardware instead
>> of wasting your time on this idiotic idea.
>
> What exactly is it that you are not happy about? Is it the concept
> of using host memory to store L2P translation information, or how
> that concept has been translated into SCSI commands (HPB READ
> BUFFER, HPB READ and HPB WRITE BUFFER)?
>
> In the former case: aren't Open-Channel SSDs another example of
> storage devices for which the L2P translation tables are maintained
> in host memory? Didn't the driver for Fusion-io SSDs also maintain
> the L2P mapping in host memory?
>
> Do you agree that HPB UFS storage devices are already being used
> widely, and hence that not accepting this functionality in the
> upstream kernel will force users of HPB devices to maintain HPB
> code outside the kernel tree? Isn't one of the goals of the Linux
> kernel project to increase its user base?

The following quote from
https://www.anandtech.com/show/13474/the-google-pixel-3-review/2 is
interesting:

"Another big improvement for file I/O is the implementation of 'Host
Performance Booster' in the kernel and UFS controller firmware stack.
HPB is essentially caching of the NAND chip's FTL (flash translation
layer) L2P (logical to physical) mapping tables into the host's
(SoC's) main memory. This allows the host driver to look up the target
L2P entry directly without betting on UFS's limited SRAM to have a
cache hit, reducing latency and greatly increasing random read
performance. The authors of the feature showcase an improvement of
59-67% in random I/O read performance due to the new feature. It's
worth mentioning that traditional Android I/O benchmarks won't be able
to show this, as those tend to test read speeds with the files they've
just created."

Given the cost of SRAM in embedded controllers, I think there is a
strong incentive for manufacturers of flash storage devices to reduce
the amount of SRAM on the storage controller. I think this means that
proposals to use host memory for caching L2P mappings will keep
popping up, no matter what we tell the storage controller vendors
about what we think of such a design.

Bart.
Re: Another approach of UFSHPB
On 2020-05-20 10:55, Christoph Hellwig wrote:
> HPB is a completely fucked up concept and we should not merge it at
> all. Especially not with a crazy bullshit vendor extension layer
> that makes it even easier for vendors to implement even worse things
> than the already horrible spec says. Just stop this crap and
> implement sane interfaces for the next generation hardware instead
> of wasting your time on this idiotic idea.

Hi Christoph,

What exactly is it that you are not happy about? Is it the concept of
using host memory to store L2P translation information, or how that
concept has been translated into SCSI commands (HPB READ BUFFER, HPB
READ and HPB WRITE BUFFER)?

In the former case: aren't Open-Channel SSDs another example of
storage devices for which the L2P translation tables are maintained in
host memory? Didn't the driver for Fusion-io SSDs also maintain the
L2P mapping in host memory?

Do you agree that HPB UFS storage devices are already being used
widely, and hence that not accepting this functionality in the
upstream kernel will force users of HPB devices to maintain HPB code
outside the kernel tree? Isn't one of the goals of the Linux kernel
project to increase its user base?

Bart.
Re: Another approach of UFSHPB
Seriously, HPB is a completely fucked up concept and we should not
merge it at all. Especially not with a crazy bullshit vendor extension
layer that makes it even easier for vendors to implement even worse
things than the already horrible spec says. Just stop this crap and
implement sane interfaces for the next generation hardware instead of
wasting your time on this idiotic idea.
Another approach of UFSHPB
Hello, all!

In this email I give my opinion regarding HPB. The mail is long, so
here is a summary:

[Summary]
1. HPB, WriteBooster and other new UFS features would be managed by a
   ufs-extended feature layer (an additional LLD layer).
2. The map table of HPB will be managed in the LLD (UFS driver).
3. Each UFS feature driver can be implemented as a loadable kernel
   module.
4. Samsung is also preparing a test environment based on QEMU.
5. We will push our patches soon! Please review them.

This is the body:

I am Yongmyung Lee, one of the original authors of the HPB driver,
which was an out-of-tree implementation outside the Linux kernel
mainline. I have seen the works done by Bean Huo and Avri Altman,
which are very impressive. I also have some opinions on the driver,
which I would like to discuss with you while preparing a set of
patches.

The basic idea of my approach is similar to Bean's work of managing
HPB map data in the LLD (UFS driver). As the HPB driver is closely
related to the behavior of the UFS device, I think it would be good
for the HPB driver to work in an additional feature layer on top of
the LLD.

Of course, Avri's work makes sense too. I believe HPB's scheme can be
a good alternative to NVMe's HMB (Host Memory Buffer) for storage
devices like eMMC or SATA SSDs (possibly DRAM-less). Once an HPB-like
scheme is applied, these storage devices could benefit from
improvements in read latency as well as throughput. However, it would
not be desirable to relocate the map table management to an upper
layer such as scsi_dh, as there is no such specification supported at
the moment. When specifications similar to HPB are proposed in JEDEC
or T13, that would be a great chance to have a deep discussion on
which layer is the best fit for the map table management.

Before disclosing my opinion on the HPB driver development approach, I
would like to introduce the main characteristics of the Extended
UFS-Features in progress:

* Providing extended functionality. That is, a UFS device shall work
  well without the extended UFS-Features; by using them, however, it
  will be more powerful in certain application scenarios.
* Designed to suit embedded systems (such as Android). Due to the
  characteristics of the UFS device, it is utilized mainly as mobile
  storage. We are mainly trying to improve performance on Android
  systems.
* Closely connected to the UFS device. The Extended UFS-Features use
  UFS query requests, which are UFS private commands. In addition,
  other features are closely related to the electrical and physical
  characteristics of UFS (such as voltage, gear scaling and runtime
  suspend/resume) and should be taken into consideration.

Therefore, it is desirable to have those features managed separately
from the original UFS driver. The current extended UFS-Features
approved by JEDEC are:

* UFS HPB
* UFS WriteBooster

In addition, we are developing and testing various UFS features that
would be beneficial in embedded systems (at the moment, mostly Android
systems). As we have already successfully provided enhanced features
in cooperation with major Android phone vendors, we now want to
contribute our work to the Linux kernel mainline.

Currently, the UFS driver (usually ufshcd.c) has become bulky and
complex. So, I would like to split this code into layers like the
works of Bean Huo and Avri Altman. In particular, I suggest a
UFS-Feature driver model based on the Linux bus-driver model, which is
suitable for managing all Extended UFS-Feature drivers, as in the
figure below:

  UFS Driver data structure (struct ufs_hba)
  |
  |--| ufs-ext feature layer |-- ufshpb driver  <- attach ufshpb device
  |  |                       |                      driver (loadable)
  |  |                       |-- ufs-wb driver  <- attach ufswb device
  |  |                       |                      driver
  |  |                       |-- ...            <- ...
  |
  |----- next ufs feature driver <- attach ufs-next feature driver

  * wb : write-booster

Each Extended UFS-Feature driver has a bus of the ufs-ext feature type
and is bound to the ufs-ext feature layer. The ufs-ext feature layer
manages common APIs used by each Extended UFS-Feature driver. The APIs
consist of UFS query requests and UFS vendor commands related to each
ufs-ext feature driver.

Furthermore, each ufs-ext feature driver will be written as a loadable
kernel module. Vendors (e.g., Android phone manufacturers) could
optionally load and remove each module. They can also customize the
parameters of the ufs-ext feature drivers while each module is being
loaded. (For example, a vendor could set the maximum memory size that
can be reclaimed in the Host Control mode of HPB.)

We expect that this kind of standardized and abstracted model will
surely make it easier to manage the Extended UFS-Feature drivers.