On 11/13, Hyunchul Lee wrote:
> On 11/13/2017 10:59 AM, Chao Yu wrote:
> > On 2017/11/13 9:35, Hyunchul Lee wrote:
> >> On 11/13/2017 10:26 AM, Chao Yu wrote:
> >>> On 2017/11/13 8:24, Hyunchul Lee wrote:
> >>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
> >>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
> >>>>>> Hello, Chao
> >>>>>>
> >>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
> >>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
> >>>>>>>> From: Hyunchul Lee <cheol....@lge.com>
> >>>>>>>>
> >>>>>>>> Using write hints[1], applications can inform the life time of
> >>>>>>>> the data written to devices. and this[2] reported that the write
> >>>>>>>> hints patch decreased writes in NAND by 25%.
> >>>>>>>>
> >>>>>>>> This hints help F2FS to determine the followings.
> >>>>>>>>   1) the segment types where the data will be written.
> >>>>>>>>   2) the hints that will be passed down to devices with the data
> >>>>>>>>      of segments.
> >>>>>>>>
> >>>>>>>> This patch set implements the first mapping from write hints to
> >>>>>>>> segment types as shown below.
> >>>>>>>>
> >>>>>>>>   hints               segment type
> >>>>>>>>   -----               ------------
> >>>>>>>>   WRITE_LIFE_SHORT    CURSEG_COLD_DATA
> >>>>>>>>   WRITE_LIFE_EXTREME  CURSEG_HOT_DATA
> >>>>>>>>   others              CURSEG_WARM_DATA
> >>>>>>>>
> >>>>>>>> The F2FS policy for hot/cold separation has precedence over this
> >>>>>>>> hints, And hints are not applied in in-place update.
> >>>>>>>
> >>>>>>> Could we change to disable IPU if file/inode write hint is existing?
> >>>>>>>
> >>>>>>
> >>>>>> I am afraid that this makes side effects. for example, this could
> >>>>>> cause out-of-place updates even when there are not enough free
> >>>>>> segments. I can write the patch that handles these situations. But
> >>>>>> I wonder that this is required, and I am not sure which IPU
> >>>>>> policies can be disabled.
> >>>>>
> >>>>> Oh, As I replied in another thread, I think IPU just affects
> >>>>> filesystem hot/cold separating, rather than this feature. So I think
> >>>>> it will be okay to not consider it.
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>> Before the second mapping is implemented, write hints are not
> >>>>>>>> passed down to devices. Because it is better that the data of a
> >>>>>>>> segment have the same hint.
> >>>>>>>>
> >>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> >>>>>>>> [2]: https://lwn.net/Articles/726477/
> >>>>>>>
> >>>>>>> Could you write a patch to support passing write hint to block
> >>>>>>> layer for buffered writes as below commit:
> >>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for
> >>>>>>> buffered writes")
> >>>>>>>
> >>>>>>
> >>>>>> Sure I will. I wrote it already ;)
> >>>>>
> >>>>> Cool, ;)
> >>>>>
> >>>>>> I think that datas from the same segment should be passed down with
> >>>>>> the same hint, and the following mapping is reasonable. I wonder
> >>>>>> what is your opinion about it.
> >>>>>>
> >>>>>>   segment type             hints
> >>>>>>   ------------             -----
> >>>>>>   CURSEG_COLD_DATA         WRITE_LIFE_EXTREME
> >>>>>>   CURSEG_HOT_DATA          WRITE_LIFE_SHORT
> >>>>>>   CURSEG_COLD_NODE         WRITE_LIFE_NORMAL
> >>>>>
> >>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> >>>>>
> >>>>>>   CURSEG_HOT_NODE          WRITE_LIFE_MEDIUM
> >>>>>
> >>>>> As I know, in scenario of cell phone, data of meta_inode is hottest,
> >>>>> then hot data, warm node, and cold node should be coldest. So I
> >>>>> suggested we can define as below:
> >>>>>
> >>>>>   META_DATA                WRITE_LIFE_SHORT
> >>>>>   HOT_DATA & WARM_NODE     WRITE_LIFE_MEDIUM
> >>>>>   HOT_NODE & WARM_DATA     WRITE_LIFE_LONG
> >>>>>   COLD_NODE & COLD_DATA    WRITE_LIFE_EXTREME
> >>>>>
> >>>>
> >>>> I agree, But I am not sure that assigning the same hint to a node and
> >>>> data segment is good. Because NVMe is likely to write them in the
> >>>> same erase block if they have the same hint.
> >>>
> >>> If we do not give the hint, they can still be written to the same
> >>> erase block,
> >
> > I mean it's possible to write them to the same erase block. :)
> >
> >>> right? it will not be worse?
> >>>
> >>
> >> If the hint is not given, I think that they could be written to
> >> the same erase block, or not. But if we give the same hint, they are
> >> written to the same block.
> >
> > IMO, Only if underlying device can support more hint type or opened
> > channels, and actual temperature of data segment and node segment is
> > quite different, we can separate them.
> >
>
> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that
> implements your proposed mapping.

How about this? We'd better split data and node blocks as much as possible.

  segment type             hints
  ------------             -----
  COLD_NODE & COLD_DATA    WRITE_LIFE_NONE
  WARM_DATA                WRITE_LIFE_EXTREME
  HOT_NODE & WARM_NODE     WRITE_LIFE_LONG
  HOT_DATA                 WRITE_LIFE_MEDIUM
  META_DATA                WRITE_LIFE_SHORT

>
> Thank you for comments ;)
>
> > Thanks,
> >
> >> I am not sure ;)
> >>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks.
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>   others                   WRITE_LIFE_NONE
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Hyunchul Lee (2):
> >>>>>>>>   f2fs: apply write hints to select the type of segments for
> >>>>>>>>         buffered write
> >>>>>>>>   f2fs: apply write hints to select the type of segment for
> >>>>>>>>         direct write
> >>>>>>>>
> >>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++---------------------
> >>>>>>>>  fs/f2fs/f2fs.h    |   1 +
> >>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
> >>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
> >>>>>>>>
> >>>>>>
> >>>>>> Thanks
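
For reference, the segment-type-to-hint mapping proposed in this reply (COLD_NODE & COLD_DATA to WRITE_LIFE_NONE, down to META_DATA to WRITE_LIFE_SHORT) could be sketched in C roughly as below. This is only an illustration, not the patch itself: the enum names `seg_type` and `life_hint` and the function `seg_type_to_hint()` are hypothetical stand-ins and are not the definitions from fs.h or f2fs.h.

```c
/*
 * Hypothetical stand-ins for the F2FS segment types and the write
 * life-time hints discussed in this thread. These are NOT the kernel's
 * definitions; they exist only to make the sketch self-contained.
 */
enum seg_type {
	SEG_HOT_DATA,
	SEG_WARM_DATA,
	SEG_COLD_DATA,
	SEG_HOT_NODE,
	SEG_WARM_NODE,
	SEG_COLD_NODE,
	SEG_META_DATA,
};

enum life_hint {
	LIFE_NOT_SET,
	LIFE_NONE,
	LIFE_SHORT,
	LIFE_MEDIUM,
	LIFE_LONG,
	LIFE_EXTREME,
};

/*
 * Map a segment type to the hint passed down with its data, following
 * the table proposed above. Data and node segments never share a hint,
 * except for the two cold types, which both map to LIFE_NONE.
 */
static enum life_hint seg_type_to_hint(enum seg_type type)
{
	switch (type) {
	case SEG_META_DATA:
		return LIFE_SHORT;
	case SEG_HOT_DATA:
		return LIFE_MEDIUM;
	case SEG_HOT_NODE:
	case SEG_WARM_NODE:
		return LIFE_LONG;
	case SEG_WARM_DATA:
		return LIFE_EXTREME;
	case SEG_COLD_NODE:
	case SEG_COLD_DATA:
		return LIFE_NONE;
	}
	/* Unknown segment type: fall back to "no hint set". */
	return LIFE_NOT_SET;
}
```

With this layout, every hint value other than the one used for the cold segments is used by either data or node segments, but not both, which is the point of the proposal.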