On 2015/1/27 15:08, Srinivas Eeda wrote:
> Hi Yangwenfang,
>
> thank you very much for initiating this RFC :). This feature is long overdue
> for OCFS2 and we are also interested in implementing it. Wengang (cc'ed)
> has been analysing it and attempting an implementation. We haven't looked
> at splitting and merging the range locks yet, but we have looked at lock
> fairness and range locking. Wengang has done some of the dlm changes to see
> how it can be done, but other changes are still work in progress. We will
> email more details in the coming few days.
>
> Since you are also looking into it, it would be great if we could
> collaborate on this feature. Can you please share more info on the demo
> code you mentioned? Like what it does and how much work has been done on it?
>
Hi,
About 6k lines of code were modified in our demo, including dlmglue and dlm.
Code modifications:
1. read/write IO: get the range (start, end) and call ocfs2_range_lock.
2. dlmglue: modify the key data structure: each inode has one ocfs2_lock_res,
   which holds many range locks covering different ranges.
   Detect conflicts between multiple threads within a node.
   Manage the cache of range locks to support delayed unlock.
3. dlm: detect conflicts between multiple nodes.
   Add splitting and merging of range locks.
4. lib: interval tree.

> One of the things we considered was making the rw lock itself support range
> locking, which is a different approach from what you mentioned. Is there any
> reason why the rw lock cannot be used and we need a new ip_range_lock_lockres?
>
The RW lock can be used, but adding the feature to it is complicated because
the RW lock is also used in read/write/truncate. Byte range lock is only
beneficial for update writes, so I modified only the write IO path to finish
the demo and get performance results as soon as possible.
I think ocfs2_rw_lock(pr) + ocfs2_range_lock(start, end, ex) is equivalent to
ocfs2_rw_lock(ex); am I right?

> Thanks,
> --Srini
>
>
> Hi
> On 01/26/2015 04:28 AM, yangwenfang wrote:
>> What:
>> Byte range lock is applied to lock a region of a file to accelerate
>> concurrent reading/writing.
>>
>> Why:
>> Currently ocfs2 does not support byte range lock. Since multiple nodes
>> may concurrently update/write at different positions of the same file
>> in database workloads, the performance (tpmC) of DB+ocfs2 is much poorer
>> than that of DB+GPFS when running TPC-C.
>> Aiming at improving the efficiency of parallel accesses to the same file,
>> we have implemented a demo of the range lock feature, which is already
>> supported by Lustre and GPFS, so that a file can be updated by different
>> nodes in the cluster when they are accessing different blocks.
>>
>> How:
>> Key issues in design and implementation:
>> 1. In ocfs2, each file has only one lock, which cannot distinguish between
>> different positions.
>> One solution is to add a range field (start,end) to a lock. For example:
>> -ocfs2_lock_res(N1)            dlm_lock_resource(Master)   ocfs2_lock_res(N2)
>> -ocfs2_res_range_lock (0,9)----dlm_lock(0,9) N1
>> -                              dlm_lock(10,19) N2<--ocfs2_res_range_lock(10,19)
>> -ocfs2_res_range_lock (20,29)--dlm_lock(20,29) N1
>> -                              dlm_lock(30,49) N2<--ocfs2_res_range_lock(30,49)
>> -ocfs2_res_range_lock (50,59)--dlm_lock(50,59) N1
>> -                              dlm_lock(60,69) N2<--ocfs2_res_range_lock(60,69)
>>
>> Each lock resource uses an interval tree to manage its ranges, which
>> supports basic operations like add, delete, insert, find, split and merge.
>> The most important issue is to detect conflicts among the ranges.
>> Conflict-free ranges of the same file can be accessed concurrently.
>> Conversely, a node must wait for a conflicting lock to be released before
>> accessing that range of the file.
>>
>> Byte range lock supports split and merge rules: for the same level, the
>> larger scope wins; for different levels, write > read (if a node holds an
>> EX lock on range (start,end), then it also holds the PR range lock
>> (start,end)).
>> For example:
>> (1) merge: N1 holds range locks (0,9)PR and (5,19)PR; they are merged into
>> (0,19)PR;
>> (2) merge: N1 holds range locks (0,9)PR and (5,19)EX; the merged locks
>> should become (0,19)PR, (5,19)EX;
>> (3) split: N1 holds range lock (0,9)PR and N2 tries to lock (0,5)PR; N1
>> should split the lock and keep (6,9)PR.
>>
>> 2. In ocfs2, there are only three types of lock resources: rw, inode and
>> open, which protect different contents.
>> We need to add another lock resource (ip_range_lock_lockres) to protect
>> different ranges in the IO read/write path.
>> For example: buffered read/write.
>> (1) ocfs2_file_aio_write ------------> ocfs2_file_aio_write
>>       ocfs2_rw_lock(ex)                  ocfs2_rw_lock(pr)
>>                                          ocfs2_range_lock(start, end, ex)
>>     ocfs2_write_begin
>>       ocfs2_inode_lock(ex)               ocfs2_inode_lock(pr)
>>                                          if append, update to ex;
>> (2) ocfs2_file_aio_read -------------> no need to change.
>>     ocfs2_readpage
>>       ocfs2_inode_lock(pr)
>> (3) but read-ahead is a problem.
>>     ocfs2_readpages -----------------> ocfs2_readpages
>>       ocfs2_inode_lock(pr)               ocfs2_inode_lock(pr)
>>                                          ocfs2_range_lock(start, end, pr)
>>
>> Limitations based on our assumptions:
>> 1. Byte range lock is only beneficial for update writes.
>> 2. Delayed unlock leads to too many locks.
>> 3. Significant source code modification is needed, involving almost the
>> whole of the dlmglue and dlm modules.
>>
>> As described above, there are also many limitations based on our
>> assumptions.
>> Many thanks for any advice.
>>
>> thanks.
>>
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
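
For reference, the conflict, merge and split rules quoted above could be
sketched in C roughly as follows. This is only an illustration and not the demo
code: struct byte_range, enum range_level and the range_* helpers are
hypothetical names, ranges are assumed to be inclusive (as in the (0,5)/(6,9)
split example), and the conflict rule assumes the usual PR/EX compatibility
(two overlapping ranges conflict only if at least one is EX). A real
implementation would keep the ranges in an interval tree instead of comparing
pairs directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock levels, named after the PR (shared) and EX (exclusive) modes. */
enum range_level { RANGE_PR, RANGE_EX };

/* Hypothetical inclusive byte range [start, end] held at one level. */
struct byte_range {
    uint64_t start;
    uint64_t end;
    enum range_level level;
};

/* True if the two inclusive ranges share at least one byte. */
static bool range_overlaps(const struct byte_range *a, const struct byte_range *b)
{
    assert(a->start <= a->end && b->start <= b->end);
    return a->start <= b->end && b->start <= a->end;
}

/* Assumed rule: overlapping ranges conflict only if at least one is EX. */
static bool range_conflicts(const struct byte_range *a, const struct byte_range *b)
{
    return range_overlaps(a, b) &&
           (a->level == RANGE_EX || b->level == RANGE_EX);
}

/*
 * Merge two overlapping ranges of the same level into their union,
 * e.g. (0,9)PR + (5,19)PR -> (0,19)PR. Returns false if they cannot be merged.
 */
static bool range_merge(const struct byte_range *a, const struct byte_range *b,
                        struct byte_range *out)
{
    if (a->level != b->level || !range_overlaps(a, b))
        return false;
    out->start = a->start < b->start ? a->start : b->start;
    out->end = a->end > b->end ? a->end : b->end;
    out->level = a->level;
    return true;
}

/*
 * Split a held range around a requested one and store what the holder keeps
 * in out[]; returns the number of pieces kept (0, 1 or 2).
 * e.g. held (0,9), requested (0,5) -> holder keeps (6,9).
 */
static int range_split(const struct byte_range *held, const struct byte_range *req,
                       struct byte_range out[2])
{
    int n = 0;

    if (!range_overlaps(held, req)) {
        out[0] = *held;            /* nothing to give up */
        return 1;
    }
    if (held->start < req->start) {    /* piece left of the request */
        out[n] = *held;
        out[n].end = req->start - 1;
        n++;
    }
    if (held->end > req->end) {        /* piece right of the request */
        out[n] = *held;
        out[n].start = req->end + 1;
        n++;
    }
    return n;
}
```

With these helpers, merge example (1) gives (0,19)PR and split example (3)
leaves N1 with (6,9)PR. Example (2) is not handled by range_merge, since mixing
levels needs the extra "EX implies PR" rule from the proposal.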