On Fri, Jan 13, 2017 at 8:03 PM, Xavier Hernandez <xhernan...@datalab.es> wrote:
> Hi,
>
> On 13/01/17 10:58, jayakrishnan mm wrote:
>
>> Hi Xavier,
>> I went through the source code. Some questions remain.
>>
>> 1. If two clients try to write to the same file, it should succeed, even if
>> they overlap. (Locks should ensure it happens in sequence, in the bricks).
>> From the source code:
>>     lock->flock.l_type = F_WRLCK;
>>     lock->flock.l_whence = SEEK_SET;
>>
>>     fop->flock.l_len += ec_adjust_offset(fop->xl->private,
>>                                          &fop->flock.l_start, 1);
>>     fop->flock.l_len = ec_adjust_size(fop->xl->private,
>>                                       fop->flock.l_len, 1);
>> If flock.l_len is 0, the entire file is locked for writing.
>>
>> In my test case with 2 clients, I always get flock.l_len as 0. But
>> still I am able to write to the same file from both clients at the
>> same time.
>>
>
> How are you sure you are really writing at the same time? Do you get
> partial writes from some of the clients?

I am not sure whether they are happening simultaneously. I am using fio to do
that.

>
>> If it is acquiring the lock chunk by chunk, why am I getting l_len = 0
>> always?
>>
>
> EC doesn't acquire partial locks. The entire file is locked when a
> modification is needed. This makes it possible to reuse locks for future
> operations (eager locking).
>
>> Why am I not getting the actual write size and offset (for
>> flock.l_len & flock.l_start respectively) for each write FOP?
>> (In AFR, they are set to transaction.len and transaction.start respectively,
>> which in turn are the write length & offset for the normal write case.)
>>
>
> Because an erasure code splits the data into smaller fragments for each
> brick, so offsets and lengths need to be adjusted.
>
>
>> 2. As per the source code, a full file lock is taken by the shd also.
>>
>>     ec_heal_inodelk(heal, F_WRLCK, 1, 0, 0);
>> which means offset=0 & size=0 in the ec_heal_lock() function in ec-heal.c:
>>     flock.l_start = offset;
>>     flock.l_len = size;
>> Does it mean that, for a single file, a write cannot happen simultaneously
>> with healing?
>>
>
> Correct.
> Heal procedure is like an additional client. If a client and the
> heal process try to write at the same time, they must be serialized, like
> any other regular write. However, heal only takes the full lock for some
> critical operations. Regular self-heal of file contents is done by locking
> chunk by chunk.
>

I have a question about index heal/full heal. As per the code, the index
healer thread (ec_shd_index_healer) is created when there is a child_up event
OR when there is a TRANSLATOR_OP/GF_SHD_OP_HEAL_INDEX. When does the second
case arise?

The full heal thread (ec_shd_full_healer) is created only when a
TRANSLATOR_OP/GF_SHD_OP_HEAL_FULL arises. Does this happen only in the
replace-brick case?

Thanks & regards
JK

>
> Xavi
>
>
>> Correct me if I am wrong.
>>
>> Best Regards
>> JK
>>
>>
>> On Wed, Dec 14, 2016 at 12:07 PM, jayakrishnan mm
>> <jayakrishnan...@gmail.com> wrote:
>>
>>> Thanks Xavier, for making it clear.
>>> Regards
>>> JK
>>>
>>>
>>> On Dec 13, 2016 3:52 PM, "Xavier Hernandez" <xhernan...@datalab.es> wrote:
>>>
>>>> Hi JK,
>>>>
>>>> On 12/13/2016 08:34 AM, jayakrishnan mm wrote:
>>>>
>>>>> Dear Xavi,
>>>>>
>>>>> How do I test the locks, for example the locks for the write fop?
>>>>> I have two independent clients, both trying to write to the same file.
>>>>>
>>>>> 1. According to my understanding, both can successfully write if the
>>>>> offsets don't overlap. I mean, the WRITE FOP takes a chunk lock on the
>>>>> file. As long as the clients don't try to write to the same chunk,
>>>>> it should be OK. If no locks are present, it can lead to inconsistency.
>>>>
>>>> With locks all writes will be fine as defined by POSIX (i.e. the
>>>> final result will be equivalent to the sequential execution of
>>>> both operations, though in an undefined order), even if they overlap.
>>>> Without locks, there are chances that some bricks
>>>> execute the operations in one order and the remaining bricks
>>>> execute the same operations in the reverse order, causing data
>>>> corruption.
>>>>
>>>>> 2. Different FOPs can always run simultaneously. (Example:
>>>>> WRITE and READ FOPs, or two READ FOPs).
>>>>
>>>> All fops can be executed concurrently. If there's any chance
>>>> that two operations could interfere, locks are taken in the
>>>> appropriate places. For example, reads cannot be merged with
>>>> overlapping writes. Otherwise they could return inconsistent data.
>>>>
>>>>> 3. WRITE & some metadata FOP (like setattr) together.
>>>>> These cannot happen together with locks, even though chances are very low.
>>>>
>>>> As in 2, if there's any possible interference, the appropriate
>>>> locks will be taken.
>>>>
>>>> You can look at the code to see which locks are taken for each
>>>> fop. See the corresponding ec_manager_<fop>() function, in the
>>>> EC_STATE_LOCK switch case. There you will see calls to
>>>> ec_lock_prepare_xxx() for each taken lock.
>>>>
>>>> Xavi
>>>>
>>>>> Pls. clarify.
>>>>>
>>>>> Best regards
>>>>> JK
>>>>>
>>>>>
>>>>> On Wed, Nov 30, 2016 at 5:49 PM, jayakrishnan mm
>>>>> <jayakrishnan...@gmail.com> wrote:
>>>>>
>>>>>> Hi Xavier,
>>>>>>
>>>>>> Thank you very much for your explanation. This helped me to
>>>>>> understand more about locking in EC.
>>>>>>
>>>>>> Best Regards
>>>>>> JK
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 28, 2016 at 4:17 PM, Xavier Hernandez
>>>>>> <xhernan...@datalab.es> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 11/28/2016 02:59 AM, jayakrishnan mm wrote:
>>>>>>>
>>>>>>>> Hi Xavier,
>>>>>>>>
>>>>>>>> Notice that EC xlator uses blocking locks. Any specific
>>>>>>>> reason for this?
>>>>>>>
>>>>>>> In a distributed filesystem like gluster a synchronization
>>>>>>> mechanism is a must to avoid data corruption.
>>>>>>>>
>>>>>>>> Do you think this will affect the performance?
>>>>>>>
>>>>>>> Of course the need for locks has a performance impact, and we
>>>>>>> cannot avoid them if we want to guarantee data integrity. However, some
>>>>>>> optimizations have been applied, especially eager locking,
>>>>>>> which allows a lock to be reused without unlocking/locking again.
>>>>>>>
>>>>>>>> (In comparison, AFR first tries non-blocking locks and, if not
>>>>>>>> successful, then tries blocking locks.)
>>>>>>>
>>>>>>> EC also tries a non-blocking lock first.
>>>>>>>
>>>>>>>> Also, why are two locks needed per FOP? One for normal I/O and
>>>>>>>> another for self-healing?
>>>>>>>
>>>>>>> The only fop that currently needs two locks is 'rename', and
>>>>>>> only when source and destination directories are different. All
>>>>>>> other fops only take one lock at most.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Xavi
>>>>>>>
>>>>>>>> Best regards
>>>>>>>> JK
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-devel mailing list
>>>>>>>> Gluster-devel@gluster.org
>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
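
The range adjustment quoted earlier in this thread (l_len += ec_adjust_offset(...); l_len = ec_adjust_size(...)) can be looked at in isolation. What follows is only a minimal sketch, not the GlusterFS code: struct ec_layout, the sketch_* function names and the example geometry (a 2048-byte stripe over 4 data fragments) are assumptions made for illustration. It assumes the helpers round the start down and the length up to stripe boundaries and then scale both to per-brick fragment units, which matches Xavi's remark that offsets and lengths must be adjusted because the data is split into fragments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the EC translator's private configuration. */
struct ec_layout {
    uint64_t stripe_size; /* bytes of user data in one full stripe */
    uint64_t fragments;   /* number of data fragments per stripe */
};

/* Round *offset down to a stripe boundary and, if scale is set, convert it
 * to per-brick units. Returns the number of bytes the offset moved back. */
uint64_t sketch_adjust_offset(const struct ec_layout *ec, uint64_t *offset,
                              int scale)
{
    assert(ec->stripe_size > 0 && ec->fragments > 0);

    uint64_t head = *offset % ec->stripe_size;
    uint64_t aligned = *offset - head;

    if (scale)
        aligned /= ec->fragments;

    *offset = aligned;
    return head;
}

/* Round size up to a whole number of stripes and, if scale is set, convert
 * it to per-brick units. A size of 0 stays 0. */
uint64_t sketch_adjust_size(const struct ec_layout *ec, uint64_t size,
                            int scale)
{
    assert(ec->stripe_size > 0 && ec->fragments > 0);

    uint64_t rounded = size + ec->stripe_size - 1;
    rounded -= rounded % ec->stripe_size;

    if (scale)
        rounded /= ec->fragments;

    return rounded;
}

/* Same two-step sequence as the snippet quoted in the thread: grow the
 * length by the bytes the start moved back, then round and scale it. */
void sketch_lock_range(const struct ec_layout *ec, uint64_t *start,
                       uint64_t *len)
{
    *len += sketch_adjust_offset(ec, start, 1);
    *len = sketch_adjust_size(ec, *len, 1);
}
```

Under these assumptions, a request with start = 0 and len = 0 comes out unchanged as 0/0. For POSIX record locks an l_len of 0 means "up to the end of the file", so a whole-file request is still a whole-file request after adjustment. A 100-byte range at offset 3000 comes out as start = 512, len = 512 in brick units.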
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel