Hi All, 

We have been facing some performance issues with disperse (EC) volumes. 
We know that EC currently performs poorly for random I/O, because a write 
requires a READ-MODIFY-WRITE fop cycle whenever its offset or offset+length 
falls in the middle of a stripe. 

Unfortunately, this can also happen with sequential writes. 
Consider an EC volume with a 4+2 configuration. The stripe size for this would be 
512 * 4 = 2048. That is, 2048 bytes of user data are stored in one stripe. 
Let's say 2048 + 512 = 2560 bytes have already been written to this volume, so 
512 bytes are in the second stripe. 
Now, if a sequential write arrives at offset 2560 with a size of 1 byte, we 
have to read the whole stripe, merge in the 1 byte, encode it, and then 
write it back. 
The next write, at offset 2561 with a size of 1 byte, will again 
READ-MODIFY-WRITE the whole stripe. This causes bad performance. 
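
To make the arithmetic concrete, here is a minimal Python sketch of the 
alignment check. It is only an illustration, not the EC translator's code: 
the function names are hypothetical, the 512-byte fragment size is taken 
from the example above, and special handling at end of file is ignored. 

```python
FRAGMENT_SIZE = 512  # bytes per data brick per stripe, as in the 4+2 example


def stripe_size(data_bricks, fragment_size=FRAGMENT_SIZE):
    """User data held by one stripe: one fragment per data brick."""
    return data_bricks * fragment_size


def needs_rmw(offset, length, stripe):
    """True if the write [offset, offset+length) starts or ends mid-stripe.

    Simplified assumption: any unaligned edge forces a READ-MODIFY-WRITE.
    """
    return offset % stripe != 0 or (offset + length) % stripe != 0


def stripes_touched(offset, length, stripe):
    """Indices (0-based) of the stripes covered by the write."""
    first = offset // stripe
    last = (offset + length - 1) // stripe
    return list(range(first, last + 1))


stripe = stripe_size(4)                    # 2048 for a 4+2 volume
print(needs_rmw(2560, 1, stripe))          # True
print(stripes_touched(2560, 1, stripe))    # [1], the second stripe
print(stripes_touched(2561, 1, stripe))    # [1], the same stripe again
print(needs_rmw(2048, 2048, stripe))       # False, a full aligned stripe
```

Both 1-byte writes land in stripe index 1, so each of them pays for a full 
stripe read and a full stripe write. 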

There are some tools and scenarios that generate this kind of load without 
users being aware of it. 
Examples: fio and zip 

Solution: 
One possible solution to this issue is to keep the last stripe in memory. 
This way, we need not read it again, and we save a READ fop going over the 
network. 
Considering the above example, we would have to keep at most the last 2048 
bytes in memory per file. This should not be a big deal, as we already keep 
some data, such as xattrs and size info, in memory and make decisions based 
on it. 
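
As a rough illustration of the idea, the following toy Python model counts 
how many stripe reads would go over the network when the last written 
stripe is kept in memory. Everything in it is an assumption made for the 
sketch: the class name `LastStripeCache`, the `backend` dict standing in for 
the bricks, and the single-writer setting with no locking or cache 
invalidation. 

```python
class LastStripeCache:
    """Toy per-file model that remembers the most recently written stripe."""

    def __init__(self, stripe=2048):
        self.stripe = stripe
        self.cached_index = None   # index of the stripe held in memory
        self.cached_data = None    # its contents
        self.network_reads = 0     # READ fops that had to reach the bricks
        self.backend = {}          # stripe index -> bytes (stand-in for bricks)

    def _load(self, index):
        # Serve the stripe from memory when it is the cached one.
        if index == self.cached_index:
            return bytearray(self.cached_data)
        self.network_reads += 1
        return bytearray(self.backend.get(index, bytes(self.stripe)))

    def write(self, offset, data):
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, self.stripe)
            chunk = data[pos:pos + self.stripe - start]
            if start == 0 and len(chunk) == self.stripe:
                buf = bytearray(chunk)          # full stripe, no read needed
            else:
                buf = self._load(index)         # partial stripe, modify in place
                buf[start:start + len(chunk)] = chunk
            self.backend[index] = bytes(buf)    # write the stripe back
            self.cached_index, self.cached_data = index, bytes(buf)
            pos += len(chunk)


f = LastStripeCache()
f.write(0, b"a" * 2560)            # 2048 + 512 bytes already on the volume
for i in range(100):               # 100 sequential 1-byte writes
    f.write(2560 + i, b"b")
print(f.network_reads)             # 1
```

In this model only the initial partial write of the second stripe needs a 
read; the 100 following 1-byte writes are served from memory. Without the 
cache, each of them would need its own stripe read. 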

Please share your thoughts on this, and let us know if you have any other solution. 

--- 
Ashish 


_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel
