[Gluster-devel] RFC: d_off encoding at client/protocol layer

Shyam Mon, 26 Jan 2015 11:59:31 -0800

Hi,

Some parts of this topic have been discussed in the recent past here [1].

The current mechanism, in which each xlator encodes the subvol in the lower or higher bits, has its pitfalls, as discussed in those threads and in this review [2].

Here is a solution design from one of the comments Avati posted on this [3]:

"One example approach (not necessarily the best): Make every xlator knows the total number of leaf xlators (protocol/clients), and also the number of all leaf xlators from each of its subvolumes. This way, the protocol/client xlators (alone) do the encoding, by knowing its global brick# and total #of bricks. The cluster xlators blindly forward the readdir_cbk without any further transformations of the d_offs, and also route the next readdir(old_doff) request to the appropriate subvolume based on the weighted graph (of counts of protocol/clients in the subtrees) till it reaches the right protocol/client to resume the enumeration."


So the current proposed scheme that is being worked on is as follows,

- encode the d_off with the client/protocol ID, which is generated as its leaf position/number

- no further encoding in any other xlator

- on receiving further readdir requests with the d_off, consult the graph (or the immediate children) for the ID encoded in the d_off, and send the request down that subvol path
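
As a rough illustration of the first step in the list above, here is a minimal sketch of how a client/protocol leaf might pack its ID into a d_off. Everything in it is an assumption made for illustration: the function names (leaf_id_bits, d_off_encode, d_off_decode) are hypothetical, and so are the choices to place the leaf ID in the low bits and to reject brick offsets that do not fit. The proposal does not say which bits to use, nor how to handle bricks such as ext4 that already return offsets using nearly all 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low bits needed to hold any leaf ID in [0, total_leaves). */
unsigned leaf_id_bits(uint32_t total_leaves)
{
    unsigned bits = 0;
    while (bits < 32 && ((uint64_t)1 << bits) < total_leaves)
        bits++;
    return bits;
}

/*
 * Hypothetical encoding: shift the brick's offset up and store the leaf ID
 * in the freed low bits. Returns 0 on success, or -1 if the brick offset
 * is too large to be shifted without losing its high bits.
 */
int d_off_encode(uint64_t brick_off, uint32_t leaf_id,
                 uint32_t total_leaves, uint64_t *out)
{
    assert(total_leaves > 0 && leaf_id < total_leaves);

    unsigned bits = leaf_id_bits(total_leaves);
    if (bits > 0 && (brick_off >> (64 - bits)) != 0)
        return -1;

    *out = (brick_off << bits) | leaf_id;
    return 0;
}

/* Inverse of d_off_encode: recover the brick offset and the leaf ID. */
void d_off_decode(uint64_t d_off, uint32_t total_leaves,
                  uint64_t *brick_off, uint32_t *leaf_id)
{
    unsigned bits = leaf_id_bits(total_leaves);
    uint64_t mask = ((uint64_t)1 << bits) - 1;

    *leaf_id = (uint32_t)(d_off & mask);
    *brick_off = d_off >> bits;
}
```

With 6 leaves this sketch reserves 3 bits, so the cost in bits grows with the logarithm of the total brick count and is paid only once, at the leaf, rather than once per cluster xlator.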

IOW, given a d_off and a common routine, pass the d_off with this (i.e. the current xlator) to get the subvol that the d_off belongs to. This routine would decode the d_off for the leaf ID as encoded in the client/protocol layer, match it to a subvol relative to this xlator, and return that subvol for further processing. (It may consult the graph, or store the range of IDs that each subvol has w.r.t. client/protocol, and deliver the result appropriately.)
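
The "store the range of IDs" variant of that routine could look roughly like the sketch below. It assumes the leaf ID has already been extracted from the d_off, that leaves are numbered contiguously in graph order, and that each xlator keeps a small per-child table. The struct and function names (subvol_range, assign_ranges, subvol_for_leaf) are hypothetical and are not existing GlusterFS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Range of global leaf IDs reachable through one child subvol. */
struct subvol_range {
    uint32_t first_leaf;  /* lowest global leaf ID under this child */
    uint32_t leaf_count;  /* number of protocol/client leaves under it */
};

/*
 * Given each child's leaf_count, assign contiguous ID ranges starting at
 * 'base', the first leaf ID owned by the current xlator.
 */
void assign_ranges(struct subvol_range *children, size_t n, uint32_t base)
{
    assert(children != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        children[i].first_leaf = base;
        base += children[i].leaf_count;
    }
}

/*
 * Return the index of the child whose range contains leaf_id, or -1 if the
 * ID does not belong to any child of this xlator.
 */
int subvol_for_leaf(const struct subvol_range *children, size_t n,
                    uint32_t leaf_id)
{
    for (size_t i = 0; i < n; i++) {
        if (leaf_id >= children[i].first_leaf &&
            leaf_id - children[i].first_leaf < children[i].leaf_count)
            return (int)i;
    }
    return -1;
}
```

For example, a cluster xlator whose three children hold 2, 3 and 1 leaves would own IDs 0-1, 2-4 and 5, so a d_off carrying leaf ID 4 would be sent down the second child. Each level repeats the same lookup with its own table until the request reaches the client/protocol leaf.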

Given the current situation of ext4 and xfs, and continuing with the ID encoding scheme, this seems to be the best way to prevent multiple subvol encodings from stomping on each other, and also to avoid (in a sense) further loss of bits. This scheme would also give AFR/EC the ability to load balance readdir requests across their subvols better, rather than having a static subvol to send to for a longer duration.


Thoughts/comments?

Shyam

[1] https://www.mail-archive.com/[email protected]/msg02834.html
[2] review.gluster.org/#/c/8201/4/xlators/cluster/afr/src/afr-dir-read.c
[3] https://www.mail-archive.com/[email protected]/msg02847.html
