----- Original Message -----
> From: "Venky Shankar" <vshan...@redhat.com>
> To: "Aravinda" <avish...@redhat.com>
> Cc: "Shyam" <srang...@redhat.com>, "Krutika Dhananjay" <kdhan...@redhat.com>,
>     "Gluster Devel" <gluster-devel@gluster.org>
> Sent: Thursday, September 3, 2015 8:29:37 AM
> Subject: Re: [Gluster-devel] Gluster Sharding and Geo-replication
> On Wed, Sep 2, 2015 at 11:39 PM, Aravinda <avish...@redhat.com> wrote:
> >
> > On 09/02/2015 11:13 PM, Shyam wrote:
> >>
> >> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
> >>>
> >>> ------------------------------------------------------------------------
> >>> *From: *"Shyam" <srang...@redhat.com>
> >>> *To: *"Aravinda" <avish...@redhat.com>, "Gluster Devel"
> >>> <gluster-devel@gluster.org>
> >>> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> >>> *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
> >>>
> >>> On 09/02/2015 03:12 AM, Aravinda wrote:
> >>> > The Geo-replication and Sharding teams today discussed the approach
> >>> > for making Geo-replication Sharding-aware. Details are below.
> >>> >
> >>> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
> >>> >
> >>> > - Both Master and Slave Volumes should be Sharded Volumes with the
> >>> >   same configuration.
> >>>
> >>> If I am not mistaken, geo-rep supports replicating to a non-gluster
> >>> local FS at the slave end. Is this correct? If so, would this
> >>> limitation not make that problematic?
> >>>
> >>> When you state *same configuration*, I assume you mean the sharding
> >>> configuration, not the volume graph, right?
> >>>
> >>> That is correct. The only requirement is for the slave to have the
> >>> shard translator (for, someone needs to present an aggregated view of
> >>> the file to the READers on the slave).
> >>> Also the shard-block-size needs to be kept the same between master and
> >>> slave. The rest of the configuration (like the number of subvols of
> >>> DHT/AFR) can vary across master and slave.
> >>
> >> Do we need to have the shard block size the same? I assume the file
> >> carries an xattr that contains the size it is sharded with
> >> (trusted.glusterfs.shard.block-size), so if this is synced across, it
> >> would do. If this is true, what it would mean is that "a sharded volume
> >> needs a shard-supporting slave to geo-rep to".
> >
> > Yes. The number of bricks and the replica count can be different, but
> > the shard block size should be the same. Only the first file will have
> > the xattr (trusted.glusterfs.shard.block-size); Geo-rep should sync this
> > xattr to the Slave as well. Only gsyncd can read/write the sharded
> > chunks. A sharded Slave Volume is required to understand these chunks
> > when they are read by non-gsyncd clients.
>
> Even if this works I am very much in disagreement with this mechanism
> of synchronization (not that I have a working solution in my head as
> of now).

Hi Venky,

It is not apparent to me what issues you see with approach 2. If you could
lay them out here, it would be helpful in taking the discussions further.

-Krutika

> >
> >>
> >>> -Krutika
> >>>
> >>> > - In the Changelog, record changes related to Sharded files also,
> >>> >   just like any regular files.
> >>> > - Sharding should allow Geo-rep to list/read/write Sharding internal
> >>> >   Xattrs if the Client PID is gsyncd's (-1).
> >>> > - Sharding should allow read/write of Sharded files (that is, files
> >>> >   in the .shards directory) if the Client PID is GSYNCD.
> >>> > - Sharding should return the actual file instead of the aggregated
> >>> >   content when the Main file is requested (Client PID GSYNCD).
> >>> >
> >>> > For example, a file f1 is created with GFID G1.
> >>> >
> >>> > When the file grows it gets sharded into chunks (say 5 chunks).
> >>> >
> >>> > f1           G1
> >>> > .shards/G1.1 G2
> >>> > .shards/G1.2 G3
> >>> > .shards/G1.3 G4
> >>> > .shards/G1.4 G5
> >>> >
> >>> > In the Changelog, this is recorded as 5 different files, as below:
> >>> >
> >>> > CREATE G1 f1
> >>> > DATA G1
> >>> > META G1
> >>> > CREATE G2 PGS/G1.1
> >>> > DATA G2
> >>> > META G1
> >>> > CREATE G3 PGS/G1.2
> >>> > DATA G3
> >>> > META G1
> >>> > CREATE G4 PGS/G1.3
> >>> > DATA G4
> >>> > META G1
> >>> > CREATE G5 PGS/G1.4
> >>> > DATA G5
> >>> > META G1
> >>> >
> >>> > where PGS is the GFID of the .shards directory.
> >>> >
> >>> > Geo-rep will create these files independently in the Slave Volume and
> >>> > sync the Xattrs of G1. Data can be read only when all the chunks are
> >>> > synced to the Slave Volume. Data can be read partially if the
> >>> > main/first file and some of the chunks are synced to the Slave.
> >>> >
> >>> > Please add if I missed anything. C & S Welcome.
> >>> >
> >>> > regards
> >>> > Aravinda
> >>> >
> >>> > On 08/11/2015 04:36 PM, Aravinda wrote:
> >>> >> Hi,
> >>> >>
> >>> >> We are considering different approaches to add support in
> >>> >> Geo-replication for Sharded Gluster Volumes [1].
> >>> >>
> >>> >> *Approach 1: Geo-rep: Sync Full file*
> >>> >> - In the Changelog, only record the main file's details, in the same
> >>> >>   brick where it is created.
> >>> >> - Record DATA in the Changelog whenever there is any addition/change
> >>> >>   to the sharded file.
> >>> >> - Geo-rep rsync will checksum the full file from the mount and sync
> >>> >>   it as a new file.
> >>> >> - Slave-side sharding is managed by the Slave Volume.
> >>> >>
> >>> >> *Approach 2: Geo-rep: Sync sharded files separately*
> >>> >> - Geo-rep rsync will checksum the sharded files only.
> >>> >> - Geo-rep syncs each sharded file independently as a new file.
> >>> >> - [UNKNOWN] Sync internal xattrs (file size and block count) in the
> >>> >>   main sharded file to the Slave Volume to maintain the same state
> >>> >>   as in the Master.
> >>> >> - The Sharding translator should allow file creation under the
> >>> >>   .shards dir for gsyncd, that is, when the Parent GFID is the
> >>> >>   .shards directory.
> >>> >> - If sharded files are modified during a Geo-rep run, we may end up
> >>> >>   with stale data in the Slave.
> >>> >> - Files on the Slave Volume may not be readable unless all sharded
> >>> >>   files are synced to the Slave (each brick in the Master
> >>> >>   independently syncs files to the slave).
> >>> >>
> >>> >> The first approach looks cleaner, but we have to analyze the rsync
> >>> >> checksum performance on big files (sharded in the backend, accessed
> >>> >> as one big file from rsync).
> >>> >>
> >>> >> Let us know your thoughts. Thanks
> >>> >>
> >>> >> Ref:
> >>> >> [1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
> >>> >>
> >>> >> --
> >>> >> regards
> >>> >> Aravinda
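To make approach 2 a little more concrete, below is a rough Python sketch of a
replay loop for per-chunk records like the ones in the example above. It
follows the simplified CREATE/DATA/META notation used in this thread rather
than the real on-disk changelog encoding, and create_entry / sync_data /
sync_xattrs are hypothetical stand-ins for whatever gsyncd would actually do
against the slave mount.

# Minimal, illustrative replay of the per-chunk records shown above.
# Assumes only this thread's simplified notation, not the real changelog
# format; the three callbacks are hypothetical placeholders.

def replay(entries, create_entry, sync_data, sync_xattrs):
    for entry in entries:
        fields = entry.split()
        op, gfid = fields[0], fields[1]
        if op == "CREATE":
            target = fields[2]
            if "/" in target:
                # e.g. "PGS/G1.1": a chunk under the .shards directory;
                # gsyncd (client PID -1) must be allowed to create it there.
                parent_gfid, name = target.split("/", 1)
            else:
                # e.g. "f1": the main file, created under its own parent dir.
                parent_gfid, name = None, target
            create_entry(parent_gfid, name, gfid)
        elif op == "DATA":
            sync_data(gfid)       # rsync the chunk as an independent file
        elif op == "META":
            sync_xattrs(gfid)     # size/block-count xattrs all land on G1

entries = [
    "CREATE G1 f1", "DATA G1", "META G1",
    "CREATE G2 PGS/G1.1", "DATA G2", "META G1",
]
replay(entries,
       create_entry=lambda parent, name, gfid: print("create", parent, name, gfid),
       sync_data=lambda gfid: print("data", gfid),
       sync_xattrs=lambda gfid: print("meta", gfid))

The intent of the sketch is just to show that each chunk is treated as an
ordinary create plus data sync, while the metadata updates all refer back to
the main file's GFID, as in the example above.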
> >
> > regards
> > Aravinda
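On the xattr point discussed earlier in the thread (Geo-rep syncing the
trusted.glusterfs.shard.block-size xattr of the main file to the Slave), a
minimal sketch of what that could look like from gsyncd's side is below. It
assumes aux mounts of the master and slave volumes at made-up paths and the
special gsyncd (client PID -1) access described above; a normal client would
not be expected to read or write these trusted.* xattrs through the mount.

import os

SHARD_BLOCK_SIZE_XATTR = "trusted.glusterfs.shard.block-size"

def sync_shard_block_size(master_file, slave_file):
    # Only the first (main) file carries this xattr; files that were never
    # sharded simply won't have it, so a missing xattr is not an error.
    try:
        value = os.getxattr(master_file, SHARD_BLOCK_SIZE_XATTR)
    except OSError:
        return
    os.setxattr(slave_file, SHARD_BLOCK_SIZE_XATTR, value)

# Hypothetical aux-mount paths, for illustration only.
sync_shard_block_size("/mnt/master-aux/f1", "/mnt/slave-aux/f1")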
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel