Re: [Gluster-devel] State of the 4.0 World
- Original Message - > From: "Dan Lambright"> To: "Gluster Devel" > Sent: Saturday, June 11, 2016 12:42:49 AM > Subject: Re: [Gluster-devel] State of the 4.0 World > > > > - Original Message - > > From: "Jeff Darcy" > > To: "Gluster Devel" > > Sent: Tuesday, May 3, 2016 11:50:30 AM > > Subject: [Gluster-devel] State of the 4.0 World > > > > One of my recurring action items at community meetings is to report to > > the list on how 4.0 is going. So, here we go. > > > > The executive summary is that 4.0 is on life support. Many features > > were proposed - some quite ambitious. Many of those *never* had anyone > > available to work on them. Of those that did, many have either been > > pulled forward into 3.8 (which is great) or lost what resources they had > > (which is bad). Downstream priorities have been the biggest cause of > > those resource losses, though other factors such as attrition have also > > played a part. Net result is that, with the singular exception of > > GlusterD 2.0, progress on 4.0 has all but stopped. I'll provide more > > details below. Meanwhile, I'd like to issue a bit of a call to action > > here, in two parts. > > > > * Many of the 4.0 sub-projects are still unstaffed. Some of them are > >in areas of code where our combined expertise is thin. For example, > >"glusterfsd" is where we need to make many brick- and > >daemon-management changes for 4.0, but it has no specific maintainer > >other than the project architects so nobody touches it. Over the > >past year it has been touched by fewer than two patches per month, > >mostly side effects of patches which were primarily focused elsewhere > >(less than 400 lines changed). It can be challenging to dive into > >such a "fallow" area, but it can also be an opportunity to make a big > >difference, show off one's skill, and not have to worry much about > >conflicts with other developers' changes. 
Taking on projects like > >these is how people get from contributing to leading (FWIW it's how I > >did), so I encourage people to make the leap. > > > > * I've been told that some people have asked how 4.0 is going to affect > >existing components for which they are responsible. Please note that > >only two components are being replaced - GlusterD and DHT. The DHT2 > >changes are going to affect storage/posix a lot, so that *might* be > >considered a third replacement. JBR (formerly NSR) is *not* going to > >replace AFR or EC any time soon. In fact, I'm making significant > >efforts to create common infrastructure that will also support > >running AFR/EC on the server side, with many potential benefits to > >them and their developers. However, just about every other component > >is going to be affected to some degree, if only to use the 4.0 > >CLI/volgen plugin interfaces instead of being hard-coded into their > >current equivalents. 4.0 tests are also expected to be based on > >Distaf rather than TAP (the .t infrastructure) so there's a lot of > >catch-up to be done there. In other cases there are deeper issues to > >be resolved, and many of those discussions - e.g. regarding quota or > >georep - have already been ongoing. There will eventually be a > >Gluster 4.0, even if it happens after I'm retired and looks nothing > >like what I describe below. If you're responsible for any part of > >GlusterFS, you're also responsible for understanding how 4.0 will > >affect that part. > > > > With all that said, I'm going to give item-by-item details of where we > > stand. I'll use > > > > http://www.gluster.org/community/documentation/index.php/Planning40 > > > > as a starting point, even though (as you'll see) in some ways it's out > > of date. > > > > * GlusterD 2 is still making good progress, under Atin's and Kaushal's > >leadership. There are designs for most of the important pieces, and > >a significant amount of code which we should be able to demo soon. 
> > > > * DHT2 had been making good progress for a while, but has been stalled > >recently as its lead developer (Shyam) has been unavailable. > >Hopefully we'll get him back soon, and progress will accelerate > >again. > > DHT-2 will consolidate metadata on a server. This has the potential to help > gluster's tiering implementation significantly, as it will not need to > replicate directories on both the hot and cold tier. Chatting with Shyam, > there appears to be three work items related to tiering and DHT-2. > > 1. > > An unmodified tiering translator "should" work with DHT-2. But to realize > DHT-2's benefits, the tiering translator would need to be modified so > metadata related FOPs are directed to only go to the tier on which the > metadata resides. > > 2. > > "metadata" refers to
Re: [Gluster-devel] tarissue.t spurious failure
on 3.8 branch ./tests/basic/afr/tarissue.t failing https://build.gluster.org/job/rackspace-regression-2GB-triggered/20965/consoleFull - Original Message - > From: "Ravishankar N"> To: "Krutika Dhananjay" > Cc: "Gluster Devel" > Sent: Thursday, May 19, 2016 6:21:42 PM > Subject: Re: [Gluster-devel] tarissue.t spurious failure > > On 05/19/2016 04:47 PM, Ravishankar N wrote: > > > > On 05/19/2016 04:44 PM, Krutika Dhananjay wrote: > > > > Also, I must add that I ran it in a loop on my laptop for about 4 hours and > it ran without any failure. > > There seems to be a genuine problem. The test was failing on my machine 1/4 > times on master. > > > Okay, so that was on an old machine having tar-1.23. Installed tar-1.29 on it > and ran the test 100 times, no failure. > Does anyone have a suggestion? Can the tar version be upgraded on the slaves? > The test that fails is the tar command because it spews out the warning due > to which the return value is not zero: > FAILED COMMAND: tar cvf /tmp/dir1.tar.gz /mnt/nfs/0/nfs/dir1 > tar: Removing leading `/' from member names tar: > /mnt/nfs/0/nfs/dir1/dir2/file6: file changed as we read it > -Ravi > > > > > > > > > > -Krutika > > On Thu, May 19, 2016 at 4:42 PM, Krutika Dhananjay < kdhan...@redhat.com > > wrote: > > > > tests/basic/afr/tarissue.t fails sometimes on jenkins centos slave(s). > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/20915/consoleFull > > -Krutika > > > > > > > ___ > Gluster-devel mailing list Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Worm translator not truly disabled by default?
Hi Vijay, The patch is http://review.gluster.org/#/c/14367/ we will merge it once the regression have passed. Regards, Joe - Original Message - > From: "Vijay Bellur" <vbel...@redhat.com> > To: "Karthik Subrahmanya" <ksubr...@redhat.com> > Cc: "Joseph Fernandes" <josfe...@redhat.com>, "Krutika Dhananjay" > <kdhan...@redhat.com>, "Atin Mukherjee" > <amukh...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org> > Sent: Thursday, May 19, 2016 12:00:06 PM > Subject: Re: [Gluster-devel] Worm translator not truly disabled by default? > > Hi Karthik, > > Would it be possible for you to backport Krutika's patch [1] to release-3.8? > > I am running tests with 3.8rc0 and am running into excessive logging > problems addressed by the patch. > > Thanks, > Vijay > > [1] http://review.gluster.org/#/c/14182/ > > On Wed, May 4, 2016 at 1:16 AM, Karthik Subrahmanya <ksubr...@redhat.com> > wrote: > > Thanks Krutika, Atin, Joseph for the inputs. I will send out a patch with > > this issue fixed. > > > > Regards, > > Karthik > > > > - Original Message - > >> From: "Joseph Fernandes" <josfe...@redhat.com> > >> To: "Karthik Subrahmanya" <ksubr...@redhat.com>, "Krutika Dhananjay" > >> <kdhan...@redhat.com> > >> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Atin Mukherjee" > >> <amukh...@redhat.com> > >> Sent: Wednesday, May 4, 2016 6:21:23 AM > >> Subject: Re: [Gluster-devel] Worm translator not truly disabled by > >> default? > >> > >> Well I completely agree with Krutika that doing a getxattr for every FOP > >> is > >> not required > >> if the worm or worm-file option is off. > >> > >> Karthik, > >> And you need to check if the worm or worm-file option is set, then only go > >> ahead and do the checking. > >> For now as the feature is experimental and the whole purpose is to provide > >> the WORM/Retention semantic > >> experience to user. > >> Later when the feature matures, Once the volume is changed to "Enterprise > >> WORM/Retention" Mode,there > >> would be no going back. 
> >> > >> Could you please send out a patch for this asap ? > >> > >> Regards, > >> Joe > >> > >> - Original Message - > >> > From: "Atin Mukherjee" <amukh...@redhat.com> > >> > To: "Karthik Subrahmanya" <ksubr...@redhat.com>, "Krutika Dhananjay" > >> > <kdhan...@redhat.com> > >> > Cc: "Gluster Devel" <gluster-devel@gluster.org> > >> > Sent: Tuesday, May 3, 2016 6:22:55 PM > >> > Subject: Re: [Gluster-devel] Worm translator not truly disabled by > >> > default? > >> > > >> > > >> > > >> > On 05/03/2016 05:10 PM, Karthik Subrahmanya wrote: > >> > > > >> > > > >> > > - Original Message - > >> > >> From: "Krutika Dhananjay" <kdhan...@redhat.com> > >> > >> To: "Joseph Fernandes" <josfe...@redhat.com>, "Karthik Subrahmanya" > >> > >> <ksubr...@redhat.com> > >> > >> Cc: "Gluster Devel" <gluster-devel@gluster.org> > >> > >> Sent: Tuesday, May 3, 2016 2:53:02 PM > >> > >> Subject: Worm translator not truly disabled by default? > >> > >> > >> > >> Hi, > >> > >> > >> > >> I noticed while testing that worm was sending in fgetxattr() fops as > >> > >> part > >> > >> of a writev() request from the parent, despite being disabled by > >> > >> default. > >> > >> > >> > > This is because of the new feature called "file level worm" which is > >> > > introduced in the worm > >> > > translator. This will allow to make individual files as worm/retained > >> > > by > >> > > setting the volume > >> > > option "worm-file-level". The files which are created when this option > >> > > is > >> > > enabled will have > >> > > an xattr called "trusted.worm_file". This is implemented because > >> > > unlike > >> > > read-only or
Re: [Gluster-devel] Requesting for review
Hi Niels,

Can we have this patch merged? The review comments are fixed and I don't see any more changes required for this patch.

Krutika/Atin, please let us know if there are any issues or concerns.

Regards,
Joe

- Original Message -
> From: "Karthik Subrahmanya"
> To: "Krutika Dhananjay", "Atin Mukherjee"
> Cc: "Gluster Devel"
> Sent: Friday, May 6, 2016 3:52:58 PM
> Subject: [Gluster-devel] Requesting for review
>
> Hi,
>
> Could you please review this fix?
> http://review.gluster.org/#/c/14182/
>
> Thanks,
> Karthik

Re: [Gluster-devel] tests/performance/open-behind.t fails on NetBSD
./tests/performance/open-behind.t is failing continuously on 3.7.11 https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/16678/console ~Joe - Original Message - > From: "Atin Mukherjee"> To: "Ravishankar N" , "Gluster Devel" > > Sent: Monday, April 4, 2016 11:19:09 AM > Subject: Re: [Gluster-devel] tests/performance/open-behind.t fails on NetBSD > > Have you rebased your patch? open-behind.t is now marked bad. > > ~Atin > > On 04/04/2016 11:16 AM, Ravishankar N wrote: > > > > Test Summary Report > > --- > > ./tests/performance/open-behind.t (Wstat: 0 Tests: 18 Failed: 4) > > Failed tests: 15-18 > > > > > > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15553/consoleFull > > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15564/consoleFull > > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15577/consoleFull > > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15580/consoleFull > > > > > > > > ___ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Worm translator not truly disabled by default?
Well I completely agree with Krutika that doing a getxattr for every FOP is not required if the worm or worm-file option is off. Karthik, And you need to check if the worm or worm-file option is set, then only go ahead and do the checking. For now as the feature is experimental and the whole purpose is to provide the WORM/Retention semantic experience to user. Later when the feature matures, Once the volume is changed to "Enterprise WORM/Retention" Mode,there would be no going back. Could you please send out a patch for this asap ? Regards, Joe - Original Message - > From: "Atin Mukherjee" <amukh...@redhat.com> > To: "Karthik Subrahmanya" <ksubr...@redhat.com>, "Krutika Dhananjay" > <kdhan...@redhat.com> > Cc: "Gluster Devel" <gluster-devel@gluster.org> > Sent: Tuesday, May 3, 2016 6:22:55 PM > Subject: Re: [Gluster-devel] Worm translator not truly disabled by default? > > > > On 05/03/2016 05:10 PM, Karthik Subrahmanya wrote: > > > > > > - Original Message - > >> From: "Krutika Dhananjay" <kdhan...@redhat.com> > >> To: "Joseph Fernandes" <josfe...@redhat.com>, "Karthik Subrahmanya" > >> <ksubr...@redhat.com> > >> Cc: "Gluster Devel" <gluster-devel@gluster.org> > >> Sent: Tuesday, May 3, 2016 2:53:02 PM > >> Subject: Worm translator not truly disabled by default? > >> > >> Hi, > >> > >> I noticed while testing that worm was sending in fgetxattr() fops as part > >> of a writev() request from the parent, despite being disabled by default. > >> > > This is because of the new feature called "file level worm" which is > > introduced in the worm > > translator. This will allow to make individual files as worm/retained by > > setting the volume > > option "worm-file-level". The files which are created when this option is > > enabled will have > > an xattr called "trusted.worm_file". 
This is implemented because unlike > > read-only or volume > > level worm where if the option on the volume is disabled, the entire > > translator will get > > disabled and you can perform any FOP on the files in that volume. But here > > if a file is once > > marked as worm-retained, it should not revert back to the normal state > > where we can change > > its contents even if the worm-file-level option is reset/disabled. So the > > xattr is set on the > > file and every time when a write, link, unlink, rename, or truncate fop > > comes it checks for > > the xattr. > I am not sure with what test Krutika observed it, but if any worm > tunable is not set then ideally we shouldn't hit it. I believe you set > this xattr only when worm-file-level is turned on but that's also > disabled by default. Krutika, could you confirm it? > > Hope it helps. > > > > Thanks & Regards, > > Karthik > >> > >> I've sent a patch for this at http://review.gluster.org/#/c/14182/ > >> I must admit I do not understand the internals of this new translator. > >> > >> Request your feedback/review. > >> > >> -Krutika > >> > > ___ > > Gluster-devel mailing list > > Gluster-devel@gluster.org > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Gluster 3.8 : File level WORM/Retention
Hi All,

I would like to congratulate Karthik for introducing the "File level WORM/Retention" feature (experimental in 3.8) in Gluster v3.8rc0 (http://review.gluster.org/#/c/13429/ patch merged). I would also like to thank Atin, Anoop CS, Vijay M, Niels, Raghavendra Talur and Prashanth Pai for helping Karthik in doing so (reviews and guidance) :)

A few action items are still remaining for 3.8 and should be done before 3.8 is released.

Action items before the 3.8 release — address the review comments from Atin, Vijay and Raghavendra Talur, namely:

1. Testing the effects of the WORM xlator's positioning in the brick stack on other components like barrier (snapshots) and quota, and checking whether there are any immediate bugs. In later versions there will be a client-side WORM-cache xlator, which will cache the worm/retention states of file inodes and return the appropriate errors.

2. Correcting the error path as Vijay has suggested. In worm.c the code is unwinding in all FOPs with errno as -1, which is wrong. Change the code to something like the below:

        if (label == 0)
                goto wind;
        if (label == 1)
                op_errno = EROFS;
        else if (label == 2)
                op_errno = ENOMEM;
        /* unwind here ... */
        goto out;
   wind:
        ret = 0;
        /* wind here ... */
   out:
        return ret;

3. Talur's comment: most of the functions in worm-helper need to have a gf_worm prefix.

4. Caching the retention state in the xlator inode context (stretch goal for 3.8).

Please feel free to add to or update the list if I have missed something.

Regards,
Joe
Re: [Gluster-devel] Change in glusterfs[master]: WORM/Retention Translator: Implementation of file level WORM...
Karthik, Whenever you address a comment please mark it as done, this helps in tracking. Regards, Joe - Original Message - > From: "Karthik Subrahmanya" <ksubr...@redhat.com> > To: "Joseph Fernandes" <josfe...@redhat.com>, anoo...@redhat.com, "Atin > Mukherjee" <amukh...@redhat.com> > Cc: "Raghavendra Talur" <rta...@redhat.com>, "Vivek Agarwal" > <vagar...@redhat.com>, "Dan Lambright" > <dlamb...@redhat.com>, "Gaurav Kumar Garg" <gg...@redhat.com>, "Niels de Vos" > <nde...@redhat.com>, "Prashanth Pai" > <p...@redhat.com>, "mohammed rafi kc" <rkavu...@redhat.com>, "Vijaikumar > Mallikarjuna" <vmall...@redhat.com>, > "sankarshan" <sankars...@redhat.com>, "Avra Sengupta" <aseng...@redhat.com> > Sent: Thursday, April 28, 2016 4:37:17 PM > Subject: Re: Change in glusterfs[master]: WORM/Retention Translator: > Implementation of file level WORM... > > Hi all, > > I have addressed the review comments. Please review the patch. > > Thanks & Regards, > Karthik > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Compressing CTR DB
Discussion till now.

~Joe

- Forwarded Message -
> From: "Joseph Fernandes" <josfe...@redhat.com>
> To: "Dan Lambright" <dlamb...@redhat.com>
> Sent: Wednesday, April 27, 2016 9:24:57 PM
> Subject: Re: Compressing CTR DB
>
> Answers inline.
>
> - Original Message -
> > From: "Dan Lambright" <dlamb...@redhat.com>
> > Sent: Wednesday, April 27, 2016 8:39:06 PM
> > Subject: Re: Compressing CTR DB
> >
> > To compress the uuid, is it a lossless compression algorithm? So we won't
> > lose any bits?
>
> We don't compress using any compression algorithm. Instead of saving the
> GFID (which is a UUID) as a printable string, which takes 33-36 bytes, we
> will save it as a 16-byte integer/blob, the way it is represented in memory.
>
> > If we do an upgrade and change the schema, could we delete the old db and
> > start a brand new one? An upgrade is a rare event?
>
> In an upgrade scenario we can do the following until GFID-to-path
> conversion comes:
> 1. Decommission the old DB and start using the new DB.
> 2. The new DB will be healed in two ways:
>    a. Named lookup: though the file heat table will be healed by any other
>       operation, the file link table (the one with multiple hardlinks) will
>       not be healed until and unless there is a named lookup.
>    b. A background heal from old to new via a separate thread in the brick.
>       Yes, there might be a performance hit, and this can be contained
>       using a throttling mechanism.
>
> Again, the question of how often a user upgrades? It might be a rare event,
> but stability shouldn't be affected.
>
> As discussed in the scrum, let's speak to Aravinda and Shyam about this
> issue of GFID-to-path conversion next week. There is a proposal, but
> nothing implemented and functional as we have in the DB. But yes, we need
> to move it out of the DB as it's not why we got the DB.
>
> > Agree we need a version # as part of the solution.
>
> Yes, we will have a version of the schema in the DB itself.
> > - Joseph Fernandes <josfe...@redhat.com> wrote:
> > > Hi All,
> > >
> > > As I am working on shrinking the CTR DB size, I came across a few
> > > articles/blogs on this. As predicted, saving the UUID as 16 bytes
> > > rather than 36-byte text will give us at least a 46% reduction in disk
> > > and cache space. Plus, the blogs do suggest some performance gain (if
> > > we don't often convert the UUID to a string).
> > >
> > > http://wtanaka.com/node/8106
> > > https://scion.duhs.duke.edu/vespa/project/wiki/DatabaseUuidEfficiency
> > >
> > > The changes in the current libgfdb code are at the sqlite level. But
> > > since there is a change in schema, we need to write DB data migration
> > > scripts for upgrades (similar to the dual connection path). Speaking of
> > > which, we would need DB schema versions and need to have them stored in
> > > gluster (either in glusterd, the DB, or the namespace), as we expect
> > > the schema to change as we fine-tune our heat store.
> > >
> > > Regards,
> > > Joe
[Gluster-devel] WORM patch review for 3.8
Hi Folks, Please review the WORM/Retention patch by Karthik, so that we can have it in 3.8 http://review.gluster.org/#/c/13429/ Regards, Joe ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] glusterd crashing
http://www.gluster.org/community/documentation/index.php/Archives/Development_Work_Flow http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow Leaving the fun of exploration to you :) ~Joe - Original Message - > From: "Ajil Abraham" <ajil95.abra...@gmail.com> > To: "Atin Mukherjee" <atin.mukherje...@gmail.com> > Cc: "Joseph Fernandes" <josfe...@redhat.com>, "Gluster Devel" > <gluster-devel@gluster.org> > Sent: Saturday, March 5, 2016 10:06:37 PM > Subject: Re: [Gluster-devel] glusterd crashing > > Sure Atin. I am itching to contribute code. But worried due to lack of > experience in sending patches. Can somebody please send me across how to do > this? Consider me a total newbie and please be as descriptive as possible > :). > > -Ajil > > On Sat, Mar 5, 2016 at 12:46 PM, Atin Mukherjee <atin.mukherje...@gmail.com> > wrote: > > > -Atin > > Sent from one plus one > > On 05-Mar-2016 11:46 am, "Ajil Abraham" <ajil95.abra...@gmail.com> wrote: > > > > > > Thanks for all the support. After handling the input validation in my > > code, Glusterd no longer crashes. I am still waiting for clearance from my > > superior to pass on all the details. Expecting him to revert by this > > Sunday. > > Great to know that and we appreciate your contribution, if you happen to > > find any issues feel free to send patches :) > > > > > > - Ajil > > > > > > On Fri, Mar 4, 2016 at 10:20 AM, Joseph Fernandes <josfe...@redhat.com> > > wrote: > > >> > > >> Well that may not be completely correct ! > > >> > > >> Its "gluster volume status all", unlike volume maintenance operation > > which are rare. > > >> > > >> Status can be issued multiple times in a day or might be put in a > > script/cron-job to check the health of the > > >> cluster. > > >> But anyways the fix is ready as the bug says. > > >> > > >> Crash is what we need to worry about. 
> > >> > > >> ~Joe > > >> > > >> - Original Message - > > >> > From: "Atin Mukherjee" <amukh...@redhat.com> > > >> > To: "Joseph Fernandes" <josfe...@redhat.com>, "Atin Mukherjee" < > > atin.mukherje...@gmail.com> > > >> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Ajil Abraham" < > > ajil95.abra...@gmail.com> > > >> > Sent: Friday, March 4, 2016 9:37:43 AM > > >> > Subject: Re: [Gluster-devel] glusterd crashing > > >> > > > >> > > > >> > > > >> > On 03/04/2016 07:10 AM, Joseph Fernandes wrote: > > >> > > Might be this bug can give some context on the mem-leak (fix > > recently > > >> > > merged on master but not on 3.7.x) > > >> > > > > >> > > https://bugzilla.redhat.com/show_bug.cgi?id=1287517 > > >> > Yes, this is what we'd be fixing in 3.7.x too, but if you refer to [1] > > >> > the hike is seen when a command is run in a loop which is typically > > not > > >> > a use case in any production setup. > > >> > > > >> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1287517#c15 > > >> > > > > >> > > ~Joe > > >> > > > > >> > > > > >> > > - Original Message - > > >> > >> From: "Atin Mukherjee" <atin.mukherje...@gmail.com> > > >> > >> To: "Joseph Fernandes" <josfe...@redhat.com> > > >> > >> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Ajil Abraham" > > >> > >> <ajil95.abra...@gmail.com> > > >> > >> Sent: Friday, March 4, 2016 7:01:54 AM > > >> > >> Subject: Re: [Gluster-devel] glusterd crashing > > >> > >> > > >> > >> -Atin > > >> > >> Sent from one plus one > > >> > >> On 04-Mar-2016 6:12 am, "Joseph Fernandes" <josfe...@redhat.com> > > wrote: > > >> > >>> > > >> > >>> Hi Ajil, > > >> > >>> > > >> > >>> Well few things, > > >> > >>> > > >> > >>> 1. Whenever you see a crash its better to send a
[Gluster-devel] Warning while building "master"
Hi All,

There is a warning while building the master code:

Making install in fdl
Making install in src
  CC       fdl.lo
  CCLD     fdl.la
  CC       logdump.o
  CC       libfdl.o
  CCLD     gf_logdump
  CC       recon.o
  CC       librecon.o
librecon.c: In function ‘fdl_replay_rename’:
librecon.c:158:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_setxattr’:
librecon.c:275:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_fsetxattr’:
librecon.c:495:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c:492:9: warning: ‘dict’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (dict);
         ^
librecon.c: In function ‘fdl_replay_fremovexattr’:
librecon.c:587:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_xattrop’:
librecon.c:704:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_create’:
librecon.c:827:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_discard’:
librecon.c:916:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_writev’:
librecon.c:1127:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c:1124:9: warning: ‘iobref’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         iobref_unref (iobref);
         ^
librecon.c: In function ‘fdl_replay_fallocate’:
librecon.c:1310:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_zerofill’:
librecon.c:1584:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_link’:
librecon.c:1690:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_fxattrop’:
librecon.c:1810:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c:1807:9: warning: ‘dict’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (dict);
         ^
librecon.c: In function ‘fdl_replay_ftruncate’:
librecon.c:1896:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_fsetattr’:
librecon.c:2194:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
librecon.c: In function ‘fdl_replay_removexattr’:
librecon.c:2283:9: warning: ‘xdata’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         dict_unref (xdata);
         ^
  CCLD     gf_recon
/usr/bin/mkdir -p '/usr/local/sbin'
/bin/sh ../../../../libtool --quiet --mode=install /usr/bin/install -c gf_logdump gf_recon '/usr/local/sbin'
/usr/bin/mkdir -p '/usr/local/lib/glusterfs/3.8dev/xlator/experimental'
/bin/sh ../../../../libtool --quiet --mode=install /usr/bin/install -c fdl.la '/usr/local/lib/glusterfs/3.8dev/xlator/experimental'
libtool: install: warning: relinking `fdl.la'

Regards,
Joe
Re: [Gluster-devel] glusterd crashing
Well that may not be completely correct ! Its "gluster volume status all", unlike volume maintenance operation which are rare. Status can be issued multiple times in a day or might be put in a script/cron-job to check the health of the cluster. But anyways the fix is ready as the bug says. Crash is what we need to worry about. ~Joe - Original Message - > From: "Atin Mukherjee" <amukh...@redhat.com> > To: "Joseph Fernandes" <josfe...@redhat.com>, "Atin Mukherjee" > <atin.mukherje...@gmail.com> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Ajil Abraham" > <ajil95.abra...@gmail.com> > Sent: Friday, March 4, 2016 9:37:43 AM > Subject: Re: [Gluster-devel] glusterd crashing > > > > On 03/04/2016 07:10 AM, Joseph Fernandes wrote: > > Might be this bug can give some context on the mem-leak (fix recently > > merged on master but not on 3.7.x) > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1287517 > Yes, this is what we'd be fixing in 3.7.x too, but if you refer to [1] > the hike is seen when a command is run in a loop which is typically not > a use case in any production setup. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1287517#c15 > > > > ~Joe > > > > > > - Original Message - > >> From: "Atin Mukherjee" <atin.mukherje...@gmail.com> > >> To: "Joseph Fernandes" <josfe...@redhat.com> > >> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Ajil Abraham" > >> <ajil95.abra...@gmail.com> > >> Sent: Friday, March 4, 2016 7:01:54 AM > >> Subject: Re: [Gluster-devel] glusterd crashing > >> > >> -Atin > >> Sent from one plus one > >> On 04-Mar-2016 6:12 am, "Joseph Fernandes" <josfe...@redhat.com> wrote: > >>> > >>> Hi Ajil, > >>> > >>> Well few things, > >>> > >>> 1. Whenever you see a crash its better to send across the Backtrace(BT) > >> using gdb and attach the log files (or share it via some cloud drive) > >>> > >>> 2. About the memory leak, What kind of tools are you using for profiling > >> memory, valgrind ? if so please attach the valgrind reports. 
> >>>$> glusterd --xlator-option *.run-with-valgrind=yes > >>> > >>> 3. Well I am not sure if glusterd uses any of the mempools as we do in > >> client and brick processes, Atin can shed some light on this. > >>>Well In that case you can used the statedump mechanism check for > >> mem-leaks check the glusterfs/doc/debugging/statedump.md > >> GlusterD does use mempool and it has infra for capturing statedump as > >> well. > >> I am aware of few bytes of memory leaks in few paths which is really not a > >> huge concern but it shouldn't crash. > >>> > >>> Hope this helps > >>> > >>> ~Joe > >>> > >>> > >>> - Original Message - > >>>> From: "Ajil Abraham" <ajil95.abra...@gmail.com> > >>>> To: "Atin Mukherjee" <atin.mukherje...@gmail.com> > >>>> Cc: "Gluster Devel" <gluster-devel@gluster.org> > >>>> Sent: Thursday, March 3, 2016 10:48:56 PM > >>>> Subject: Re: [Gluster-devel] glusterd crashing > >>>> > >>>> Hi Atin, > >>>> > >>>> The inputs I use are as per the requirements of a project I am working > >> on for > >>>> one of the large finance institutions in Dubai. I will try to handle the > >>>> input validation within my code. I uncovered some of the issues while > >> doing > >>>> a thorough testing of my code. > >>>> > >>>> I tried with 3.7.6 and also my own build from master branch. I will > >> check > >>>> with my superiors before sending you backtrace and other details. So > >> far, I > >>>> have seen memory leak in 100s of KBs. > >>>> > >>>> -Ajil > >>>> > >>>> > >>>> On Thu, Mar 3, 2016 at 10:17 PM, Atin Mukherjee < > >> atin.mukherje...@gmail.com > >>>>> wrote: > >>>> > >>>> > >>>> > >>>> > >>>> Hi Ajil, > >>>> > >>>> Its good to see that you are doing a thorough testing gluster. From > >> your mail > >
Re: [Gluster-devel] glusterd crashing
Might be this bug can give some context on the mem-leak (fix recently merged on master but not on 3.7.x) https://bugzilla.redhat.com/show_bug.cgi?id=1287517 ~Joe - Original Message - > From: "Atin Mukherjee" <atin.mukherje...@gmail.com> > To: "Joseph Fernandes" <josfe...@redhat.com> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Ajil Abraham" > <ajil95.abra...@gmail.com> > Sent: Friday, March 4, 2016 7:01:54 AM > Subject: Re: [Gluster-devel] glusterd crashing > > -Atin > Sent from one plus one > On 04-Mar-2016 6:12 am, "Joseph Fernandes" <josfe...@redhat.com> wrote: > > > > Hi Ajil, > > > > Well few things, > > > > 1. Whenever you see a crash its better to send across the Backtrace(BT) > using gdb and attach the log files (or share it via some cloud drive) > > > > 2. About the memory leak, What kind of tools are you using for profiling > memory, valgrind ? if so please attach the valgrind reports. > >$> glusterd --xlator-option *.run-with-valgrind=yes > > > > 3. Well I am not sure if glusterd uses any of the mempools as we do in > client and brick processes, Atin can shed some light on this. > >Well In that case you can used the statedump mechanism check for > mem-leaks check the glusterfs/doc/debugging/statedump.md > GlusterD does use mempool and it has infra for capturing statedump as well. > I am aware of few bytes of memory leaks in few paths which is really not a > huge concern but it shouldn't crash. > > > > Hope this helps > > > > ~Joe > > > > > > - Original Message - > > > From: "Ajil Abraham" <ajil95.abra...@gmail.com> > > > To: "Atin Mukherjee" <atin.mukherje...@gmail.com> > > > Cc: "Gluster Devel" <gluster-devel@gluster.org> > > > Sent: Thursday, March 3, 2016 10:48:56 PM > > > Subject: Re: [Gluster-devel] glusterd crashing > > > > > > Hi Atin, > > > > > > The inputs I use are as per the requirements of a project I am working > on for > > > one of the large finance institutions in Dubai. 
I will try to handle the > > > input validation within my code. I uncovered some of the issues while > doing > > > a thorough testing of my code. > > > > > > I tried with 3.7.6 and also my own build from master branch. I will > check > > > with my superiors before sending you backtrace and other details. So > far, I > > > have seen memory leak in 100s of KBs. > > > > > > -Ajil > > > > > > > > > On Thu, Mar 3, 2016 at 10:17 PM, Atin Mukherjee < > atin.mukherje...@gmail.com > > > > wrote: > > > > > > > > > > > > > > > Hi Ajil, > > > > > > Its good to see that you are doing a thorough testing gluster. From > your mail > > > it looks like your automation focuses on mostly negative tests. I need > few > > > additional details to get to know whether they are known: > > > > > > 1. Version of gluster > > > 2. Backtrace of the crash along with reproducer > > > 3. Amount of memory leak in terms of bytes/KB/MB?? Have you already > > > identified them? > > > > > > -Atin > > > Sent from one plus one > > > On 03-Mar-2016 10:01 pm, "Ajil Abraham" < ajil95.abra...@gmail.com > > wrote: > > > > > > > > > > > > For my project, I am trying to do some automation using glusterd. It is > very > > > frustrating to see it crashing frequently. Looks like input validation > is > > > the culprit. I also see lot of buffer overflow and memory leak issues. > > > Making a note of these and will try to fix them. Surprised to see such > basic > > > issues still existing in Gluster. > > > > > > -Ajil > > > > > > ___ > > > Gluster-devel mailing list > > > Gluster-devel@gluster.org > > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > ___ > > > Gluster-devel mailing list > > > Gluster-devel@gluster.org > > > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] glusterd crashing
Hi Ajil, Well few things, 1. Whenever you see a crash its better to send across the Backtrace(BT) using gdb and attach the log files (or share it via some cloud drive) 2. About the memory leak, What kind of tools are you using for profiling memory, valgrind ? if so please attach the valgrind reports. $> glusterd --xlator-option *.run-with-valgrind=yes 3. Well I am not sure if glusterd uses any of the mempools as we do in client and brick processes, Atin can shed some light on this. Well In that case you can used the statedump mechanism check for mem-leaks check the glusterfs/doc/debugging/statedump.md Hope this helps ~Joe - Original Message - > From: "Ajil Abraham"> To: "Atin Mukherjee" > Cc: "Gluster Devel" > Sent: Thursday, March 3, 2016 10:48:56 PM > Subject: Re: [Gluster-devel] glusterd crashing > > Hi Atin, > > The inputs I use are as per the requirements of a project I am working on for > one of the large finance institutions in Dubai. I will try to handle the > input validation within my code. I uncovered some of the issues while doing > a thorough testing of my code. > > I tried with 3.7.6 and also my own build from master branch. I will check > with my superiors before sending you backtrace and other details. So far, I > have seen memory leak in 100s of KBs. > > -Ajil > > > On Thu, Mar 3, 2016 at 10:17 PM, Atin Mukherjee < atin.mukherje...@gmail.com > > wrote: > > > > > Hi Ajil, > > Its good to see that you are doing a thorough testing gluster. From your mail > it looks like your automation focuses on mostly negative tests. I need few > additional details to get to know whether they are known: > > 1. Version of gluster > 2. Backtrace of the crash along with reproducer > 3. Amount of memory leak in terms of bytes/KB/MB?? Have you already > identified them? > > -Atin > Sent from one plus one > On 03-Mar-2016 10:01 pm, "Ajil Abraham" < ajil95.abra...@gmail.com > wrote: > > > > For my project, I am trying to do some automation using glusterd. 
It is very > frustrating to see it crashing frequently. Looks like input validation is > the culprit. I also see lot of buffer overflow and memory leak issues. > Making a note of these and will try to fix them. Surprised to see such basic > issues still existing in Gluster. > > -Ajil > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
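The statedump mechanism suggested above (see glusterfs/doc/debugging/statedump.md) dumps each mem-pool with its allocation counters, and one rough way to hunt the kind of leak Ajil reports is to scan a dump for pools whose hot-count (outstanding allocations) stays unusually high. A minimal sketch, assuming the `pool-name=` / `hot-count=` key-value layout described in that doc (treat the field names as assumptions, not a stable format):

```python
# Scan a glusterd statedump for mem-pools with a high "hot-count"
# (allocated but not yet released objects).  Field names follow the
# layout described in glusterfs/doc/debugging/statedump.md; this is an
# illustrative sketch, not part of gluster itself.

def suspicious_pools(statedump_text, threshold=1000):
    """Return (pool-name, hot-count) pairs whose hot-count exceeds threshold."""
    pools, name = [], None
    for line in statedump_text.splitlines():
        line = line.strip()
        if line.startswith("pool-name="):
            name = line.split("=", 1)[1]
        elif line.startswith("hot-count=") and name is not None:
            hot = int(line.split("=", 1)[1])
            if hot > threshold:
                pools.append((name, hot))
            name = None          # pair consumed; wait for the next pool
    return pools

sample = """\
pool-name=glusterd:dict_t
hot-count=4
cold-count=4092
pool-name=glusterd:data_t
hot-count=163840
cold-count=0
"""
print(suspicious_pools(sample))   # -> [('glusterd:data_t', 163840)]
```

Comparing two dumps taken before and after the command loop from the bug would show which pools grow without shrinking.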
Re: [Gluster-devel] Change in glusterfs[master]: tests: fix bug-860663.t
Oops the authors are not actively contributing to gluster now. Sorry about that. ~Joe - Original Message - > From: "Joseph Fernandes" <josfe...@redhat.com> > To: jda...@redhat.com > Cc: "Gluster Devel" <gluster-devel@gluster.org> > Sent: Monday, February 29, 2016 3:44:57 PM > Subject: Re: [Gluster-devel] Change in glusterfs[master]: tests: fix > bug-860663.t > > Hi Jeff, > > I would suggest add the author of this script to the review panel. > > Regards, > Joe > > - Original Message - > > From: "Jeff Darcy (Code Review)" <rev...@dev.gluster.org> > > To: "Joseph Fernandes" <josfe...@redhat.com> > > Sent: Monday, February 29, 2016 3:42:45 PM > > Subject: Change in glusterfs[master]: tests: fix bug-860663.t > > > > Hello Joseph Fernandes, > > > > I'd like you to do a code review. Please visit > > > > http://review.gluster.org/13544 > > > > to review the following change. > > > > Change subject: tests: fix bug-860663.t > > .. > > > > tests: fix bug-860663.t > > > > Three changes: > > > > * Removed the second round of file creation, which wasn't really > >testing anything useful and was causing spurious failures. Under the > >conditions we've set up, the rational expectation would be for the > >file-creation helper program to succeed, but the test expected it to > >fail. > > > > * Removed Yet Another Unnecessary Sleep. > > > > * Reduced the number of files from 10K to 1K. That's more than > >sufficient to test what we're trying to test, and saves significant > >time. > > > > There's still a bit of a mystery about how this test *ever* passed. If > > I want mystery I'll read a detective novel. The more important thing is > > that the line in question was irrelevant to the purpose of the test > > (which had to do with not allowing a fix-layout while bricks were down) > > and was far too heavyweight to be included as a "while we're here might > > as well" kind of addition. 
> > > > Change-Id: If1c623853745ab42ce7d058d1009bbe1dcc1e985 > > Signed-off-by: Jeff Darcy <jda...@redhat.com> > > --- > > M tests/bugs/distribute/bug-860663.t > > 1 file changed, 5 insertions(+), 6 deletions(-) > > > > > > git pull ssh://git.gluster.org/glusterfs refs/changes/44/13544/1 > > -- > > To view, visit http://review.gluster.org/13544 > > To unsubscribe, visit http://review.gluster.org/settings > > > > Gerrit-MessageType: newchange > > Gerrit-Change-Id: If1c623853745ab42ce7d058d1009bbe1dcc1e985 > > Gerrit-PatchSet: 1 > > Gerrit-Project: glusterfs > > Gerrit-Branch: master > > Gerrit-Owner: Jeff Darcy <jda...@redhat.com> > > Gerrit-Reviewer: Joseph Fernandes <josfe...@redhat.com> > > Gerrit-Reviewer: N Balachandran <nbala...@redhat.com> > > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Change in glusterfs[master]: tests: fix bug-860663.t
Hi Jeff, I would suggest add the author of this script to the review panel. Regards, Joe - Original Message - > From: "Jeff Darcy (Code Review)" <rev...@dev.gluster.org> > To: "Joseph Fernandes" <josfe...@redhat.com> > Sent: Monday, February 29, 2016 3:42:45 PM > Subject: Change in glusterfs[master]: tests: fix bug-860663.t > > Hello Joseph Fernandes, > > I'd like you to do a code review. Please visit > > http://review.gluster.org/13544 > > to review the following change. > > Change subject: tests: fix bug-860663.t > .. > > tests: fix bug-860663.t > > Three changes: > > * Removed the second round of file creation, which wasn't really >testing anything useful and was causing spurious failures. Under the >conditions we've set up, the rational expectation would be for the >file-creation helper program to succeed, but the test expected it to >fail. > > * Removed Yet Another Unnecessary Sleep. > > * Reduced the number of files from 10K to 1K. That's more than >sufficient to test what we're trying to test, and saves significant >time. > > There's still a bit of a mystery about how this test *ever* passed. If > I want mystery I'll read a detective novel. The more important thing is > that the line in question was irrelevant to the purpose of the test > (which had to do with not allowing a fix-layout while bricks were down) > and was far too heavyweight to be included as a "while we're here might > as well" kind of addition. 
> > Change-Id: If1c623853745ab42ce7d058d1009bbe1dcc1e985 > Signed-off-by: Jeff Darcy <jda...@redhat.com> > --- > M tests/bugs/distribute/bug-860663.t > 1 file changed, 5 insertions(+), 6 deletions(-) > > > git pull ssh://git.gluster.org/glusterfs refs/changes/44/13544/1 > -- > To view, visit http://review.gluster.org/13544 > To unsubscribe, visit http://review.gluster.org/settings > > Gerrit-MessageType: newchange > Gerrit-Change-Id: If1c623853745ab42ce7d058d1009bbe1dcc1e985 > Gerrit-PatchSet: 1 > Gerrit-Project: glusterfs > Gerrit-Branch: master > Gerrit-Owner: Jeff Darcy <jda...@redhat.com> > Gerrit-Reviewer: Joseph Fernandes <josfe...@redhat.com> > Gerrit-Reviewer: N Balachandran <nbala...@redhat.com> > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Need help with bitrot
Hope this helps. Courtesy: Raghvendra Talur (rta...@redhat.com)

1. Clone the glusterfs repo to your laptop and get acquainted with the dev workflow. https://gluster.readthedocs.org/en/latest/Developer-guide/Developers-Index/
2. If you find using your laptop as the test machine for Gluster too scary, here is a Vagrant-based mechanism to easily set up VMs on your laptop for Gluster testing. http://comments.gmane.org/gmane.comp.file-systems.gluster.devel/13494
3. Find my Gluster introduction blog post here in the preview link: https://6227134958232800133_bafac39c28bee4f256bbbef7510c9bb9b44fca05.blogspot.com/b/post-preview?token=s6_4MVIBAAA.zY--3ij00CkDwnitBOwnFBowEvCsKZ0o4ToQ0KYk9Po4pKujPj9ugmn-fm-XUFdLQxU50FmnCxBBr_IkSzuSlA.l_XFe1UvIEAiqkFAZZPdqQ=4168074834715190149=POST
4. Follow all the lessons in the Translator 101 series to build your understanding of Gluster.
http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/
http://hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context
http://hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real
http://hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator
5. Try to fix or understand any of the bugs in this list: https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW_status=ASSIGNED=Community=keywords_id=4424622=substring=GlusterFS_format=advanced=easyfix

Regards, Joe - Original Message - From: "Ajil Abraham" To: "FNU Raghavendra Manjunath" Cc: "Gluster Devel" Sent: Thursday, February 25, 2016 8:58:35 PM Subject: Re: [Gluster-devel] Need help with bitrot Thanks FNU Raghavendra. Does the signing happen only when the file data changes, or even when an extended attribute changes? I am also trying to understand the Gluster internal data structures. Are there any materials for the same? Similarly for the translators: how they are stacked on the client & server side, and how control flows between them. Can somebody please help?
- Ajil On Thu, Feb 25, 2016 at 7:27 AM, FNU Raghavendra Manjunath < rab...@redhat.com > wrote: Hi Ajil, Expiry policy tells the signer (Bit-rot Daemon) to wait for a specific period of time before signing a object. Whenever a object is modified, a notification is sent to the signer by brick process (bit-rot-stub xlator sitting in the I/O path) upon getting a release (i.e. when all the fds of that object are closed). The expiry policy tells the signer to wait for some time (by default its 120 seconds) before signing that object. It is done because, suppose the signer starts signing (i.e. read the object + calculate the checksum + store the checksum) a object the object gets modified again, then a new notification has to be sent and again signer has to sign the object by calculating the checksum. Whereas if the signer waits for some time and receives a new notification on the same object when its waiting, then it can avoid signing for the first notification. Venky, do you want to add anything more? Regards, Raghavendra On Wed, Feb 24, 2016 at 12:28 AM, Ajil Abraham < ajil95.abra...@gmail.com > wrote: Hi, I am a student interested in GlusterFS. Trying to understand the design of GlusterFS. Came across the Bitrot design document in Google. There is a mention of expiry policy used to sign the files. I did not clearly understand what the expiry policy is. Can somebody please help? -Ajil ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
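The expiry policy Raghavendra describes above amounts to a per-object debounce: each release notification (re)arms a timer, and only the last one leads to a signing. A toy model of that coalescing, using the 120-second default from the mail (names and structure are illustrative, not the daemon's actual code):

```python
# Model of the bitrot signer's "expiry" wait: signing is scheduled
# `expiry` seconds after the *last* release notification for an object,
# so an object modified repeatedly within the window is read and
# checksummed only once.  Illustrative sketch only.

EXPIRY = 120  # seconds; the default mentioned in the thread

def plan_signings(release_events, expiry=EXPIRY):
    """release_events: list of (timestamp, object_id) release notifications.
    Returns {object_id: sign_time} -- a later release for the same object
    supersedes the earlier pending timer."""
    pending = {}
    for ts, obj in release_events:
        pending[obj] = ts + expiry   # re-arm: replaces any pending signing
    return pending

# "a" is released again at t=100, inside the 120s window opened at t=0,
# so it is signed once, at 100 + 120 = 220.
events = [(0, "a"), (10, "b"), (100, "a")]
print(plan_signings(events))   # -> {'a': 220, 'b': 130}
```

The saving is the avoided work per superseded notification: each signing means reading the whole object and computing its checksum, so coalescing matters for large, frequently rewritten files.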
Re: [Gluster-devel] Removing Fix layout during attach
Yep, got it! A cold tier may be running crippled, but adding a working hot tier need not bother about it. Still, there is an issue: files in the offline cold subvol will not be healed into the DB, and hence when the cold subvol comes online those files will not be promoted or demoted until a named-lookup heal happens on them. In this case should we give a warning or message in the log about it? - Original Message - From: "Nithya Balachandran" <nbala...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohammed Rafi K C" <rkavu...@redhat.com>, rhgs-tier...@redhat.com Sent: Tuesday, February 23, 2016 9:35:04 AM Subject: Re: Removing Fix layout during attach > Well as add brick to a normal volume do we have this constraint ? > The add brick is a different scenario - regular DHT rebalance requires that all subvols be up. The same need not necessarily be the case for tiering. > - Original Message - > From: "Nithya Balachandran" <nbala...@redhat.com> > To: "Joseph Fernandes" <josfe...@redhat.com> > Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohammed Rafi K C" > <rkavu...@redhat.com>, rhgs-tier...@redhat.com > Sent: Monday, February 22, 2016 4:31:01 PM > Subject: Re: Removing Fix layout during attach > > > +gluster-devel > > > > - Original Message - > > From: "Joseph Fernandes" <josfe...@redhat.com> > > To: "Mohammed Rafi K C" <rkavu...@redhat.com> > > Cc: "Nithya Balachandran" <nbala...@redhat.com>, rhgs-tier...@redhat.com > > Sent: Monday, February 22, 2016 4:21:48 PM > > Subject: Re: Removing Fix layout during attach > > > > Thanks Rafi for the follow up mail after the call. > > > > 1. Yep first we will have a background fixlayout which will serve two > > things > >a. Proactive healing of directories so that the first io need not do it. > >b. CTR Database heal using lookups. > > > > 2. The race (https://bugzilla.redhat.com/show_bug.cgi?id=1286208.)
is the > > view consistency issue of gluster and will happen anyhow. > > > > Regards, > > Joe > > > Something else that can be considered - does the tier daemon really require > all subvols to be up? > > > > > > > ----- Original Message - > > From: "Mohammed Rafi K C" <rkavu...@redhat.com> > > To: "Joseph Fernandes" <josfe...@redhat.com>, "Nithya Balachandran" > > <nbala...@redhat.com> > > Cc: rhgs-tier...@redhat.com > > Sent: Monday, February 22, 2016 4:13:17 PM > > Subject: Re: Removing Fix layout during attach > > > > > > > > On 02/22/2016 12:24 PM, Joseph Fernandes wrote: > > > Hi Nithya/Rafi, > > > > > > Could you please let me know what are the implication of not having fix > > > layout during attach tier. > > > > One of the major problem is the performance drop, can be neutralized by > > having a background fix-layout. Again if we are planning to have a > > background fix-layout, though it is independent of tier functionality it > > is good to have a way to determine whether we completed a successful > > fix-layout at least once, if so we don't need to run fix-layout for > > every tier restart. > > > > Currently tier daemon will be killed if fix-layout fails, I think we can > > remove this constrain. We can abort the back ground fix-layout, but in > > my understanding we don't need to stop the tier process. > > > > > > > > I am looking into this issue now and would like to know what are prior > > > discussion that happened before. Just getting insync. > > This is the one of the open bug [1] that I'm aware of, which can hit in > > this case. But this race is independent of tier and fix-layout all > > together. So we can move forward , and this can be fixed parallel. > > > > [1] : https://bugzilla.redhat.com/show_bug.cgi?id=1286208. > > > > Regards! > > Rafi KC > > > > > > > > Regards, > > > Joe > > > > > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Removing Fix layout during attach
Well as add brick to a normal volume do we have this constraint ? - Original Message - From: "Nithya Balachandran" <nbala...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Mohammed Rafi K C" <rkavu...@redhat.com>, rhgs-tier...@redhat.com Sent: Monday, February 22, 2016 4:31:01 PM Subject: Re: Removing Fix layout during attach > +gluster-devel > > - Original Message - > From: "Joseph Fernandes" <josfe...@redhat.com> > To: "Mohammed Rafi K C" <rkavu...@redhat.com> > Cc: "Nithya Balachandran" <nbala...@redhat.com>, rhgs-tier...@redhat.com > Sent: Monday, February 22, 2016 4:21:48 PM > Subject: Re: Removing Fix layout during attach > > Thanks Rafi for the follow up mail after the call. > > 1. Yep first we will have a background fixlayout which will serve two things >a. Proactive healing of directories so that the first io need not do it. >b. CTR Database heal using lookups. > > 2. The race (https://bugzilla.redhat.com/show_bug.cgi?id=1286208.) is the > view consistency issue of gluster and will happen anyhow. > > Regards, > Joe Something else that can be considered - does the tier daemon really require all subvols to be up? > > > - Original Message - > From: "Mohammed Rafi K C" <rkavu...@redhat.com> > To: "Joseph Fernandes" <josfe...@redhat.com>, "Nithya Balachandran" > <nbala...@redhat.com> > Cc: rhgs-tier...@redhat.com > Sent: Monday, February 22, 2016 4:13:17 PM > Subject: Re: Removing Fix layout during attach > > > > On 02/22/2016 12:24 PM, Joseph Fernandes wrote: > > Hi Nithya/Rafi, > > > > Could you please let me know what are the implication of not having fix > > layout during attach tier. > > One of the major problem is the performance drop, can be neutralized by > having a background fix-layout. 
Again if we are planning to have a > background fix-layout, though it is independent of tier functionality it > is good to have a way to determine whether we completed a successful > fix-layout at least once, if so we don't need to run fix-layout for > every tier restart. > > Currently tier daemon will be killed if fix-layout fails, I think we can > remove this constrain. We can abort the back ground fix-layout, but in > my understanding we don't need to stop the tier process. > > > > > I am looking into this issue now and would like to know what are prior > > discussion that happened before. Just getting insync. > This is the one of the open bug [1] that I'm aware of, which can hit in > this case. But this race is independent of tier and fix-layout all > together. So we can move forward , and this can be fixed parallel. > > [1] : https://bugzilla.redhat.com/show_bug.cgi?id=1286208. > > Regards! > Rafi KC > > > > > Regards, > > Joe > > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Removing Fix layout during attach
+gluster-devel - Original Message - From: "Joseph Fernandes" <josfe...@redhat.com> To: "Mohammed Rafi K C" <rkavu...@redhat.com> Cc: "Nithya Balachandran" <nbala...@redhat.com>, rhgs-tier...@redhat.com Sent: Monday, February 22, 2016 4:21:48 PM Subject: Re: Removing Fix layout during attach Thanks Rafi for the follow up mail after the call. 1. Yep first we will have a background fixlayout which will serve two things a. Proactive healing of directories so that the first io need not do it. b. CTR Database heal using lookups. 2. The race (https://bugzilla.redhat.com/show_bug.cgi?id=1286208.) is the view consistency issue of gluster and will happen anyhow. Regards, Joe - Original Message - From: "Mohammed Rafi K C" <rkavu...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com>, "Nithya Balachandran" <nbala...@redhat.com> Cc: rhgs-tier...@redhat.com Sent: Monday, February 22, 2016 4:13:17 PM Subject: Re: Removing Fix layout during attach On 02/22/2016 12:24 PM, Joseph Fernandes wrote: > Hi Nithya/Rafi, > > Could you please let me know what are the implication of not having fix > layout during attach tier. One of the major problem is the performance drop, can be neutralized by having a background fix-layout. Again if we are planning to have a background fix-layout, though it is independent of tier functionality it is good to have a way to determine whether we completed a successful fix-layout at least once, if so we don't need to run fix-layout for every tier restart. Currently tier daemon will be killed if fix-layout fails, I think we can remove this constrain. We can abort the back ground fix-layout, but in my understanding we don't need to stop the tier process. > > I am looking into this issue now and would like to know what are prior > discussion that happened before. Just getting insync. This is the one of the open bug [1] that I'm aware of, which can hit in this case. But this race is independent of tier and fix-layout all together. 
So we can move forward, and this can be fixed in parallel. [1] : https://bugzilla.redhat.com/show_bug.cgi?id=1286208. Regards! Rafi KC > > Regards, > Joe ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
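Rafi's suggestion above — remember that a fix-layout completed successfully at least once, so the background pass isn't rerun on every tier restart — could be sketched as a completion marker checked at startup. The marker file and the fix-layout callable below are hypothetical stand-ins (a real implementation would more likely use a volume xattr or a glusterd store key):

```python
# Sketch of "run background fix-layout once": a successful run leaves a
# marker, and later tier restarts skip the pass.  Marker path and the
# run_fix_layout callable are illustrative, not anything gluster ships.

import os

MARKER = "fix-layout-done"   # hypothetical; gluster would likely persist
                             # this as an xattr or a glusterd store entry

def maybe_fix_layout(state_dir, run_fix_layout):
    """Run run_fix_layout() unless a previous run already completed.
    Returns True if fix-layout actually ran this time."""
    marker = os.path.join(state_dir, MARKER)
    if os.path.exists(marker):
        return False              # completed once before; skip on restart
    run_fix_layout()              # may raise; marker only written on success
    with open(marker, "w") as f:
        f.write("done\n")
    return True
```

Because the marker is written only after the pass succeeds, an aborted background fix-layout (as discussed above, the tier daemon need not be killed for it) is simply retried on the next restart.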
Re: [Gluster-devel] Need info : Gluster online upgrade hooks.
Thanks guys will check it out. - Original Message - From: "Atin Mukherjee" <amukh...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com>, "Aravinda" <avish...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org> Sent: Monday, February 22, 2016 12:18:12 PM Subject: Re: [Gluster-devel] Need info : Gluster online upgrade hooks. Look for glusterd_handle_upgrade_downgrade () in glusterd-utils.c which is the entry point of handling the upgrade. As of now I don't think we call any hook script in the upgrade path. ~Atin On 02/22/2016 12:11 PM, Joseph Fernandes wrote: > So Glusterd run the hooked up scripts (asynchronously I assume), Can you > point me to the glusterd file that I need to look at, to subscribe > this service. > > Regards, > Joe > > - Original Message - > From: "Aravinda" <avish...@redhat.com> > To: "Joseph Fernandes" <josfe...@redhat.com>, "Gluster Devel" > <gluster-devel@gluster.org> > Sent: Monday, February 22, 2016 12:01:43 PM > Subject: Re: [Gluster-devel] Need info : Gluster online upgrade hooks. > > glusterd will be run with "*.upgrade=on" argument after rpm > installation. I think you can plugin your upgrade scripts with that option. > > regards > Aravinda > > On 02/22/2016 11:48 AM, Joseph Fernandes wrote: >> Hi All, >> >> Need help in the gluster upgrade scenario, i.e information on upgrade >> scripts/hooks(might be in glusterd) >> >> Tiering requirement is CTR Xlator database schema will change for >> performance again http://review.gluster.org/#/c/13248/ >> >> So older version of gluster would need to migrate the database entries to >> the new database. I can provide with the script or code to do so, but need >> info on the upgrade hooks. 
>> >> Regards, >> Joe >> ___ >> Gluster-devel mailing list >> Gluster-devel@gluster.org >> http://www.gluster.org/mailman/listinfo/gluster-devel > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Need info : Gluster online upgrade hooks.
So glusterd runs the hooked-up scripts (asynchronously, I assume). Can you point me to the glusterd file I need to look at to subscribe to this service? Regards, Joe - Original Message - From: "Aravinda" <avish...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org> Sent: Monday, February 22, 2016 12:01:43 PM Subject: Re: [Gluster-devel] Need info : Gluster online upgrade hooks. glusterd will be run with "*.upgrade=on" argument after rpm installation. I think you can plug in your upgrade scripts with that option. regards Aravinda On 02/22/2016 11:48 AM, Joseph Fernandes wrote: > Hi All, > > Need help in the gluster upgrade scenario, i.e information on upgrade > scripts/hooks(might be in glusterd) > > Tiering requirement is CTR Xlator database schema will change for performance > again http://review.gluster.org/#/c/13248/ > > So older version of gluster would need to migrate the database entries to the > new database. I can provide with the script or code to do so, but need info > on the upgrade hooks. > > Regards, > Joe > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Need info : Gluster online upgrade hooks.
Hi All, I need help with the Gluster upgrade scenario, i.e. information on upgrade scripts/hooks (might be in glusterd). The tiering requirement is that the CTR xlator database schema will change again for performance: http://review.gluster.org/#/c/13248/ So older versions of Gluster would need to migrate the database entries to the new database. I can provide the script or code to do so, but need info on the upgrade hooks. Regards, Joe ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
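Since the concern in this thread is migrating CTR's database entries to a new schema, one common pattern an upgrade hook could use is to version-stamp the SQLite database (CTR stores its data via libgfdb's SQLite backend) and apply pending migrations exactly once. A sketch with made-up table and column names — the real libgfdb schema differs:

```python
# One-shot schema migration keyed on SQLite's PRAGMA user_version, the
# sort of step glusterd's upgrade path could invoke.  Table/column names
# are invented for illustration; the actual CTR (libgfdb) schema differs.

import sqlite3

MIGRATIONS = {
    # from_version -> SQL that brings the schema to from_version + 1
    0: "CREATE TABLE gf_file (gfid TEXT PRIMARY KEY, write_sec INTEGER)",
    1: "ALTER TABLE gf_file ADD COLUMN read_sec INTEGER DEFAULT 0",
}

def migrate(db_path):
    """Bring the database to the latest schema; returns the final version.
    Safe to call on every upgrade: already-applied steps are skipped."""
    conn = sqlite3.connect(db_path)
    try:
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        while version in MIGRATIONS:
            conn.execute(MIGRATIONS[version])
            version += 1
            conn.execute("PRAGMA user_version = %d" % version)
            conn.commit()
        return version
    finally:
        conn.close()
```

Because the version stamp lives in the database file itself, rerunning the hook after a failed or repeated upgrade is harmless, which matters when the same script may fire on every `glusterd --xlator-option *.upgrade=on` run.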
Re: [Gluster-devel] NSR: Suggestions for a new name
My vote for Consensus Based Replication (CBR) Also because HONDA CBR ;) http://www.honda2wheelersindia.com/cbr1000rr/ ~Joe - Original Message - From: "Avra Sengupta"To: gluster-devel@gluster.org Sent: Friday, February 12, 2016 2:05:41 PM Subject: Re: [Gluster-devel] NSR: Suggestions for a new name Well, we got quite a few suggestions. So I went ahead and created a doodle poll. Please find the link below for the poll, and vote for the name you think will be the best. http://doodle.com/poll/h7gfdhswrbsxxiaa Regards, Avra On 01/21/2016 12:21 PM, Avra Sengupta wrote: > On 01/21/2016 12:20 PM, Atin Mukherjee wrote: >> Etherpad link please? > Oops My Bad. Here it is > https://public.pad.fsfe.org/p/NSR_name_suggestions >> >> On 01/21/2016 12:19 PM, Avra Sengupta wrote: >>> Thanks for the suggestion Pranith. To make things interesting, we have >>> created an etherpad where people can put their suggestions. Somewhere >>> around mid of feb, we will look at all the suggestions we have got, >>> have >>> a community vote and zero in on one. The suggester of the winning name >>> gets a goody. >>> >>> Feel free to add more than one entry. >>> >>> Regards, >>> Avra >>> >>> On 01/21/2016 10:08 AM, Pranith Kumar Karampuri wrote: On 01/19/2016 08:00 PM, Avra Sengupta wrote: > Hi, > > The leader election based replication has been called NSR or "New > Style Replication" for a while now. We would like to have a new name > for the same that's less generic. It can be something like "Leader > Driven Replication" or something more specific that would make sense > a few years down the line too. > > We would love to hear more suggestions from the community. Thanks If I had a chance to name AFR (Automatic File Replication) I would have named it Automatic Data replication. Feel free to use it if you like it. 
Pranith > Regards, > Avra > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel >>> ___ >>> Gluster-devel mailing list >>> Gluster-devel@gluster.org >>> http://www.gluster.org/mailman/listinfo/gluster-devel > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] distributed files/directories and [cm]time updates
Hi Xavi, Answer inline: - Original Message - From: "Xavier Hernandez" <xhernan...@datalab.es> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org> Sent: Tuesday, January 26, 2016 2:09:43 PM Subject: Re: [Gluster-devel] distributed files/directories and [cm]time updates Hi Joseph, On 26/01/16 09:07, Joseph Fernandes wrote: > Answer inline: > > > - Original Message - > From: "Xavier Hernandez" <xhernan...@datalab.es> > To: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Gluster Devel" > <gluster-devel@gluster.org> > Sent: Tuesday, January 26, 2016 1:21:37 PM > Subject: Re: [Gluster-devel] distributed files/directories and [cm]time > updates > > Hi Pranith, > > On 26/01/16 03:47, Pranith Kumar Karampuri wrote: >> hi, >> Traditionally gluster has been using ctime/mtime of the >> files/dirs on the bricks as stat output. Problem we are seeing with this >> approach is that, software which depends on it gets confused when there >> are differences in these times. Tar especially gives "file changed as we >> read it" whenever it detects ctime differences when stat is served from >> different bricks. The way we have been trying to solve it is to serve >> the stat structures from same brick in afr, max-time in dht. But it >> doesn't avoid the problem completely. Because there is no way to change >> ctime at the moment(lutimes() only allows mtime, atime), there is little >> we can do to make sure ctimes match after self-heals/xattr >> updates/rebalance. I am wondering if anyone of you solved these problems >> before, if yes how did you go about doing it? It seems like applications >> which depend on this for backups get confused the same way. The only way >> out I see it is to bring ctime to an xattr, but that will need more iops >> and gluster has to keep updating it on quite a few fops. > > I did think about this when I was writing ec at the beginning. 
The idea > was that the point in time at which each fop is executed were controlled > by the client by adding an special xattr to each regular fop. Of course > this would require support inside the storage/posix xlator. At that > time, adding the needed support to other xlators seemed too complex for > me, so I decided to do something similar to afr. > > Anyway, the idea was like this: for example, when a write fop needs to > be sent, dht/afr/ec sets the current time in a special xattr, for > example 'glusterfs.time'. It can be done in a way that if the time is > already set by a higher xlator, it's not modified. This way DHT could > set the time in fops involving multiple afr subvolumes. For other fops, > would be afr who sets the time. It could also be set directly by the top > most xlator (fuse), but that time could be incorrect because lower > xlators could delay the fop execution and reorder it. This would need > more thinking. > > That xattr will be received by storage/posix. This xlator will determine > what times need to be modified and will change them. In the case of a > write, it can decide to modify mtime and, maybe, atime. For a mkdir or > create, it will set the times of the new file/directory and also the > mtime of the parent directory. It depends on the specific fop being > processed. > > mtime, atime and ctime (or even others) could be saved in a special > posix xattr instead of relying on the file system attributes that cannot > be modified (at least for ctime). > > This solution doesn't require extra fops, So it seems quite clean to me. > The additional I/O needed in posix could be minimized by implementing a > metadata cache in storage/posix that would read all metadata on lookup > and update it on disk only at regular intervals and/or on invalidation. > All fops would read/write into the cache. This would even reduce the > number of I/O we are currently doing for each fop. 
> JOE: the idea of a metadata cache is cool for read workloads, but for writes we would end up doing double writes to the disk, i.e. 1 for the actual write and 1 to update the xattr. IMHO we cannot have it in a write-back cache (periodic flush to disk), as ctime/mtime/atime data loss or inconsistency will be a problem. Your thoughts? If we want to have everything in physical storage at all times, gluster will be slow. We only need to be POSIX compliant, and POSIX allows some degree of "inconsistency" here, i.e. we are not forced to write to physical storage until the user application requests it.
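Xavi's write-back metadata cache proposal can be sketched roughly as follows. This is a hypothetical Python model, not GlusterFS code: the class, callback, and gfid names are invented for illustration; in the real storage/posix xlator the flush callback would persist the times as a posix xattr.

```python
import threading

class MetadataCache:
    """Write-back cache for per-file time metadata (hypothetical sketch).

    Models the suggestion from the thread: all fops read/update times in
    memory, and dirty entries are flushed to disk only at a regular
    interval and/or on invalidation, trading strict durability for fewer
    disk writes -- the POSIX-permitted "inconsistency" Xavi refers to.
    """

    def __init__(self, flush_cb):
        self._cache = {}           # gfid -> {"mtime": ..., "ctime": ..., "atime": ...}
        self._dirty = set()        # gfids with unflushed updates
        self._lock = threading.Lock()
        self._flush_cb = flush_cb  # persists one entry, e.g. via setxattr

    def update(self, gfid, **times):
        """Called on the write path: only touches memory."""
        with self._lock:
            self._cache.setdefault(gfid, {}).update(times)
            self._dirty.add(gfid)

    def lookup(self, gfid):
        """Called on the read path: served entirely from the cache."""
        with self._lock:
            return dict(self._cache.get(gfid, {}))

    def flush(self):
        """Called periodically (or on invalidation): one write per dirty file."""
        with self._lock:
            dirty, self._dirty = self._dirty, set()
            entries = [(g, dict(self._cache[g])) for g in dirty]
        for gfid, times in entries:
            self._flush_cb(gfid, times)

# Toy usage: a dict stands in for on-disk xattrs.
persisted = {}
cache = MetadataCache(flush_cb=lambda g, t: persisted.__setitem__(g, t))
cache.update("gfid-1", mtime=100.0)
cache.flush()
```

A crash loses at most one flush interval's worth of time updates, which is exactly the trade-off debated above.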
Re: [Gluster-devel] distributed files/directories and [cm]time updates
Answer inline: - Original Message - From: "Xavier Hernandez"To: "Pranith Kumar Karampuri" , "Gluster Devel" Sent: Tuesday, January 26, 2016 1:21:37 PM Subject: Re: [Gluster-devel] distributed files/directories and [cm]time updates Hi Pranith, On 26/01/16 03:47, Pranith Kumar Karampuri wrote: > hi, >Traditionally gluster has been using ctime/mtime of the > files/dirs on the bricks as stat output. Problem we are seeing with this > approach is that, software which depends on it gets confused when there > are differences in these times. Tar especially gives "file changed as we > read it" whenever it detects ctime differences when stat is served from > different bricks. The way we have been trying to solve it is to serve > the stat structures from same brick in afr, max-time in dht. But it > doesn't avoid the problem completely. Because there is no way to change > ctime at the moment(lutimes() only allows mtime, atime), there is little > we can do to make sure ctimes match after self-heals/xattr > updates/rebalance. I am wondering if anyone of you solved these problems > before, if yes how did you go about doing it? It seems like applications > which depend on this for backups get confused the same way. The only way > out I see it is to bring ctime to an xattr, but that will need more iops > and gluster has to keep updating it on quite a few fops. I did think about this when I was writing ec at the beginning. The idea was that the point in time at which each fop is executed were controlled by the client by adding an special xattr to each regular fop. Of course this would require support inside the storage/posix xlator. At that time, adding the needed support to other xlators seemed too complex for me, so I decided to do something similar to afr. Anyway, the idea was like this: for example, when a write fop needs to be sent, dht/afr/ec sets the current time in a special xattr, for example 'glusterfs.time'. 
It can be done in a way that if the time is already set by a higher xlator, it's not modified. This way DHT could set the time in fops involving multiple afr subvolumes. For other fops, would be afr who sets the time. It could also be set directly by the top most xlator (fuse), but that time could be incorrect because lower xlators could delay the fop execution and reorder it. This would need more thinking. That xattr will be received by storage/posix. This xlator will determine what times need to be modified and will change them. In the case of a write, it can decide to modify mtime and, maybe, atime. For a mkdir or create, it will set the times of the new file/directory and also the mtime of the parent directory. It depends on the specific fop being processed. mtime, atime and ctime (or even others) could be saved in a special posix xattr instead of relying on the file system attributes that cannot be modified (at least for ctime). This solution doesn't require extra fops, So it seems quite clean to me. The additional I/O needed in posix could be minimized by implementing a metadata cache in storage/posix that would read all metadata on lookup and update it on disk only at regular intervals and/or on invalidation. All fops would read/write into the cache. This would even reduce the number of I/O we are currently doing for each fop. > JOE: the idea of metadata cache is cool for read work loads, but for > writes we would end up doing double writes to the disk. i.e 1 for the actual write or 1 to update the setxattr. IMHO we cannot have it in a write back cache (periodic flush to disk) and ctime/mtime/atime data loss or inconsistency will be a problem. Your thoughts? Xavi ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
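The "set the time only if a higher xlator has not already set it" rule Xavi describes can be illustrated with a toy model. This is a hypothetical Python sketch: the dict stands in for the key/value data that travels with each fop, and 'glusterfs.time' is the example key from the mail, not an existing GlusterFS xattr.

```python
import time

TIME_KEY = "glusterfs.time"

def stamp_fop(xdata):
    """Attach the fop timestamp, but never override a higher xlator's value.

    DHT would stamp fops spanning several AFR subvolumes; otherwise AFR
    stamps them, and storage/posix consumes the value to set the on-disk
    times consistently across bricks.
    """
    xdata.setdefault(TIME_KEY, time.time())
    return xdata

# DHT stamps first; AFR's later attempt leaves the value untouched,
# so every brick applies the same point in time.
xd = stamp_fop({})    # DHT sets the time
t = xd[TIME_KEY]
stamp_fop(xd)         # AFR: no-op, time already present
```

The `setdefault` call is the whole trick: whichever xlator sees the fop first wins, so all subvolumes receive a single authoritative timestamp.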
Re: [Gluster-devel] distributed files/directories and [cm]time updates
Answers inline: - Original Message - From: "Joe Julian" <j...@julianfamily.org> To: gluster-devel@gluster.org Sent: Tuesday, January 26, 2016 1:45:36 PM Subject: Re: [Gluster-devel] distributed files/directories and [cm]time updates If the time is set on a file by the client, this increases the critical complexity to include the clients whereas before it was only critical to have the servers time synced, now the clients should be as well. >>>>> JOE: If the time on the file is set from client than it becomes difficult >>>>> in the compliance case (WORM-Retention) where we refer to the server time how long we retain a file. This feature is not yet in Gluster, but we are looking into it. Just spitballing here, but what if the time was converted at the posix layer as a difference between the current time and the file time and converted back somewhere in the client graph? Each server's file time would differ by the same amount to its current time [1] so it should be a consistent value between servers. [1] depending on drift, but if the admin can't manage clocks, there's not much gluster could or should do about that. On 01/26/2016 12:07 AM, Joseph Fernandes wrote: > Answer inline: > > > - Original Message - > From: "Xavier Hernandez" <xhernan...@datalab.es> > To: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Gluster Devel" > <gluster-devel@gluster.org> > Sent: Tuesday, January 26, 2016 1:21:37 PM > Subject: Re: [Gluster-devel] distributed files/directories and [cm]time > updates > > Hi Pranith, > > On 26/01/16 03:47, Pranith Kumar Karampuri wrote: >> hi, >> Traditionally gluster has been using ctime/mtime of the >> files/dirs on the bricks as stat output. Problem we are seeing with this >> approach is that, software which depends on it gets confused when there >> are differences in these times. Tar especially gives "file changed as we >> read it" whenever it detects ctime differences when stat is served from >> different bricks. 
The way we have been trying to solve it is to serve >> the stat structures from same brick in afr, max-time in dht. But it >> doesn't avoid the problem completely. Because there is no way to change >> ctime at the moment(lutimes() only allows mtime, atime), there is little >> we can do to make sure ctimes match after self-heals/xattr >> updates/rebalance. I am wondering if anyone of you solved these problems >> before, if yes how did you go about doing it? It seems like applications >> which depend on this for backups get confused the same way. The only way >> out I see it is to bring ctime to an xattr, but that will need more iops >> and gluster has to keep updating it on quite a few fops. > I did think about this when I was writing ec at the beginning. The idea > was that the point in time at which each fop is executed were controlled > by the client by adding an special xattr to each regular fop. Of course > this would require support inside the storage/posix xlator. At that > time, adding the needed support to other xlators seemed too complex for > me, so I decided to do something similar to afr. > > Anyway, the idea was like this: for example, when a write fop needs to > be sent, dht/afr/ec sets the current time in a special xattr, for > example 'glusterfs.time'. It can be done in a way that if the time is > already set by a higher xlator, it's not modified. This way DHT could > set the time in fops involving multiple afr subvolumes. For other fops, > would be afr who sets the time. It could also be set directly by the top > most xlator (fuse), but that time could be incorrect because lower > xlators could delay the fop execution and reorder it. This would need > more thinking. > > That xattr will be received by storage/posix. This xlator will determine > what times need to be modified and will change them. In the case of a > write, it can decide to modify mtime and, maybe, atime. 
For a mkdir or > create, it will set the times of the new file/directory and also the > mtime of the parent directory. It depends on the specific fop being > processed. > > mtime, atime and ctime (or even others) could be saved in a special > posix xattr instead of relying on the file system attributes that cannot > be modified (at least for ctime). > > This solution doesn't require extra fops, so it seems quite clean to me. > The additional I/O needed in posix could be minimized by implementing a > metadata cache in storage/posix that would read all metadata on lookup > and update it on disk only at regular intervals and/or on invalidation. > All fops would read/write into the cache. This would even reduce the > number of I/O we are currently doing for each fop.
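Joe Julian's spitballed delta conversion can be checked with a small model (hypothetical Python sketch; the function names are invented). The brick ships the *age* of a timestamp rather than the timestamp itself, so each server's absolute clock setting cancels out; as noted above, this only holds modulo clock drift.

```python
def encode_time(file_time, server_now):
    """On the brick (posix layer): ship the age of the timestamp,
    not the timestamp itself, so the server's absolute clock cancels."""
    return server_now - file_time

def decode_time(delta, client_now):
    """In the client graph: turn the age back into an absolute time
    using the client's own clock."""
    return client_now - delta

# Two bricks whose clocks disagree by 100s still yield the same answer,
# because each computes the delta against its own clock.
event = 1000.0                               # the fop, on brick A's clock
brick_a_now, brick_b_now = 2000.0, 2100.0    # 100s of clock skew
da = encode_time(event, brick_a_now)         # age as seen by brick A
db = encode_time(event + 100.0, brick_b_now) # same event on B's skewed clock
client_now = 5000.0
```

Both deltas come out to 1000s, so the client reconstructs the same time from either brick; drift between the moment of the fop and the moment of the stat is the residual error the footnote concedes.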
Re: [Gluster-devel] Tips and Tricks for Gluster Developer
- Original Message - From: "Jeff Darcy"To: "Richard Wareing" Cc: "Gluster Devel" Sent: Monday, January 25, 2016 7:27:20 PM Subject: Re: [Gluster-devel] Tips and Tricks for Gluster Developer Oh boy, here we go. ;) I second Richard's suggestion to use cscope or some equivalent. It's a good idea in general, but especially with a codebase as large and complex as Gluster's. I literally wouldn't be able to do my job without it. I also have a set of bash/zsh aliases that will regenerate the cscope database after any git action, so I rarely have to do it myself. JOE : Well cscope and vim is good enough but a good IDE (with its own search and cscope integrated) will also help. I have been using codelite (http://codelite.org/) for over 2 years now and it rocks! Another secondary tip is that in many cases anything you see in the code as "xyz_t" is actually "struct _xyz" so you can save a bit of time (in vim) with ":ta _xyz" instead of going through the meaningless typedef. Unfortunately we're not as consistent as we should be about this convention, but it mostly works. Some day I'll figure out the vim macro syntax enough to create a proper macro and binding for this shortcut. I should probably write a whole new blog post about gdb stuff. Here's one I wrote a while ago: http://pl.atyp.us/hekafs.org/index.php/2013/02/gdb-macros-for-glusterfs/ There's a lot more that could be done in this area. For example, adding loc_t or inode_t or fd_t would all be good exercises. On a more controversial note, I am opposed to the practice of doing "make install" on anything other than a transient VM/container. I've seen too many patches that were broken because they relied on "leftovers" in someone's source directory or elsewhere on the system from previous installs. On my test systems, I always build and install actual RPMs, to make sure new files are properly incorporated in to the configure/rpm system. 
One of these days I'll set it up so the test system even does a "git clone" (instead of rsync) from my real source tree to catch un-checked-in files as well. I'll probably think of more later, and will update here as I do.
Re: [Gluster-devel] Reverse brick order in tier volume- Why?
Suggestion: in the implementation it might be better to put a comment in the new arbiter code explaining why it is implemented that way. - Original Message - From: "Pranith Kumar Karampuri" <pkara...@redhat.com> To: "Dan Lambright" <dlamb...@redhat.com> Cc: "Ravishankar N" <ravishan...@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org>, "Joseph Fernandes" <josfe...@redhat.com>, "Nithya Balachandran" <nbala...@redhat.com>, "Mohammed Rafi K C" <rkavu...@redhat.com> Sent: Saturday, January 23, 2016 10:25:15 AM Subject: Re: [Gluster-devel] Reverse brick order in tier volume- Why? On 01/23/2016 10:02 AM, Dan Lambright wrote: > > - Original Message - >> From: "Pranith Kumar Karampuri" <pkara...@redhat.com> >> To: "Ravishankar N" <ravishan...@redhat.com>, "Gluster Devel" >> <gluster-devel@gluster.org>, "Dan Lambright" >> <dlamb...@redhat.com>, "Joseph Fernandes" <josfe...@redhat.com>, "Nithya >> Balachandran" <nbala...@redhat.com>, >> "Mohammed Rafi K C" <rkavu...@redhat.com> >> Sent: Friday, January 22, 2016 10:48:15 PM >> Subject: Re: [Gluster-devel] Reverse brick order in tier volume- Why? >> >> >> >> On 01/22/2016 03:48 PM, Ravishankar N wrote: >>> On 01/19/2016 06:44 PM, Ravishankar N wrote: >>>> 1) Is there a compelling reason as to why the bricks of hot-tier >>>> are in the reverse order ? >>>> 2) If there isn't one, should we spend time to fix it so that the >>>> bricks appear in the order in which they were given at the time of >>>> volume creation/ attach-tier *OR* just continue with the way things >>>> are currently because it is not that much of an issue? >>> Dan / Joseph - any pointers? > This order was an artifact of how the volume is created using legacy code and > data structures in glusterd-volgen.c. Two volume graphs are built (the hot > and the cold). The two graphs are built and combined in a single list. As far > as I know, nobody has run into trouble with this. Refactoring the code would > be fine to ease maintainability.
Cool, the reason we ask is that in arbiter volumes, the 3rd brick is going to be the arbiter. If the bricks are in reverse order, it will lead to confusion. We will change it with our implementation of attach-tier for replica+arbiter bricks. Pranith > > >> +Nitya, Rafi as well. >>> -Ravi
Re: [Gluster-devel] GlusterFS User and Group Quotas
Answer inline - Original Message - From: "Vijaikumar Mallikarjuna" To: "Gluster Devel" Sent: Tuesday, December 8, 2015 3:02:55 PM Subject: [Gluster-devel] GlusterFS User and Group Quotas Hi All, Below is the design for 'GlusterFS User and Group Quotas', please provide your feedback on the same. Developers: Vijaikumar.M and Manikandan.S Introduction: User and group quotas limit the amount of disk space for a specified user/group ID. This document provides some details about how the accounting (marker xlator) can be done for user and group quotas. Design: We have three different approaches, each with pros and cons. Approach-1) T1 - For each file/dir 'file_x', create a contribution extended attribute say 'trusted.glusterfs.quota.-contri' T2 - In a lookup/write operation read the actual size from the stat-buf, add the delta size to the contribution xattr T3 - Create a file .glusterfs/quota/users/. Update a size extended attribute say 'trusted.glusterfs.quota.size' by adding the delta size calculated in T2 Same for group quotas: a size xattr is updated under .glusterfs/quota/groups/. cons: If the brick crashes after executing T2 and before T3, the accounting information is incorrect. To recover and correct the accounting information, the entire file-system needs to be crawled to fix the trusted.glusterfs.quota.size value by summing up the contribution of all files with the UID. But this is a slow process.
Approach-2) T1 - For each file/dir 'file_x', create a contribution extended attribute say 'trusted.glusterfs.quota.-contri' T2 - create a directory '.glusterfs/quota/users/' create a hardlink for file file_x under this directories T3 - In a lookup/write operation, set dirty flag 'trusted.glusterfs.quota.dirty' for directory '.glusterfs/quota/users/' T4 - Read the actual size of a file from the stat-buf, add the delta size to the contribution xattr T5 - update size extended attribute say for directory '.glusterfs/quota/users/' T6 - unset the dirty flag Same for group quotas a size xattr is updated under .glusterfs/quota/groups/. Problem of approach 1 of crawling entire brick is solved by only crawling the directory which is set dirty. cons: Need to make sure that the hard-link for a file is consistent when having another hardlinks under .glusterfs/quota/users/ and .glusterfs/quota/groups/ Approach-3) T1 - For each file/dir 'file_x', update a contribution entry in the SQL-LITE DB (Create a DB file under .glusterfs/quota/ ) T2 - In a lookup/write operation read the actual size from the statbuf, add the update the size in the USER-QUOTA schema in the DB T3 - In a lookup/write operation, set dirty flag 'trusted.glusterfs.quota.dirty' for directory '.glusterfs/quota/users/' Atomicity problem found in approach 1 and 2 is solved by using DB transactions. Note: need to test the consistency of the SQL-LITE DB. We feel approach-3 is more simpler and efficient way of implementing user/group quotas. JOE: 6 Points if you are planning to use sqlite 1) Use Libgfdb to access (write/read) the DB as it gives you flexibility to change type of database/datastore 2) Place the updates in a SQL Transaction BEGIN update; update; END 3) As you are not looking at durability, but more concerned with Atomicity use a big sqlite cache so that performance, refer CTR/Libgfdb for settings 4) Use the default journaling mode in sqlite, as you are doing async writes and need good reading performance. 
5) Use a separate db file, and don't use the tiering db file as it is configured for write performance. 6) Use "INSERT IF NOT EXISTS ELSE UPDATE" in the write path for crash protection as you will be using a huge sqlite cache; but beware: though this is convenient, it's not a performant way to achieve eventual consistency. A CTR lookup-heal kind of approach will be good. If you are planning on a POC I can help :) Thanks, Vijay
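Points 2 and 6 of the advice above can be sketched directly against SQLite. This is a hypothetical Python sketch, not the proposed marker code: the `user_quota` schema and the in-memory db path are invented, Libgfdb is not involved, and the `ON CONFLICT` upsert syntax requires SQLite >= 3.24.

```python
import sqlite3

# Stand-in for a db file under .glusterfs/quota/ (point 5: separate file).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_quota (uid INTEGER PRIMARY KEY, size INTEGER)")

def account_write(conn, uid, delta):
    """Apply a size delta for one UID atomically.

    The transaction makes the update all-or-nothing (point 2), avoiding
    the approach-1/2 problem of crashing between T2 and T3; the upsert
    creates the row on first use instead of failing (point 6).
    """
    with conn:  # BEGIN ... COMMIT; rolls back automatically on exception
        conn.execute(
            "INSERT INTO user_quota (uid, size) VALUES (?, ?) "
            "ON CONFLICT(uid) DO UPDATE SET size = size + excluded.size",
            (uid, delta),
        )

# Two 4 KiB writes by the same user accumulate into one accounting row.
account_write(conn, 1001, 4096)
account_write(conn, 1001, 4096)
(size,) = conn.execute(
    "SELECT size FROM user_quota WHERE uid = 1001").fetchone()
```

Either the row update lands in full or not at all, which is exactly the atomicity that the xattr-based approaches 1 and 2 cannot guarantee across a crash.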
Re: [Gluster-devel] DHT2 Status update
Hi Shyam, Could you please share the presentation Regards, Joe - Original Message - From: "Shyam" To: "Gluster Devel" , "Venky Shankar" , "Kotresh Hiremath Ravishankar" Sent: Wednesday, November 4, 2015 7:57:28 AM Subject: [Gluster-devel] DHT2 Status update Hi, Coming Thursday, i.e Nov-05-2015, @ 6:30 AM Eastern we are having a short call to present DHT2 status (ics attached) To join the meeting on a computer or mobile phone: https://bluejeans.com/371209168?src=calendarLink=onzgc3thmfxgcqdsmvsgqylufzrw63i= Agenda: - Current status update - Quick demo of some FOPs working (lookup, create) - A quick tour on how things look at the backend - Targets for next milestone We will record the meeting, so in case you are interested but are unable to make it, a recording will be posted. We will also post a mail update on the status, so finer details should be present in text form anyway. Thanks, Shyam
Re: [Gluster-devel] FOP ratelimit?
Something ceph is working on "dmclock" http://tracker.ceph.com/projects/ceph/wiki/Rados_qos might be we can talk to them ? ~Joe - Original Message - From: "Jeff Darcy" <jda...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org>, "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Venky Shankar" <vshan...@redhat.com>, "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Shyamsundar Ranganathan" <srang...@redhat.com> Sent: Thursday, September 10, 2015 6:57:51 PM Subject: Re: [Gluster-devel] FOP ratelimit? > Have we given thought about other IO scheduling algorithms like mclock > algorithm [1], used by vmware for their QOS solution. > Plus another point to keep in mind here is the distributed nature of the > solution. Its easier to think of a brick > controlling the throughput for a client or a tenant. But how would this work > in collaboration and scale with all the > bricks together, what I am talking about is Distributed QOS. At the packet level, this is a core problem that SDN has to solve. When we're running in an SDN environment, we should just hand off responsibility for QoS to them. Otherwise, we should probably steal their algorithms. ;) I believe there are some experts elsewhere at Red Hat whose brains we can and should pick.
Re: [Gluster-devel] FOP ratelimit?
Hi Guys, Have we given thought about other IO scheduling algorithms like mclock algorithm [1], used by vmware for their QOS solution. Plus another point to keep in mind here is the distributed nature of the solution. Its easier to think of a brick controlling the throughput for a client or a tenant. But how would this work in collaboration and scale with all the bricks together, what I am talking about is Distributed QOS. Regards, Joe [1] http://www.gluster.org/community/documentation/index.php/File:Qos.odp - Original Message - From: "Venky Shankar"To: "Raghavendra Gowdappa" Cc: "Gluster Devel" Sent: Thursday, September 10, 2015 12:16:41 PM Subject: Re: [Gluster-devel] FOP ratelimit? On Thu, Sep 3, 2015 at 11:36 AM, Raghavendra Gowdappa wrote: > > > - Original Message - >> From: "Emmanuel Dreyfus" >> To: "Raghavendra Gowdappa" , "Pranith Kumar Karampuri" >> >> Cc: gluster-devel@gluster.org >> Sent: Wednesday, September 2, 2015 8:12:37 PM >> Subject: Re: [Gluster-devel] FOP ratelimit? >> >> Raghavendra Gowdappa wrote: >> >> > Its helpful if you can give some pointers on what parameters (like >> > latency, throughput etc) you want us to consider for QoS. >> >> Full blown QoS would be nice, but a first line of defense against >> resource hogs seems just badly required. >> >> A bare minimum could be to process client's FOP in a round robin >> fashion. That way even if one client sends a lot of FOPs, there is >> always some window for others to slip in. >> >> Any opinion? > > As of now we depend on epoll/poll events informing servers about incoming > messages. All sockets are put in the same event-pool represented by a single > poll-control fd. So, the order of our processing of msgs from various clients > really depends on how epoll/poll picks events across multiple sockets. Do > poll/epoll have any sort of scheduling? or is it random? Any pointers on this > are appreciated. I haven't come across any kind of scheduling for picking events for sockets. 
Routers use synthetic throttling for traffic shaping. The most commonly used technique is TBF (token bucket filter), which "induces" latency for outbound traffic. Lustre had some work[1] done for QoS along the lines of TBF. HTH. [1]: http://cdn.opensfs.org/wp-content/uploads/2014/10/7-DDN_LiXi_lustre_QoS.pdf > >> >> -- >> Emmanuel Dreyfus >> http://hcpnet.free.fr/pubz >> m...@netbsd.org
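The TBF idea Venky points to can be sketched per client. This is a hypothetical Python model (class and parameter names invented); in a brick it would sit in the request-processing path, with one bucket per client or tenant.

```python
import time

class TokenBucket:
    """Token bucket filter (TBF) as used in traffic shaping.

    Each client earns `rate` fop-tokens per second up to a burst
    ceiling of `capacity`; a fop that finds no token is held back,
    which throttles resource hogs while idle clients keep headroom
    for bursts -- a first line of defense short of full QoS.
    """

    def __init__(self, rate, capacity):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.tokens = float(capacity)    # start with a full bucket
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def admit(self):
        """True if a fop may proceed now, else False (caller queues it)."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# A client limited to 100 fops/s with a burst allowance of 10: the first
# 10 back-to-back fops drain the bucket, then admissions trickle in at `rate`.
tb = TokenBucket(rate=100, capacity=10)
admitted = sum(tb.admit() for _ in range(20))
```

Round-robin across clients (as Emmanuel suggests earlier in the thread) and TBF compose naturally: the scheduler picks the next client fairly, and the bucket decides whether that client may spend a token yet.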
Re: [Gluster-devel] Snapshot test Spurious Failure
Thanks Avra! will do the appropriate change. - Original Message - From: "Avra Sengupta" <aseng...@redhat.com> To: "Joseph Fernandes" <josfe...@redhat.com> Cc: "Gluster Devel" <gluster-devel@gluster.org> Sent: Friday, September 4, 2015 5:45:19 PM Subject: Re: Snapshot test Spurious Failure It's not a spurious failure. In the patch http://review.gluster.org/#/c/12031/3, you are using list_for_each_entry in clear_bricklist(), and deleting an item from the list. That is not a right practice. Instead you should use list_for_each_entry_safe. Regards, Avra On 09/04/2015 11:50 AM, Avra Sengupta wrote: > Hi, > > I am having a look at the core. Will update shortly. > > Regards, > Avra > > On 09/04/2015 11:46 AM, Joseph Fernandes wrote: >> >> ./tests/bugs/snapshot/bug-1227646.t >> https://build.gluster.org/job/rackspace-regression-2GB-triggered/14021/consoleFull
[Gluster-devel] Snapshot test Spurious Failure
./tests/bugs/snapshot/bug-1227646.t https://build.gluster.org/job/rackspace-regression-2GB-triggered/14021/consoleFull
Re: [Gluster-devel] spurious failure with test-case ./tests/basic/tier/tier.t
Will have a look - Original Message - From: Krutika Dhananjay kdhan...@redhat.com To: Dan Lambright dlamb...@redhat.com Cc: gluster-devel@gluster.org Sent: Thursday, July 23, 2015 10:59:16 AM Subject: Re: [Gluster-devel] spurious failure withtest-case ./tests/basic/tier/tier.t This test failed twice on my patch even after retrigger: https://build.gluster.org/job/rackspace-regression-2GB-triggered/12726/consoleFull https://build.gluster.org/job/rackspace-regression-2GB-triggered/12739/consoleFull Is this a new issue or the one that was originally reported? -Krutika From: Dan Lambright dlamb...@redhat.com To: Raghavendra Bhat rab...@redhat.com Cc: gluster-devel@gluster.org Sent: Friday, June 26, 2015 6:07:15 PM Subject: Re: [Gluster-devel] spurious failure with test-case ./tests/basic/tier/tier.t - Original Message - From: Raghavendra Bhat rab...@redhat.com To: gluster-devel@gluster.org Sent: Friday, June 26, 2015 6:37:37 AM Subject: Re: [Gluster-devel] spurious failure with test-case ./tests/basic/tier/tier.t On 06/26/2015 04:00 PM, Ravishankar N wrote: On 06/26/2015 03:57 PM, Vijaikumar M wrote: Hi Upstream regression failure with test-case ./tests/basic/tier/tier.t My patch# 11315 regression failed twice with test-case./tests/basic/tier/tier.t. Anyone seeing this issue with other patches? Yes, one of my patches failed today too: http://build.gluster.org/job/rackspace-regression-2GB-triggered/11461/consoleFull Will take a look. Thanks. -Ravi Even I had faced failure in tier.t couple of times. 
Regards, Raghavendra Bhat http://build.gluster.org/job/rackspace-regression-2GB-triggered/11396/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/11456/consoleFull Thanks, Vijay
[Gluster-devel] 3.7 spurious failures
Hi All,

These are some of the recently hit spurious failures on the 3.7 branch:

http://build.gluster.org/job/rackspace-regression-2GB-triggered/12356/consoleFull
./tests/bugs/snapshot/bug-1109889.t is blocking the merge of http://review.gluster.org/11649

http://build.gluster.org/job/rackspace-regression-2GB-triggered/12357/consoleFull
./tests/bugs/fuse/bug-1126048.t is blocking the merge of http://review.gluster.org/11608

NetBSD:
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/8257/consoleFull
./tests/basic/quota-nfs.t is blocking the merge of http://review.gluster.org/11649

Appropriate owners, please take a look.

Thanks and Regards,
Joe
[Gluster-devel] Spurious failure in 3.7.2: ./tests/bugs/quota/afr-quota-xattr-mdata-heal.t
http://build.gluster.org/job/rackspace-regression-2GB-triggered/12204/consoleFull
Re: [Gluster-devel] Failure in tests/basic/tier/bug-1214222-directories_miising_after_attach_tier.t
Fernandes josfe...@redhat.com
Sent: Thursday, July 2, 2015 6:16:44 PM
Subject: Re: [Gluster-devel] Failure in tests/basic/tier/bug-1214222-directories_miising_after_attach_tier.t

Thanks Dan!
Pranith

On 07/02/2015 06:14 PM, Dan Lambright wrote:
I'll check on this.

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Gluster Devel gluster-devel@gluster.org, Joseph Fernandes josfe...@redhat.com
Sent: Thursday, July 2, 2015 5:40:34 AM
Subject: [Gluster-devel] Failure in tests/basic/tier/bug-1214222-directories_miising_after_attach_tier.t

hi Joseph,
Could you take a look at http://build.gluster.org/job/rackspace-regression-2GB-triggered/11842/consoleFull
Pranith

---BeginMessage---
Hi All,

TESTs 4-5 are failing, i.e. the following:

TEST $CLI volume start $V0
TEST $CLI volume attach-tier $V0 replica 2 $H0:$B0/${V0}$CACHE_BRICK_FIRST $H0:$B0/${V0}$CACHE_BRICK_LAST

The glusterd logs say:

[2015-07-01 07:33:25.053412] I [rpc-clnt.c:965:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2015-07-01 07:33:25.053851] [run.c:190:runner_log] (--> /build/install/lib/libglusterfs.so.0(_gf_log_callingfn+0x240)[0x7fe8349bfb82] (--> /build/install/lib/libglusterfs.so.0(runner_log+0x192)[0x7fe834a29426] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_volume_start_glusterfs+0xae7)[0x7fe829e475d7] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_brick_start+0x151)[0x7fe829e514e3] (--> /build/install/lib/glusterfs/3.8dev/xlator/mgmt/glusterd.so(glusterd_start_volume+0xba)[0x7fe829ebd534] ) 0-: Starting GlusterFS: /build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy.slave26.cloud.gluster.org.d-backends-patchy3 -p /var/lib/glusterd/vols/patchy/run/slave26.cloud.gluster.org-d-backends-patchy3.pid -S /var/run/gluster/e511d04af0bd91bfc3b030969b789d95.socket --brick-name /d/backends/patchy3 -l /var/log/glusterfs/bricks/d-backends-patchy3.log --xlator-option *-posix.glusterd-uuid=aff38c34-7744-4cc0-9aa4-a9fab5a71b2f --brick-port 49172 --xlator-option patchy-server.listen-port=49172
[2015-07-01 07:33:25.070284] I [MSGID: 106144] [glusterd-pmap.c:269:pmap_registry_remove] 0-pmap: removing brick (null) on port 49172
[2015-07-01 07:33:25.071022] E [MSGID: 106005] [glusterd-utils.c:4448:glusterd_brick_start] 0-management: Unable to start brick slave26.cloud.gluster.org:/d/backends/patchy3
[2015-07-01 07:33:25.071053] E [MSGID: 106123] [glusterd-syncop.c:1416:gd_commit_op_phase] 0-management: Commit of operation 'Volume Start' failed on localhost

The volume is 2x2:

LAST_BRICK=3
TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{0..$LAST_BRICK}

On inspection, the first 3 bricks are fine, but the 4th brick's log shows:

[2015-07-01 07:33:25.056463] I [MSGID: 100030] [glusterfsd.c:2296:main] 0-/build/install/sbin/glusterfsd: Started running /build/install/sbin/glusterfsd version 3.8dev (args: /build/install/sbin/glusterfsd -s slave26.cloud.gluster.org --volfile-id patchy.slave26.cloud.gluster.org.d-backends-patchy3 -p /var/lib/glusterd/vols/patchy/run/slave26.cloud.gluster.org-d-backends-patchy3.pid -S /var/run/gluster/e511d04af0bd91bfc3b030969b789d95.socket --brick-name /d/backends/patchy3 -l /var/log/glusterfs/bricks/d-backends-patchy3.log --xlator-option *-posix.glusterd-uuid=aff38c34-7744-4cc0-9aa4-a9fab5a71b2f --brick-port 49172 --xlator-option patchy-server.listen-port=49172)
[2015-07-01 07:33:25.064879] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-07-01 07:33:25.068992] I [MSGID: 101173] [graph.c:268:gf_add_cmdline_options] 0-patchy-server: adding option 'listen-port' for volume 'patchy-server' with value '49172'
[2015-07-01 07:33:25.069034] I [MSGID: 101173] [graph.c:268:gf_add_cmdline_options] 0-patchy-posix: adding option 'glusterd-uuid' for volume 'patchy-posix' with value 'aff38c34-7744-4cc0-9aa4-a9fab5a71b2f'
[2015-07-01 07:33:25.069313] I [MSGID: 115034] [server.c:392:_check_for_auth_option] 0-/d/backends/patchy3: skip format check for non-addr auth option auth.login./d/backends/patchy3.allow
[2015-07-01 07:33:25.069316] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-07-01 07:33:25.069330] I [MSGID: 115034] [server.c:392:_check_for_auth_option] 0-/d/backends/patchy3: skip format check for non-addr auth option auth.login.18b50c0d-38fb-4b49-bb5e-b203f4217223.password
[2015-07-01 07:33:25.069580] I [rpcsvc.c:2210:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2015-07-01 07:33:25.069647] W [MSGID: 101002] [options.c:952:xl_opt_validate] 0-patchy-server: option 'listen-port' is deprecated
Re: [Gluster-devel] spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t
Yep, will have a look.

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, July 1, 2015 1:44:44 PM
Subject: spurious failures tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t

hi,
http://build.gluster.org/job/rackspace-regression-2GB-triggered/11757/consoleFull has the logs. Could you please look into it?
Pranith
Re: [Gluster-devel] Gluster and GCC 5.1
new patch sent: http://review.gluster.org/#/c/11214/

- Original Message -
From: Niels de Vos nde...@redhat.com
To: Prashanth Pai p...@redhat.com
Cc: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Tuesday, June 30, 2015 2:00:15 AM
Subject: Re: [Gluster-devel] Gluster and GCC 5.1

On Mon, Jun 29, 2015 at 01:47:11PM -0400, Prashanth Pai wrote:
> Ah, I thought it was just me who was running into this.
> http://review.gluster.org/11214

Please file a decent bug with the behaviour and error messages. We need this documented for our users, who will likely hit the same problem at some point.
Thanks, Niels

> Regards,
> -Prashanth Pai

- Original Message -
From: Joseph Fernandes josfe...@redhat.com
To: Gluster Devel gluster-devel@gluster.org
Sent: Monday, June 29, 2015 5:54:28 PM
Subject: [Gluster-devel] Gluster and GCC 5.1

Hi All,

I recently installed Fedora 22 on some fresh VMs; it comes with gcc 5.1.1-1 (upgradable to 5.1.1-4). I observed that plain inline functions end up as undefined symbols in the .so files. As a result I had trouble starting a volume, first with gf_sql_str2sync_t:

[2015-06-29 05:52:38.491378] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-06-29 05:52:38.499205] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 0-xlator: /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t
[2015-06-29 05:52:38.499229] E [MSGID: 101002] [graph.y:211:volume_type] 0-parser: Volume 'test-changetimerecorder', line 16: type 'features/changetimerecorder' is not valid or not found on this machine
[2015-06-29 05:52:38.499262] E [MSGID: 101019] [graph.y:319:volume_end] 0-parser: type not specified for volume test-changetimerecorder
[2015-06-29 05:52:38.499335] E [MSGID: 100026] [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the graph
[2015-06-29 05:52:38.499470] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: received signum (0), shutting down

When gf_sql_str2sync_t was made static inline, the next issue was with changelog_dispatch_vec:

[2015-06-29 07:11:33.367259] I [MSGID: 101190] [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-06-29 07:11:33.368816] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 0-xlator: /usr/local/lib/glusterfs/3.8dev/xlator/features/changelog.so: undefined symbol: changelog_dispatch_vec
[2015-06-29 07:11:33.368829] E [MSGID: 101002] [graph.y:211:volume_type] 0-parser: Volume 'test-changelog', line 32: type 'features/changelog' is not valid or not found on this machine
[2015-06-29 07:11:33.368843] E [MSGID: 101019] [graph.y:319:volume_end] 0-parser: type not specified for volume test-changelog
[2015-06-29 07:11:33.368922] E [MSGID: 100026] [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the graph
[2015-06-29 07:11:33.369025] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: received signum (0), shutting down

and so on. It looks like inline functions should be marked as static inline or extern inline explicitly; please refer to https://gcc.gnu.org/gcc-5/porting_to.html

To recreate the issue without glusterfs, try this sample program on Fedora 22 with gcc 5.1.1 or higher:

hello.c
===
#include <stdio.h>

inline void foo () { printf ("hello world"); }

int main () { foo (); return 0; }

# gcc hello.c
/tmp/ccUQ1XPp.o: In function `main':
hello.c:(.text+0xa): undefined reference to `foo'
collect2: error: ld returned 1 exit status
#

Should we change all the inline functions to static inline or extern inline in gluster, appropriate to their scope of use (IMHO that would be the right thing to do)? Or should we use a compiler flag to suppress this?

Regards,
Joe
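The linker error described in this thread is easy to reproduce outside Gluster. The following is a minimal sketch of the failing case and of the static inline fix; it forces -std=gnu11 so the C99/C11 inline semantics that GCC 5 made the default apply regardless of the compiler version at hand:

```shell
# Under -std=gnu11 (the GCC 5 default), a plain 'inline' definition does
# not emit an out-of-line symbol; a non-inlined call to it then fails at
# link time with "undefined reference".
cat > hello.c <<'EOF'
#include <stdio.h>
inline void foo (void) { printf ("hello world\n"); }
int main (void) { foo (); return 0; }
EOF
gcc -std=gnu11 hello.c -o hello 2>&1 | grep undefined

# 'static inline' gives the function internal linkage, so the compiler
# emits a local definition whenever a call is not inlined.
cat > hello_fixed.c <<'EOF'
#include <stdio.h>
static inline void foo (void) { printf ("hello world\n"); }
int main (void) { foo (); return 0; }
EOF
gcc -std=gnu11 hello_fixed.c -o hello_fixed && ./hello_fixed
```

As the porting guide notes, extern inline (with a single out-of-line definition kept in one .c file) is the alternative when the function must keep external linkage.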
Re: [Gluster-devel] Introducing Heketi: Storage Management Framework with Plugins for GlusterFS volumes
Agreed, we need not depend on ONE technology for the above. But LVM is a strong contender as a single stable underlying technology that provides the following. We can make it plugin based :) so that people who have LVM and are happy with it can use it, while other technology plugins are developed in parallel; but let's have a single API standard defined for all.
~Joe

- Original Message -
From: Jeff Darcy jda...@redhat.com
To: Joseph Fernandes josfe...@redhat.com
Cc: Luis Pabon lpa...@redhat.com, Gluster Devel gluster-devel@gluster.org, John Spray jsp...@redhat.com
Sent: Thursday, June 18, 2015 5:15:37 PM
Subject: Re: Introducing Heketi: Storage Management Framework with Plugins for GlusterFS volumes

> LVM or Volume Manager Dependencies:
> 1) SNAPSHOTS: Gluster snapshots are LVM based.

The current implementation is LVM-centric, which is one reason uptake has been so low. The intent was always to make it more generic, so that other mechanisms could be used as well.

> 2) PROVISIONING and ENFORCEMENT: As of today Gluster does not have any control on the size of the brick. It will consume the brick (xfs) mount point given to it without checking how much it needs to consume. LVM (or any other volume manager) will be required to do space provisioning per brick and enforce limits on brick size.

Some file systems have quota, or we can enforce our own.

> 3) STORAGE SEGREGATION: LVM pools can be used for storage segregation, i.e. having primary storage pools and secondary (for Gluster replica) pools, so that we can carve out proper space from the physical disks attached to each node. At a high level (i.e. for Heketi's user) disk space can be viewed as storage pools (by aggregating disk space per pool per node using glusterd). To start with we can have a primary pool and a secondary pool (for Gluster replica), where each file-serving node in the cluster participates in these pools via the local LVM pools.

This functionality in no way depends on LVM. In many cases, mere subdirectories are sufficient.

> 4) DATA PROTECTION: Further data protection using LVM RAID. Pools can be marked to have RAID support on them, courtesy of LVM RAID.
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/raid_volumes.html

Given that we already have replication, erasure coding, etc., many users would prefer not to reduce storage utilization even further with RAID. Others would prefer to get the same functionality without LVM, e.g. with ZFS. That's why RAID has always been, and should remain, optional. It's fine that we *can* use LVM features when and where they're available. Building in *dependencies* on it has been a mistake every time, and repeating a mistake doesn't make it anything else.
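For concreteness, the per-brick provisioning and enforcement discussed in point 2 maps to something like the following with thin-provisioned LVs. All device, VG, and path names below are invented for illustration and this is only a sketch of the idea, not anything Heketi actually runs; the script is written to a file and syntax-checked rather than executed, since it needs root and a spare disk:

```shell
# Hypothetical sketch: carve a size-limited, thin-provisioned brick out of
# a dedicated VG, so the brick cannot grow past its LV size.
cat > provision_brick.sh <<'EOF'
#!/bin/sh -e
pvcreate /dev/sdb                                   # physical disk for the pool
vgcreate gluster_vg /dev/sdb
lvcreate -L 100G -T gluster_vg/brick_pool           # 100G thin pool
lvcreate -V 10G -T gluster_vg/brick_pool -n brick1  # 10G virtual brick LV
mkfs.xfs -i size=512 /dev/gluster_vg/brick1
mkdir -p /bricks/brick1
mount /dev/gluster_vg/brick1 /bricks/brick1         # hand /bricks/brick1 to gluster
EOF
sh -n provision_brick.sh && echo "syntax ok"
```

The enforcement comes from the 10G virtual size of the LV: the brick's filesystem simply runs out of space at that point, regardless of how much Gluster tries to write.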
Re: [Gluster-devel] Introducing Heketi: Storage Management Framework with Plugins for GlusterFS volumes
LVM or Volume Manager Dependencies:

1) SNAPSHOTS: Gluster snapshots are LVM based.

2) PROVISIONING and ENFORCEMENT: As of today Gluster does not have any control on the size of the brick. It will consume the brick (xfs) mount point given to it without checking how much it needs to consume. LVM (or any other volume manager) will be required to do space provisioning per brick and enforce limits on brick size.

3) STORAGE SEGREGATION: LVM pools can be used for storage segregation, i.e. having primary storage pools and secondary (for Gluster replica) pools, so that we can carve out proper space from the physical disks attached to each node. At a high level (i.e. for Heketi's user) disk space can be viewed as storage pools (by aggregating disk space per pool per node using glusterd). To start with we can have a primary pool and a secondary pool (for Gluster replica), where each file-serving node in the cluster participates in these pools via the local LVM pools.

4) DATA PROTECTION: Further data protection using LVM RAID. Pools can be marked to have RAID support on them, courtesy of LVM RAID.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/raid_volumes.html

~Joe

- Original Message -
From: Luis Pabon lpa...@redhat.com
To: Jeff Darcy jda...@redhat.com
Cc: John Spray john.sp...@redhat.com, storage-...@redhat.com
Sent: Thursday, June 18, 2015 8:05:43 AM
Subject: Re: Introducing Heketi: Storage Management Framework with Plugins for GlusterFS volumes

Currently, Heketi is set up to provide volumes to Manila, which requires them to have snapshot capabilities. AFAIK, the only way to do that is to use LVM-backed bricks in GlusterFS. Heketi itself does not depend on LVM to create bricks and can be enhanced to support LVM-less nodes.

- Luis

- Original Message -
From: Jeff Darcy jda...@redhat.com
To: Luis Pabon lpa...@redhat.com
Cc: John Spray john.sp...@redhat.com, storage-...@redhat.com
Sent: Wednesday, June 17, 2015 7:13:38 PM
Subject: Re: Introducing Heketi: Storage Management Framework with Plugins for GlusterFS volumes

> [LP] In Manila they are called 'shares', but for simplicity, I called them 'volumes' as a generic mountable network file system (NFS, CIFS, GlusterFS, whatever). The plugin for glusterfs does create fresh bricks for each volume, but that is how glusterfs works. The plugin creates these bricks on top of thinly provisioned LVs, which the plugin manages.

Does this imply a dependency on LVM?

---
Note: This list is intended for discussions relating to Red Hat Storage products, customers and/or support. Discussions on GlusterFS and Ceph architecture, design and engineering should go to relevant upstream mailing lists.
[Gluster-devel] Patches to merge and rerun regression
Hi Vijay,

Could you please merge/rerun regression on these tiering patches from Rafi?

Patches to be merged in master:
===
http://review.gluster.org/10933 - regression previously passed on Linux and NetBSD; just resolved a merge conflict.
http://review.gluster.org/10757 - regression previously passed on Linux and NetBSD; the latest revision was only a message fix, so IMHO it should be good to merge without another regression run.

Patches to get a re-run of regression on Linux:
==
http://review.gluster.org/#/c/11068/

Patches to be merged in 3.7:
===
http://review.gluster.org/#/c/11174/
http://review.gluster.org/#/c/11173/

Regards,
Joe
Re: [Gluster-devel] failures in tests/basic/tier/tier.t
Sure, looking into it.

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Dan Lambright dlamb...@redhat.com, Joseph Fernandes josfe...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Thursday, May 7, 2015 3:24:29 PM
Subject: failures in tests/basic/tier/tier.t

Dan/Joseph,
Could you look into it please?

[22:04:31] ./tests/basic/tier/tier.t ..
not ok 25 Got 1 instead of 0
not ok 26 Got 1 instead of 0
Failed 2/34 subtests
[22:04:31]

Test Summary Report
---
./tests/basic/tier/tier.t (Wstat: 0 Tests: 34 Failed: 2)
  Failed tests: 25-26
Files=1, Tests=34, 72 wallclock secs ( 0.02 usr 0.00 sys + 1.68 cusr 0.81 csys = 2.51 CPU)
Result: FAIL
./tests/basic/tier/tier.t: bad status 1
./tests/basic/tier/tier.t: 1 new core files

http://build.gluster.org/job/rackspace-regression-2GB-triggered/8588/consoleFull

Pranith
Re: [Gluster-devel] core while running tests/bugs/snapshot/bug-1112559.t
CCing Venky and Kotresh.

- Original Message -
From: Jeff Darcy jda...@redhat.com
To: Pranith Kumar Karampuri pkara...@redhat.com
Cc: Joseph Fernandes josfe...@redhat.com, Avra Sengupta aseng...@redhat.com, Rajesh Joseph rjos...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Wednesday, May 6, 2015 8:39:23 AM
Subject: Re: [Gluster-devel] core while running tests/bugs/snapshot/bug-1112559.t

> Could you please look at this issue:
> http://build.gluster.org/job/rackspace-regression-2GB-triggered/8456/consoleFull

I looked at this one for a while. It looks like a brick failed to start because changelog failed to initialize, but neither the core nor the logs shed much light on why.
Re: [Gluster-devel] Upcall state + Data Tiering
Adding more to Dan's reply: in tiering, we lose the heat of the file (collected on the source brick) when the file gets migrated by the DHT rebalancer within the tier. We would like to leverage the common solution infrastructure for passing this extra metadata to the destination.

- Original Message -
From: Dan Lambright dlamb...@redhat.com
To: Niels de Vos nde...@redhat.com
Cc: Joseph Fernandes josfe...@redhat.com, gluster Devel gluster-devel@gluster.org, Soumya Koduri skod...@redhat.com
Sent: Monday, April 20, 2015 4:01:12 AM
Subject: Re: [Gluster-devel] Upcall state + Data Tiering

- Original Message -
From: Niels de Vos nde...@redhat.com
To: Dan Lambright dlamb...@redhat.com, Joseph Fernandes josfe...@redhat.com
Cc: gluster Devel gluster-devel@gluster.org, Soumya Koduri skod...@redhat.com
Sent: Sunday, April 19, 2015 9:01:56 AM
Subject: Re: [Gluster-devel] Upcall state + Data Tiering

On Thu, Apr 16, 2015 at 04:58:29PM +0530, Soumya Koduri wrote:

Hi Dan/Joseph,

As part of upcall support on the server side, we maintain certain state to notify clients of cache-invalidation and recall-leaselk events. We have certain known limitations with rebalance and self-heal; details are at the link below:
http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Limitations

In the case of cache-invalidation, upcall state is not migrated, and once rebalance finishes the file is deleted, so we may falsely notify the client that the file is deleted when in reality it isn't. In the case of lease-locks, as with posix locks, we do not migrate them, and will end up recalling the lease-lock. Rebalance is an admin-driven job, but that is not the case with data tiering. We would like to know when files are moved from the hot to the cold tier or vice versa, or rather when a file is considered to have migrated from the cold to the hot tier, since that is where we see potential issues. Is it the first fop which triggers it? And where are further fops processed: on the hot tier or the cold tier?

Data tiering's basic design has been to reuse DHT's data migration algorithms. In this case, the same problem exists with DHT, but there it is a known limitation controlled by management operations. Therefore (if I follow) they may not tackle this problem right away, and tiering may not be able to leverage their solution. It is of course desirable for data tiering to solve the problem in order to use the new upcall mechanisms.

Migration of a file is a multi-state process. I/O is accepted at the same time migration is underway. I believe the upcall manager and the migration manager (for lack of better words) would have to coordinate. The former subsystem understands locks, and the latter how to move files. With that coordination in mind, a basic strategy might be something like this:

On the source, when a file is ready to be moved, the migration manager informs the upcall manager. The upcall manager packages the relevant lock information and returns it to the migrator. The information reflects the state of posix or lease locks. The migration manager moves the file. The migration manager then sends the lock information as a virtual extended attribute. On the destination server, the upcall manager is invoked and passed the contents of the virtual attributes. It rebuilds the lock state and puts the file into proper order. Only at that point does the setxattr RPC return, and only then shall the file be declared migrated.

We would have to handle any changes to the lock state that occur while the file is in the middle of being migrated. Probably the upcall manager would change the contents of the package. It is desirable to invent something that would work with both DHT and tiering (i.e. be implemented at the core dht-rebalance.c layer). In fact, the mechanism I describe could be useful for other metadata-transfer applications.

This is just a high-level sketch, to provoke discussion and checkpoint whether this is the right direction. It would take much time to sort through the details. Other ideas are welcome.

My understanding is the following:
- when a file is cold and gets accessed, the 1st FOP will mark the file for migration to the hot tier
- migration is async, so the initial responses on FOPs would come from the cold tier
- upon migration (similar to rebalance) locking state and upcall tracking is lost

I think this is a problem. There seems to be a window where a client can get (posix) locks while the file is on the cold tier. After migrating the file from cold to hot, these locks would get lost. The same counts for tracking access in the upcall xlator.

Please provide your inputs on the same. We may need to document this or provide suggestions to customers while deploying the solution.

Some ideas on how this can
[Gluster-devel] Presentations: Gluster Conference @ NMAMIT, Nitte on April 11th 2015
Hi All,

We have uploaded the Gluster Conference presentations:
http://www.gluster.org/community/documentation/index.php/Presentations
We will upload the videos soon!

Please also find the link to the page for GlusterFS projects you may like to contribute to:
http://www.gluster.org/community/documentation/index.php/Features
Please look into the "Proposed Features/Ideas" and "Features implemented or being worked on" sections for projects.

If you wish to contribute to one or more of these projects:

1) Subscribe to gluster-devel and gluster-users:
http://www.gluster.org/mailman/listinfo/gluster-devel
http://www.gluster.org/mailman/listinfo/gluster-users
It will keep you updated on what is happening in the GlusterFS world, plus you can post your queries/ideas/proposals/discussions/announcements on these mailing lists and have GlusterFS developers/users interact with you directly!

2) Introduce yourself on gluster-devel and mention the project you would be interested in contributing to.

3) Contact the owner of the feature project you are interested in and get details on how you can contribute to it.

Please find below study material that will get you started:

GlusterFS Development workflow:
~~
http://www.gluster.org/community/documentation/index.php/Development_Work_Flow

GlusterFS Xlators (Translators):
~
http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/
http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context/
http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real/
http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator/

GlusterFS Distribute:
~
http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-distribution/
http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/

GlusterFS Replication:
~
http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-present/
http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-future/

GlusterFS Misc:
~~
http://pl.atyp.us/hekafs.org/index.php/2013/03/glusterfs-cscope-and-vim-oh-my/ (if you are a vim lover; codelite does the trick for me)
http://pl.atyp.us/hekafs.org/index.php/2013/02/gdb-macros-for-glusterfs/
http://pl.atyp.us/hekafs.org/index.php/2012/08/glupy-writing-glusterfs-translators-in-python/

Please feel free to forward this to whomsoever is interested in being part of the Gluster community.

Happy hacking! :)

Regards,
Joseph Fernandes
Re: [Gluster-devel] Presentations: Gluster Conference @ NMAMIT, Nitte on April 11th 2015
Hi All, Please find the photos of the conference https://www.facebook.com/media/set/?set=a.10152982376654900.1073741826.80854864899type=1 Regards, Joseph Fernandes - Original Message - From: Joseph Fernandes josfe...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Friday, April 17, 2015 3:54:37 PM Subject: Presentations: Gluster Conference @ NMAMIT, Nitte on April 11th 2015 Hi All, We have uploaded the Gluster Conference presentation http://www.gluster.org/community/documentation/index.php/Presentations We will upload the videos soon! Please also find the link to the page for GlusterFS Projects you would like to contribute. http://www.gluster.org/community/documentation/index.php/Features Please look into the Proposed Features/Ideas and Features implemented or being worked on sections for projects.
If you wish to contribute to one or more of these projects: 1) Subscribe to gluster-devel and gluster-users http://www.gluster.org/mailman/listinfo/gluster-devel http://www.gluster.org/mailman/listinfo/gluster-users It will keep you updated on what is happening in the GlusterFS world, plus you can post your queries/ideas/proposals/discussions/announcements on these mailing lists and have GlusterFS developers/users interact with you directly! 2) Introduce yourself on gluster-devel and mention the project you would be interested in contributing to. 3) Contact the owner of the feature project you are interested in and get details on how you can contribute to it. Please find below some study material that will get you started: GlusterFS Development workflow: ~~ http://www.gluster.org/community/documentation/index.php/Development_Work_Flow GlusterFS Xlators (Translators): ~ http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/ http://pl.atyp.us/hekafs.org/index.php
Re: [Gluster-devel] Patch #10000
congratz Rafi :D - Original Message - From: Anoop C S achir...@redhat.com To: gluster-devel@gluster.org Sent: Thursday, March 26, 2015 7:17:19 PM Subject: Re: [Gluster-devel] Patch #1 On 03/26/2015 06:47 PM, Jeff Darcy wrote: And the winner is ... Dan Lambright! http://review.gluster.org/#/c/1 Congrats, Dan. Congrats Dan. And I would recommend for a second winner for # :) ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] GFS Architecture Docs
Hi Sidharth/Srikanth, Yes, I remember the session we had on the GlusterFS Cache Tier feature :) Thanks for showing interest in GlusterFS. Surely will help you in finding the GlusterFS internals links. Please find some of the links which will help you (all thanks to Jeff Darcy, the author of these blogs): GlusterFS Xlators (Translators): ~ http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/ http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context/ http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real/ http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator/ GlusterFS Distribute: http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-distribution/ http://pl.atyp.us/hekafs.org/index.php/2011/04/glusterfs-extended-attributes/ GlusterFS Replication: ~ http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-present/ http://pl.atyp.us/hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-future/ GlusterFS Misc: ~~ http://pl.atyp.us/hekafs.org/index.php/2013/03/glusterfs-cscope-and-vim-oh-my/ (if you are a vim lover; or else codelite does the trick for me ;)) http://pl.atyp.us/hekafs.org/index.php/2013/02/gdb-macros-for-glusterfs/ http://pl.atyp.us/hekafs.org/index.php/2012/08/glupy-writing-glusterfs-translators-in-python/ Please let me know if you need any more help. Regards, Joe - Original Message - From: Sidharth Patil sidharth.pa...@gmail.com To: josfe...@redhat.com Cc: k s srikanth k_s_srika...@yahoo.com Sent: Tuesday, March 17, 2015 12:01:53 PM Subject: GFS Architecture Docs Hi Joseph, Hope you're doing well. Srikanth and I had attended the GFS meetup in Bangalore and spoke to you in detail regarding cache tiering in GFS (after the meetup session).
In order to continue our interest in GFS, we spent time configuring and exploring GFS. We were able to configure GFS volumes on our laptops. We further wanted to dig into the architecture details for our better understanding and also to help us contribute to the same; however, we could not find relevant docs. Could you please point us to a location where we can find them. We did look at http://www.gluster.org/community/documentation/index.php/Arch , however some of the links were broken. Any help w.r.t. information gathering is much appreciated. We found your email id on the GFS web portal and hence thought of communicating on the same. Hope you recollect the discussion we had on the meetup day. Thanks for your time, Sidharth and Srikanth -- ...So long as the millions live in hunger and ignorance, I hold every man a traitor who, having been educated at their expense, pays not the least heed to them. --Swami Vivekananda. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Sqlite3 dependency for Data Tiering
Hi Guys, The upcoming Data Tiering/Classification feature has a dependency on the sqlite3 devel packages. Could you please suggest how I can pull the sqlite3 devel packages in automatically in the GlusterFS upstream build. Regards, Joe ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
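One common way to do this (a sketch only, not necessarily how the GlusterFS build is actually wired; the SQLITE variable name, the version bound, and the USE_GFDB define are assumptions) is a pkg-config check in configure.ac, with a matching BuildRequires: sqlite-devel in the RPM spec so packagers pick the dependency up automatically:

```
dnl configure.ac -- hypothetical pkg-config check for the sqlite3
dnl development files needed by the data tiering / heat-DB code.
PKG_CHECK_MODULES([SQLITE], [sqlite3 >= 3.6.20],
    [AC_DEFINE([USE_GFDB], [1], [Define if sqlite3 devel files are present])],
    [AC_MSG_ERROR([sqlite3 development package not found (install sqlite-devel)])])
dnl SQLITE_CFLAGS and SQLITE_LIBS then become available to Makefile.am.
```

With that in place, ./configure fails early with a clear message on machines missing the devel package instead of breaking mid-compile.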
Re: [Gluster-devel] Data Classification: would it be possible to have a RAM-disk as caching tier?
Hi Niels, Well, the idea is good. A RAM-disk would be the fastest and with no extra cost, plus we may have a gluster brick from a RAM-disk [1]. The single biggest challenge would be the durability of data on RAM-disks. Using RAM for caching is good, as the cache will have only a copy of the original data, but in the case of tiering, the original data sits on the tier (not a copy). Dan, your thoughts? Thanks, Joe 1. https://lists.gnu.org/archive/html/gluster-devel/2013-05/msg00118.html - Original Message - From: Niels de Vos nde...@redhat.com To: gluster-devel@gluster.org Cc: Dan Lambright dlamb...@redhat.com, Joseph Fernandes josfe...@redhat.com Sent: Monday, February 16, 2015 4:14:37 PM Subject: Data Classification: would it be possible to have a RAM-disk as caching tier? Hi guys, at FOSDEM one of our users spoke to me about their deployment and environment. It seems that they have a *very* good deal with their hardware vendor, which makes it possible to stuff their servers full with RAM for a minimal difference in cost. They expressed interest in having a RAM-disk as caching tier on the bricks. Would a configuration like this be possible with the data-classification feature [1]? Thanks, Niels 1. http://www.gluster.org/community/documentation/index.php/Features/data-classification ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Data Classification: would it be possible to have a RAM-disk as caching tier?
JOE reply inline - Original Message - From: Niels de Vos nde...@redhat.com To: Joseph Fernandes josfe...@redhat.com Cc: gluster-devel@gluster.org, Dan Lambright dlamb...@redhat.com Sent: Monday, February 16, 2015 4:41:20 PM Subject: Re: Data Classification: would it be possible to have a RAM-disk as caching tier? On Mon, Feb 16, 2015 at 05:54:15AM -0500, Joseph Fernandes wrote: Hi Niels, Well, the idea is good. A RAM-disk would be the fastest and with no extra cost, plus we may have a gluster brick from a RAM-disk [1]. The single biggest challenge would be the durability of data on RAM-disks. Using RAM for caching is good, as the cache will have only a copy of the original data, but in the case of tiering, the original data sits on the tier (not a copy). Thanks for the swift response! I imagine a solution where the hot contents are not moved to the RAM-disk, but are replicated on demand. When the contents get cold, the replication can be reduced again, which would make space on the RAM-disk. Upon boot, the RAM-disk would be empty, and only the hot contents would need to get 'healed' onto the RAM-disk. (The fastest brick of a replica pair handles the reads; I assume that this would be the RAM-disk.) JOE Looks interesting and might be a good idea! A few points to be careful of: 1) Network usage for the AFR self-heals required to keep the RAM disk and the actual disk in sync, when the bricks are owned by multiple nodes. 2) This kind of replica pair should be marked separately from the regular AFR replica, say cache-replica, and should work like the current tiering implementation, i.e. whenever there is a cache-replica miss on the HOT replica, instead of marking it a bad replica, intelligently move data (using heat patterns) to the HOT replica (promote) from the cold replica. Well, these are the things I can see for now, but the devil is in the details :) At the moment I can not say if the current data-classification proposal is flexible enough to configure a policy like this.
Or, how easy it would be to extend the feature to allow configurations like this as an improvement later on. JOE With the current implementation this wouldn't be possible, as our implementation sits above the DHT layer. For the proposed cache-replica we might need to change the AFR code (for current replication). Cheers, Niels Dan, your thoughts? Thanks, Joe 1. https://lists.gnu.org/archive/html/gluster-devel/2013-05/msg00118.html - Original Message - From: Niels de Vos nde...@redhat.com To: gluster-devel@gluster.org Cc: Dan Lambright dlamb...@redhat.com, Joseph Fernandes josfe...@redhat.com Sent: Monday, February 16, 2015 4:14:37 PM Subject: Data Classification: would it be possible to have a RAM-disk as caching tier? Hi guys, at FOSDEM one of our users spoke to me about their deployment and environment. It seems that they have a *very* good deal with their hardware vendor, which makes it possible to stuff their servers full with RAM for a minimal difference in cost. They expressed interest in having a RAM-disk as caching tier on the bricks. Would a configuration like this be possible with the data-classification feature [1]? Thanks, Niels 1. http://www.gluster.org/community/documentation/index.php/Features/data-classification ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] GlusterFS Volume backup API
Replies inline JOE - Original Message - From: Aravinda avish...@redhat.com To: Joseph Fernandes josfe...@redhat.com Cc: gluster Devel gluster-devel@gluster.org Sent: Friday, December 19, 2014 3:39:28 PM Subject: Re: [Gluster-devel] GlusterFS Volume backup API Thanks Joseph, added comments inline. On 12/19/2014 10:23 AM, Joseph Fernandes wrote: Few concerns inline JOE - Original Message - From: Aravinda avish...@redhat.com To: gluster Devel gluster-devel@gluster.org Sent: Thursday, December 18, 2014 10:38:20 PM Subject: [Gluster-devel] GlusterFS Volume backup API Hi, Today we discussed the GlusterFS backup API; our plan is to provide a tool/API to get the list of changed files (full/incremental). Participants: Me, Kotresh, Ajeet, Shilpa. Thanks to Paul Cuzner for providing inputs about pre and post hooks available in backup utilities like NetBackup. Initial draft: == Case 1 - Registered Consumer: The consumer application has to register by giving a session name. glusterbackupapi register sessionname host volume When the following command is run for the first time, it will do a full scan; from then onwards it does incremental. The start time for incremental is the last backup time; the end time will be the current time. glusterbackupapi sessionname --out-file=out.txt --out-file is an optional argument; the default output file name is `output.txt`. The output file will have file paths. Case 2 - Unregistered Consumer: Start time and end time information will not be remembered; every time the consumer has to send the start time and end time if incremental. For full backup: glusterbackupapi full host volume --out-file=out.txt For incremental backup: glusterbackupapi inc host volume STARTTIME ENDTIME --out-file=out.txt where STARTTIME and ENDTIME are in unix timestamp format. Technical overview == 1. Using the host and volume name arguments, it fetches volume info and volume status to get the list of up bricks/nodes. 2. Executes a brick/node agent to get the required details from the brick.
(TBD: communication via RPC/SSH/gluster system:: execute) 3. If a full scan, the brick/node agent gets the list of files from that brick backend and generates an output file. 4. If incremental, it calls the Changelog History API, gets the distinct GFIDs list and then converts each GFID to a path. 5. The generated output files from each brick node will be copied to the initiator node. 6. Merges all the output files from the bricks and removes duplicates. 7. In case of session-based access, session information will be saved by each brick/node agent. Issues/Challenges = 1. What if the timestamp differs across gluster nodes? We are assuming that in a cluster the TS will remain the same. 2. If a brick is down, how to handle it? We are assuming all the bricks should be up to initiate a backup (at least one from each replica). 3. If the changelog is not available, or broken in between the start time and end time, then how do we get the incremental files list? As a prerequisite, changelog should be enabled before backup. JOE There is a performance overhead on the IO path when changelog is switched on. I think getting numbers or a performance matrix here would be very crucial, as it's not desirable to sacrifice File IO performance to support a backup API or any data maintenance activity. We are also evaluating using FS crawl even for incremental instead of changelog. JOE BIG NO! to FS crawl, as it is the worst thing for spindle-based storage. Your performance will deteriorate even more. 4. GFID to path conversion, using `find -samefile` or using the `glusterfs.pathinfo` xattr on an aux-gfid mount. 5. Deleted files: if we get the GFID of a deleted file from the changelog, how do we find its path? Does the backup API require the deleted files list? JOE 1) find would not be a good option here, as you have to traverse the whole namespace. It takes a toll on spindle-based media. 2) The glusterfs.pathinfo xattr is a feasible approach but has its own problems: a. This xattr comes only with quota, so you need to decouple it from quota. b. This xattr should be enabled from the beginning of the namespace, i.e. if enabled later you will have some files which have this xattr and some which won't. This issue is true for any metadata-storing approach in gluster, e.g. DB, changelog etc. c. I am not sure if this xattr has support for multiple hard links. I am not sure if you (the backup scenario) would require it or not. Just food for thought. d. This xattr is not crash-consistent across power failures. That means you may be in a state where a few inodes will have the xattr and a few won't. Makes sense. These points came out of the initial discussion; we have to figure out how to get the path efficiently using the GFID. (If we use multithreaded FS crawl even for incremental, then we don't need this conversion step.) JOE comment
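Steps 5 and 6 of the technical overview (collect per-brick output files on the initiator, merge them, and remove duplicates) could be sketched roughly as below; this is an illustration only, not the actual tool — the file names, layout, and helper name are assumptions:

```python
import os
import tempfile

def merge_brick_outputs(brick_files, out_file):
    """Merge per-brick path lists into one output file, dropping
    duplicates (the same path can show up on several bricks, e.g.
    with replicated volumes). Returns the number of unique paths."""
    seen = set()
    with open(out_file, "w") as out:
        for brick_file in brick_files:
            with open(brick_file) as f:
                for line in f:
                    path = line.strip()
                    if path and path not in seen:
                        seen.add(path)
                        out.write(path + "\n")
    return len(seen)

# Toy run with two fake per-brick output files.
d = tempfile.mkdtemp()
for name, paths in [("b1.txt", ["/dir/a", "/dir/b"]),
                    ("b2.txt", ["/dir/b", "/dir/c"])]:
    with open(os.path.join(d, name), "w") as f:
        f.write("\n".join(paths) + "\n")

n = merge_brick_outputs([os.path.join(d, "b1.txt"), os.path.join(d, "b2.txt")],
                        os.path.join(d, "out.txt"))
print(n)  # 3 unique paths: /dir/b appears on both bricks but is emitted once
```

A set keeps the de-duplication O(1) per path; for very large namespaces an external sort-merge would bound memory instead.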
Re: [Gluster-devel] GlusterFS Volume backup API
Few concerns inline JOE - Original Message - From: Aravinda avish...@redhat.com To: gluster Devel gluster-devel@gluster.org Sent: Thursday, December 18, 2014 10:38:20 PM Subject: [Gluster-devel] GlusterFS Volume backup API Hi, Today we discussed the GlusterFS backup API; our plan is to provide a tool/API to get the list of changed files (full/incremental). Participants: Me, Kotresh, Ajeet, Shilpa. Thanks to Paul Cuzner for providing inputs about pre and post hooks available in backup utilities like NetBackup. Initial draft: == Case 1 - Registered Consumer: The consumer application has to register by giving a session name. glusterbackupapi register sessionname host volume When the following command is run for the first time, it will do a full scan; from then onwards it does incremental. The start time for incremental is the last backup time; the end time will be the current time. glusterbackupapi sessionname --out-file=out.txt --out-file is an optional argument; the default output file name is `output.txt`. The output file will have file paths. Case 2 - Unregistered Consumer: Start time and end time information will not be remembered; every time the consumer has to send the start time and end time if incremental. For full backup: glusterbackupapi full host volume --out-file=out.txt For incremental backup: glusterbackupapi inc host volume STARTTIME ENDTIME --out-file=out.txt where STARTTIME and ENDTIME are in unix timestamp format. Technical overview == 1. Using the host and volume name arguments, it fetches volume info and volume status to get the list of up bricks/nodes. 2. Executes a brick/node agent to get the required details from the brick. (TBD: communication via RPC/SSH/gluster system:: execute) 3. If a full scan, the brick/node agent gets the list of files from that brick backend and generates an output file. 4. If incremental, it calls the Changelog History API, gets the distinct GFIDs list and then converts each GFID to a path. 5. The generated output files from each brick node will be copied to the initiator node. 6.
Merges all the output files from the bricks and removes duplicates. 7. In case of session-based access, session information will be saved by each brick/node agent. Issues/Challenges = 1. What if the timestamp differs across gluster nodes? We are assuming that in a cluster the TS will remain the same. 2. If a brick is down, how to handle it? We are assuming all the bricks should be up to initiate a backup (at least one from each replica). 3. If the changelog is not available, or broken in between the start time and end time, then how do we get the incremental files list? As a prerequisite, changelog should be enabled before backup. JOE There is a performance overhead on the IO path when changelog is switched on. I think getting numbers or a performance matrix here would be very crucial, as it's not desirable to sacrifice File IO performance to support a backup API or any data maintenance activity. 4. GFID to path conversion, using `find -samefile` or using the `glusterfs.pathinfo` xattr on an aux-gfid mount. 5. Deleted files: if we get the GFID of a deleted file from the changelog, how do we find its path? Does the backup API require the deleted files list? JOE 1) find would not be a good option here, as you have to traverse the whole namespace. It takes a toll on spindle-based media. 2) The glusterfs.pathinfo xattr is a feasible approach but has its own problems: a. This xattr comes only with quota, so you need to decouple it from quota. b. This xattr should be enabled from the beginning of the namespace, i.e. if enabled later you will have some files which have this xattr and some which won't. This issue is true for any metadata-storing approach in gluster, e.g. DB, changelog etc. c. I am not sure if this xattr has support for multiple hard links. I am not sure if you (the backup scenario) would require it or not. Just food for thought. d. This xattr is not crash-consistent across power failures. That means you may be in a state where a few inodes will have the xattr and a few won't. 3) Agree with the delete problem.
This problem gets worse with multiple hard links, if some hard links are recorded and some are not. 6. Storing session info in each of the brick nodes. 7. Communication channel between nodes: RPC/SSH/gluster system:: execute... etc.? Kotresh, Ajeet, please add if I missed any points. -- regards Aravinda ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Snapshot and Data Tiering
Hi All, These are the MOM of the snapshot and data tiering interop meet (apologies for the late update): 1) USS should not have problems with the changes made in DHT (DHT over DHT), as the USS xlator sits above DHT. 2) With the introduction of the heat-capturing DB we have a few things to take care of when a snapshot of the brick is taken: a. Location of the sqlite3 files: Today the sqlite3 files by default reside in the brick (brick_path/.glusterfs/); this makes taking a snapshot of the DB easier, as it is done via LVM along with the brick. If the location is outside the brick (which is configurable, e.g. have all the DB files on SSD for better performance), then while taking a snapshot glusterd needs to take a manual backup of these files, which would take some time and the gluster CLI would time out. So for the first cut we will have the DB files in the brick itself, until we have a solution for the CLI timeout. b. Type of the database: For the first cut we are considering only sqlite3, and sqlite3 works excellently with LVM snapshots. If a new DB type like leveldb is introduced in the future, we need to investigate its compatibility with LVM snapshots, and this might be a deciding factor in having such a DB type in gluster. c. Checkpointing the sqlite3 DB: Before taking a snapshot, glusterd should issue a checkpoint command to the sqlite3 DB to flush all the DB cache onto the disk. Action items on the data tiering team: 1) Give the time taken to do so, i.e. the checkpointing time. 2) Provide a generic API in libgfdb to do so, OR handle the CTR xlator notification from glusterd to do checkpointing. Action item on the snapshot team: 1) Provide hooks to call the generic API, OR do the brick-ops to notify the CTR xlator. d. Snapshot-aware bricks: For a brick belonging to a snapshot, the CTR xlator should not record reads (which come from USS).
Solutions: 1) Send a CTR xlator notification after the snapshot brick is started, to turn off recording. 2) OR, while the snapshot brick is started by glusterd, pass an option marking the brick as being a part of a snapshot. This is the more generic solution. 3) The snapshot restore problem: When a snapshot is restored, 1) it will bring the volume to the point-in-time state. For example: the current state of the volume is HOT tier 50% of data, COLD tier 50% of data, and the snapshot has the volume in the state HOT tier 20% of data, COLD tier 80% of data. A restore will bring the volume to HOT: 20%, COLD: 80%, i.e. it will undo all the promotions and demotions. This should be mentioned in the documentation. 2) In addition, since the restored DB has times recorded in the past, files that were considered HOT in the past are now COLD. This will have all the data moved to the COLD tier if a data tiering scanner runs after the restore of the snapshot. This should be recorded in the documentation as a recommendation not to run the data tiering scanner immediately after a restore of a snapshot. The system should be given time to learn the new heat patterns; the learning time depends on the nature of the workload. 4) During a data tiering activity, snapshot activities like create/restore should be disabled, just as is done during adding and removing of a brick, which leads to a rebalance. Let me know if anything else is missing or any corrections are required. Regards, Joe ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
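Point (c) above, checkpointing sqlite3 before the snapshot, can be illustrated with a small standalone sketch. This assumes the heat DB is a plain sqlite3 file in WAL mode; the schema, table name, and path here are made up for illustration and are not the real libgfdb layout:

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-in for the heat DB that lives inside the brick
# (e.g. <brick_path>/.glusterfs/gfdb.db); a temp file keeps this runnable.
db_path = os.path.join(tempfile.mkdtemp(), "gfdb.db")
conn = sqlite3.connect(db_path)

# In WAL mode, recent writes sit in a side log (gfdb.db-wal) rather than
# the main file, so a brick-level LVM snapshot could miss them.
conn.execute("PRAGMA journal_mode=WAL;")
conn.execute("CREATE TABLE IF NOT EXISTS heat (gfid TEXT PRIMARY KEY, last_read INTEGER)")
conn.execute("INSERT OR REPLACE INTO heat VALUES ('deadbeef-0000', 1418880000)")
conn.commit()

# Checkpoint: flush the WAL into the main DB file so the on-disk image is
# complete before glusterd triggers the snapshot.
busy, log_pages, ckpt_pages = conn.execute("PRAGMA wal_checkpoint(FULL);").fetchone()
conn.close()
# busy == 0 means the checkpoint completed without contention.
```

Timing this `wal_checkpoint` call under a realistic write load is exactly the "checkpointing time" action item mentioned above.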
Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
Answers INLINE JOE - Original Message - From: Venky Shankar yknev.shan...@gmail.com To: Joseph Fernandes josfe...@redhat.com Cc: Kotresh Hiremath Ravishankar khire...@redhat.com, Gluster Devel gluster-devel@gluster.org, dlamb...@redhat.com, Vijay Bellur vbel...@redhat.com Sent: Friday, December 5, 2014 11:10:10 PM Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes On Thu, Dec 4, 2014 at 10:42 PM, Joseph Fernandes josfe...@redhat.com wrote: On the performance of the data path, I have seen a 3% dip in performance with the initial implementation, which is not finalized. The testing is in progress and not finalized yet, as we are trying to reduce the dip as much as possible with optimizations in the implementation and SQLite tunables. Will publish the final result when we are done with it. Sure. Venky, could you please let us know what the performance impact on the IO path is with changelog's "15 seconds by default, which has proved to provide a good balance between replication performance (geo-rep) and IOPS rate" configuration? Sure, numbers should be out soon. JOE Thanks. This will give us a fair idea about the delays that would be introduced in the IO if changelog (not the xlator, but the recording done by the xlator) is made a dependency for data tiering. Plus, on the 15 sec delay, the tiering team needs to discuss the impact on the freshness of data. As discussed in person and iterated MANY times in many discussions with the changelog team, I fail to understand why you bring up this point and detail the approach now. If this was discussed many times, it should have been on this mailing list long back. 1) When we don't have geo-rep ON, i.e. when changelog is not ON, we will populate the DB inline with the IO path (where we are progressively working on reducing the IO path performance hit). 2) When changelog is ON, we will have the DB fed by the libchangelog API.
To remove the freshness issue we can have an in-memory update in an LRU, as we are not looking for a sequential update. Plus, we would need this in-memory data structure anyway, as changelog does NOT provide read statistics, which are required for tiering and are a VERY crucial part of detecting the HOTNESS of a file! 3) As far as tiering is concerned, we are not worried about crash consistency, as: a. For files which are COLD, the data is safe on the disk. b. For files which are HOT, even though the data in memory is lost, since these files will get HOT again we will move them later. If they don't get HOT, then the crash is without impact. Probably something that might have been discussed but I cannot recall: could the objects that get evicted from the LRU/LFU be fed to the DB (or any data store)? Wouldn't that guarantee data freshness in the datastore, with the cache providing the list of hot files? That way you have data store freshness (what you'd get from feeding via the I/O path) and the LRU/LFU sits there as usual. Thoughts? JOE Well, if you recall the multiple internal discussions we had, we agreed upon this long back, from the beginning (though not recorded), and as a result of those discussions we have the approach for the infrastructure: https://gist.github.com/vshankar/346843ea529f3af35339 AFAIK, though the doc doesn't speak of the above in detail, it was always the plan to do it as above. The use of the LRU/LFU is definitely the way to go, both with and without changelog recording, as it boosts the recording performance. And the mention of this is in https://gist.github.com/vshankar/346843ea529f3af35339 at the end. Well, you know best, as you are the author :) (Kotresh and me contributed over discussions, though not recorded; thanks for mentioning it in the gluster-devel mail :) ) As I have mentioned, the development of feeding the DB in the IO path is still a work in progress. We (Dan and me) are making it more and more performant.
We are also taking guidance from Ben England on testing it in parallel with the development cycles, so that we have the best approach and implementation. That is where we are getting the numbers from (this is recorded in mails; I will forward them to you). Plus we have kept Vijay Bellur in sync with the approach we are taking on a weekly basis (though not recorded :) ). On the point of the discussions not being recorded on gluster-devel: these discussions happened frequently and in an adhoc way. Well, you know best, as you were part of all of them :). As we move forward we will have more discussions internally for sure, and let's make sure that they are recorded, so that we don't keep running around the same bush again and again ;). And thanks for all the help in the form of discussions/thoughts. Looking forward to more as we go along. ~Joe Venky ~ Joseph ( NOT Josef :) ) - Original Message - From: Venky Shankar yknev.shan...@gmail.com To: Kotresh Hiremath Ravishankar
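The "in-memory update in an LRU, with evicted objects fed to the DB" idea from this thread exists only in prose here; as a toy sketch (all names hypothetical — the real implementation would live in the CTR xlator in C), a read/write heat counter keyed on GFID could look like:

```python
from collections import OrderedDict

class HeatLRU:
    """Tiny LRU sketch: tracks read/write counts per GFID. Evicted
    (coldest) entries are flushed to the on-disk DB, so the cache
    holds the hot set while the datastore stays fresh."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()   # gfid -> [reads, writes], oldest first

    def record(self, gfid, is_read):
        counts = self.entries.pop(gfid, [0, 0])
        counts[0 if is_read else 1] += 1
        self.entries[gfid] = counts    # re-insert: now most recently used
        if len(self.entries) > self.capacity:
            cold_gfid, cold_counts = self.entries.popitem(last=False)
            self.flush_to_db(cold_gfid, cold_counts)

    def flush_to_db(self, gfid, counts):
        # Placeholder: here the evicted entry would be written to
        # libgfdb/sqlite (not shown).
        pass

lru = HeatLRU(capacity=2)
lru.record("gfid-a", is_read=True)
lru.record("gfid-b", is_read=False)
lru.record("gfid-a", is_read=True)   # gfid-a now has 2 reads and is most recent
```

Note the read counter is exactly the statistic the thread points out changelog cannot supply, which is why the in-memory structure is needed in both modes.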
Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
On the performance of the data path, I have seen a 3% dip with the initial implementation, which is not finalized. The testing is in progress and not finalized yet, as we are trying to reduce the dip as much as possible with optimizations in the implementation and SQLite tunables. We will publish the final results once we are done.

Venky, could you please let us know what the performance impact on the IO path is with changelog's default 15-second rollover configuration? In addition, on the 15 sec delay, the tiering team needs to discuss the impact on the freshness of data. As discussed in person and iterated many times in discussions with the changelog team:
1) When we don't have geo-rep ON, i.e., when changelog is not ON, we will populate the DB inline with the IO path (where we are progressively working on reducing the performance hit).
2) When changelog is ON, the DB will be fed by the libgfchangelog API. To remove the freshness issue we can have in-memory updates on an LRU, as we are not looking for a sequential update. Plus we would need this in-memory data structure because changelog does NOT provide read statistics, which are required for tiering and are a very crucial part of detecting the hotness of a file.
3) As far as tiering is concerned, we are not worried about crash consistency, because:
a. For files which are COLD, the data is safe on disk.
b. For files which are HOT, even though the in-memory data is lost, these files will become HOT again and we will move them later. If they don't become HOT, the crash has no impact.

~ Joseph ( NOT Josef :) )

- Original Message -
From: Venky Shankar yknev.shan...@gmail.com
To: Kotresh Hiremath Ravishankar khire...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, dlamb...@redhat.com, josfe...@redhat.com, Vijay Bellur vbel...@redhat.com
Sent: Thursday, December 4, 2014 8:53:43 PM
Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!

[Adding Dan/Josef/Vijay]

As of now, rollover-time is global to the changelog translator, hence tuning it would affect all consumers subscribing to updates. It's 15 seconds by default and has proved to provide a good balance between replication performance (geo-rep) and IOPS rate. Tuning it to a lower value would imply doing a round of perf tests for geo-rep to be safe. The question is whether data tiering can compromise on data freshness. If yes, is there a hard limit? For BitRot it should be OK, as the policy for checksum calculation is lazy; adding a bit more lag would not hurt much.

Josef, could you share the performance numbers along with the setup (configuration, etc.) you used to measure SQLite performance inline to the data path?

-Venky

On Thu, Dec 4, 2014 at 3:23 PM, Kotresh Hiremath Ravishankar khire...@redhat.com wrote:

Hi,

As of now, geo-replication is the only consumer of the changelog. Going forward, bitrot and tiering will also join as consumers. The current format of the changelog can be found at the links below.
http://www.gluster.org/community/documentation/index.php/Arch/Change_Logging_Translator_Design
https://github.com/gluster/glusterfs/blob/master/doc/features/geo-replication/libgfchangelog.md

Current Design:
1. Every changelog.rollover-time secs (configurable), a new changelog file is generated.
2. The geo-replication history API, designed as part of the Snapshot requirement, maintains an HTIME file with the changelog filenames generated. It is guaranteed that there is no breakage between the changelogs within one HTIME file, i.e., changelog is not enabled/disabled in between.

Proposed changes to changelog as part of bitrot and tiering:
1. Add a timestamp to each fop record in the changelog.
   Rationale: Tiering requires the timestamp of each fop.
   Implication on geo-rep: NO
2. Make one big changelog per day or so and do not roll over the changelog every rollover-time.
   Rationale: Changing changelog.rollover-time is going to affect all three consumers, hence decoupling is required.
   Geo-replication: Fine with changing rollover time.
   Tiering: Not fine, as per the input I got from Joseph (Joseph, please comment), as this adds to the delay before tiering gets the change notification from changelog.
   Bitrot: It should be fine. (Venky, please comment.)

Implications on the current geo-replication design:
1. Breaks the history API: needs redesign.
2. Changes to geo-replication changelog consumption logic??
3. libgfchangelog API changes.
4. Effort to handle upgrade scenarios.

Bitrot and tiering guys, please add any more changes expected which I have missed. Point
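The rollover/HTIME scheme described above can be sketched in a few lines. This is a minimal simulation, not the real translator: the `CHANGELOG.<start>` naming and the flat timestamp records are hypothetical simplifications of the actual on-disk format.

```python
def split_into_changelogs(fop_timestamps, rollover_secs=15):
    # Sketch of the rollover scheme: every rollover-time window gets its
    # own changelog "file", and an HTIME-style index records the generated
    # names in order, with no gaps while changelog stays enabled.
    changelogs, htime = {}, []
    for ts in sorted(fop_timestamps):
        window_start = ts - (ts % rollover_secs)
        name = "CHANGELOG.%d" % window_start
        if name not in changelogs:
            changelogs[name] = []
            htime.append(name)  # ordered index of generated changelogs
        changelogs[name].append(ts)
    return changelogs, htime
```

Lowering `rollover_secs` produces proportionally more files per HTIME index, which illustrates why tuning the global rollover-time affects every consumer at once.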
Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!!
Typo corrections and marking Jeff in the mail. Also, the Data Tiering team had a discussion yesterday late night and we have decided that the 15 sec delay won't kill us either. But having READ FOP recording for data is a MUST, as stated in the earlier reply.

~Joe

- Original Message -
From: Joseph Fernandes josfe...@redhat.com
To: Venky Shankar yknev.shan...@gmail.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Thursday, December 4, 2014 10:42:56 PM
Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes

On the performance of the data path, I have seen a 3% dip with the initial implementation, which is not finalized. The testing is in progress and not finalized yet, as we are trying to reduce the dip as much as possible with optimizations in the implementation and SQLite tunables. We will publish the final results once we are done.

Venky, could you please let us know what the performance impact on the IO path is with changelog's default 15-second rollover configuration? In addition, on the 15 sec delay, the tiering team needs to discuss the impact on the freshness of data. As discussed in person and iterated many times in discussions with the changelog team:
1) When we don't have geo-rep ON, i.e., when changelog is not ON, we will populate the DB inline with the IO path (where we are progressively working on reducing the performance hit).
2) When changelog is ON, the DB will be fed by the libgfchangelog API. To remove the freshness issue we can have in-memory updates on an LRU, as we are not looking for a sequential update. Plus we would need this in-memory data structure because changelog does NOT provide read statistics, which are required for tiering and are a very crucial part of detecting the hotness of a file.
3) As far as tiering is concerned, we are not worried about crash consistency, because:
a. For files which are COLD, the data is safe on disk.
b. For files which are HOT, even though the in-memory data is lost, these files will become HOT again and we will move them later. If they don't become HOT, the crash has no impact.

~ Joseph ( NOT Josef :) )

- Original Message -
From: Venky Shankar yknev.shan...@gmail.com
To: Kotresh Hiremath Ravishankar khire...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, dlamb...@redhat.com, josfe...@redhat.com, Vijay Bellur vbel...@redhat.com
Sent: Thursday, December 4, 2014 8:53:43 PM
Subject: Re: [Gluster-devel] Discussion: Implications on geo-replication due to bitrot and tiering changes!!!

[Adding Dan/Josef/Vijay]

As of now, rollover-time is global to the changelog translator, hence tuning it would affect all consumers subscribing to updates. It's 15 seconds by default and has proved to provide a good balance between replication performance (geo-rep) and IOPS rate. Tuning it to a lower value would imply doing a round of perf tests for geo-rep to be safe. The question is whether data tiering can compromise on data freshness. If yes, is there a hard limit? For BitRot it should be OK, as the policy for checksum calculation is lazy; adding a bit more lag would not hurt much.

Josef, could you share the performance numbers along with the setup (configuration, etc.) you used to measure SQLite performance inline to the data path?

-Venky

On Thu, Dec 4, 2014 at 3:23 PM, Kotresh Hiremath Ravishankar khire...@redhat.com wrote:

Hi,

As of now, geo-replication is the only consumer of the changelog. Going forward, bitrot and tiering will also join as consumers. The current format of the changelog can be found at the links below.
http://www.gluster.org/community/documentation/index.php/Arch/Change_Logging_Translator_Design
https://github.com/gluster/glusterfs/blob/master/doc/features/geo-replication/libgfchangelog.md

Current Design:
1. Every changelog.rollover-time secs (configurable), a new changelog file is generated.
2. The geo-replication history API, designed as part of the Snapshot requirement, maintains an HTIME file with the changelog filenames generated. It is guaranteed that there is no breakage between the changelogs within one HTIME file, i.e., changelog is not enabled/disabled in between.

Proposed changes to changelog as part of bitrot and tiering:
1. Add a timestamp to each fop record in the changelog.
   Rationale: Tiering requires the timestamp of each fop.
   Implication on geo-rep: NO
2. Make one big changelog per day or so and do not roll over the changelog every rollover-time.
   Rationale: Changing changelog.rollover-time is going to affect all three consumers, hence decoupling is required.
   Geo-replication: Fine with changing rollover time.
   Tiering: Not fine, as per the input I got from Joseph (Joseph, please comment
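The "populate the DB inline with the IO path" idea, including recording the READ fops that the changelog cannot supply, can be sketched with plain SQLite. This is a minimal sketch under assumptions: the `heat` table schema, the `gfid` keying, and the function names are hypothetical, and the WAL/synchronous pragmas are generic examples of the kind of SQLite tunables mentioned above, not the project's actual settings.

```python
import sqlite3

def open_heat_db(path=":memory:"):
    # WAL journaling plus relaxed sync are typical tunables for cutting
    # the inline cost of recording one row per fop on the IO path.
    db = sqlite3.connect(path)
    db.execute("PRAGMA journal_mode=WAL")
    db.execute("PRAGMA synchronous=NORMAL")
    db.execute("""CREATE TABLE IF NOT EXISTS heat (
        gfid TEXT PRIMARY KEY,
        reads INTEGER DEFAULT 0,
        writes INTEGER DEFAULT 0,
        last_access INTEGER)""")
    return db

def record_fop(db, gfid, is_read, ts):
    # Upsert one counter row per file. Read fops are recorded too,
    # since the changelog alone provides no read statistics.
    db.execute("""INSERT INTO heat (gfid, reads, writes, last_access)
                  VALUES (?, ?, ?, ?)
                  ON CONFLICT(gfid) DO UPDATE SET
                      reads = reads + excluded.reads,
                      writes = writes + excluded.writes,
                      last_access = excluded.last_access""",
               (gfid, int(is_read), int(not is_read), ts))

def hot_files(db, since_ts):
    # Files touched since `since_ts`, hottest (most fops) first.
    return [row[0] for row in db.execute(
        "SELECT gfid FROM heat WHERE last_access >= ? "
        "ORDER BY reads + writes DESC", (since_ts,))]
```

The upsert keeps recording to a single statement per fop, which is where implementation optimization and SQLite tunables would be applied to shrink the measured dip.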
Re: [Gluster-devel] Backup support for GlusterFS
Hi Aravinda,

Venky, Kotresh and I had a discussion on the Data Maintenance infrastructure that will help data maintenance services like Data Tiering, Bitrot, Backup, dedupe etc. to identify the data (file/directory) set to work on, using either a sequential notification service or a non-sequential recording data store service. Today, as part of the data tiering project, I have the non-sequential recording data store ready, which will give you a list of files that are hotter/colder (both read/write), i.e., in your scenario, files that have changed since the last backup. But this is just a part of the solution. As far as I know, Venky is going to come up with an elaborate document on this soon.

On the NDMP side I have a few questions for you:
1) Are you planning to develop your own NDMP Tape and Data Service from scratch?
2) Or are you planning to use a well-established 3rd-party NDMP Tape and Data Service? In that case we need to give the list of files that need to be backed up to such software.

Regards,
Joe

- Original Message -
From: Aravinda avish...@redhat.com
To: Gluster Devel gluster-devel@gluster.org
Sent: Monday, December 1, 2014 5:40:53 PM
Subject: [Gluster-devel] Backup support for GlusterFS

Hi,

We are trying to implement backup support for GlusterFS. Many network backup utilities like Bacula (open source), Amanda (open source), and Symantec NetBackup support NDMP (http://www.ndmp.org/). A comparison is available here: http://wiki.bacula.org/doku.php?id=comparisons

The plan is to create glusterfs-ndmp-server, which utilizes glusterfs changelogs to detect changes for incremental backup. The design is not yet finalized; comments and suggestions welcome. Looks like a project (https://forge.gluster.org/ndmp-server) in forge.gluster is discontinued.

PS: NDMP support is not available in the open source editions of Bacula and Amanda, but is available in the Enterprise Editions.

References:
1. NDMP support in NetBackup http://www.symantec.com/business/support/index?page=contentid=DOC6456
2. NDMP Presentation http://www.ndmp.org/download/sdk_v4/ndmp-overview-r2.ppt
3. NDMP website http://www.ndmp.org/
4. Bacula and Amanda websites http://bacula.org/ and http://www.amanda.org/

--
regards
Aravinda

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
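The incremental-backup selection step ("give the list of files that need to be backed up" to an NDMP Data Service) can be sketched as below. This is an illustrative stand-in, not the proposed glusterfs-ndmp-server design: it uses a full walk plus mtime comparison, where a changelog-driven implementation would replace the walk with the recorded change set but keep the same selection logic. The function name is hypothetical.

```python
import os

def incremental_backup_set(root, last_backup_ts):
    # Walk a (mounted) volume and pick files modified since the previous
    # backup; the resulting list is what would be handed to the backup
    # service for an incremental run.
    to_backup = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                if os.stat(full).st_mtime > last_backup_ts:
                    to_backup.append(full)
            except FileNotFoundError:
                pass  # file vanished between listing and stat; skip it
    return sorted(to_backup)
```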
Re: [Gluster-devel] Proposal for Gluster Compliance Feature
Hi All,

We have the feature page for Gluster Compliance Archive on gluster.org:
http://www.gluster.org/community/documentation/index.php/Features/gluster_compliance_archive

Would like to thank Luis, Vivek, Jeff, Dan, Vijay and Kaleb for the valuable reviews/comments/discussions that helped in refining the solution. Looking forward to more reviews/comments/discussions from the community so that we can have a better solution for GlusterFS.

Regards,
~Joe

- Original Message -
From: Joseph Fernandes josfe...@redhat.com
To: Gluster Devel gluster-devel@gluster.org
Sent: Thursday, June 26, 2014 2:35:50 PM
Subject: Proposal for Gluster Compliance Feature

Hi All,

We are proposing a Compliance feature for Gluster. The idea is to provide the following:
1) Enable WORM/Retention filesystem capabilities on Gluster
2) Data Maintenance (Data Validation and Data Shredding) on WORM/Retained files
3) Volume-based Tiering for WORM/Retained files

The proposal is explained in the presentation attached to the mail. The notes of the presentation have details of the proposal. Looking forward to questions and feedback.

Thanks and Regards,
~Joe
Re: [Gluster-devel] Status of bit-rot detection
Sure :)

Thanks,
~Joe

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Joseph Fernandes josfe...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, Vivek Agarwal vagar...@redhat.com, Luis Pabón lpa...@redhat.com
Sent: Wednesday, July 23, 2014 6:02:17 PM
Subject: Re: [Gluster-devel] Status of bit-rot detection

On 2014-07-23 13:42, Joseph Fernandes wrote:

Hi Anders,

Currently we don't have an implementation/patch for bit-rot. We are working on the design of bit-rot protection (for read-only data) as part of Gluster Compliance.

Read-only data is nice for archival (which is why my backups have gone to CDs/DVDs for the last 15 years, with bit-rot detection by md5 sums).

Please refer to the Gluster Compliance proposal: http://supercolony.gluster.org/pipermail/gluster-devel/2014-June/041258.html If you have any design proposal/suggestion, please do share so that we can have a discussion on it.

I'm more interested in periodically (or triggered by writes) scanning and checksumming all/parts of the files on gluster volumes, and comparing those checksums between replicas (won't work for open files like databases/VM images). I guess I'll put my current tools onto each brick and whip up some scripts to compare those. When something materializes, I'm interested in testing.

Regards,
Joe

- Original Message -
From: Anders Blomdell anders.blomd...@control.lth.se
To: Gluster Devel gluster-devel@gluster.org
Sent: Monday, July 21, 2014 10:42:00 PM
Subject: [Gluster-devel] Status of bit-rot detection

Since switching to xfs has left me with a seemingly working system :-), what is the current status of bit-rot detection (http://www.gluster.org/community/documentation/index.php/Arch/BitRot_Detection)? Any patches for me to try?

/Anders

--
Anders Blomdell
Email: anders.blomd...@control.lth.se
Department of Automatic Control, Lund University
Phone: +46 46 222 4625
P.O. Box 118
Fax: +46 46 138118
SE-221 00 Lund, Sweden
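The periodic scan-and-compare Anders describes can be sketched as a pair of functions: checksum every file under a brick, then diff the manifests of two replicas. This is a hedged sketch, not the project's bit-rot design; it uses sha256 rather than md5, and the function names are hypothetical.

```python
import hashlib
import os

def brick_checksums(brick_root):
    # Map each file's path (relative to the brick root) to its sha256
    # digest, streaming in 1 MiB chunks so large files fit in memory.
    sums = {}
    for dirpath, _, files in os.walk(brick_root):
        for name in files:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            sums[os.path.relpath(full, brick_root)] = digest.hexdigest()
    return sums

def rot_suspects(brick_a, brick_b):
    # Files present on both replicas whose contents differ are bit-rot
    # (or pending self-heal) suspects. As noted in the thread, files
    # open for writing (databases/VM images) will flap.
    a, b = brick_checksums(brick_a), brick_checksums(brick_b)
    return sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
```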
Re: [Gluster-devel] spurious regression failures again!
Hi Avra,

Just clarifying things here:

1) When testing with the setup provided by Justin, the only place I found bug-1112559.t failing was after the failure of mgmt_v3-locks.t in the previous regression run. The mail attached to the previous mail was just an OBSERVATION, NOT an INFERENCE that the failure of mgmt_v3-locks.t was the root cause of bug-1112559.t. I am NOT jumping the gun and making any statement/conclusion here; it's just an observation. And thanks for the clarification on why mgmt_v3-locks.t is failing.

2) I agree with you that the cleanup script needs to kill all gluster* processes, and it's also true that the port range used by gluster for bricks is unique. But bug-1112559.t fails only because of the unavailability of a port to start the snap brick. This suggests that some process (gluster or non-gluster) is still using the port.

3) And finally, it is not true that bug-1112559.t always fails individually: looking into the links you provided, there are cases with previous other test-case failures on the same testing machine (slave26). By this I am not saying that those failures are the root cause of bug-1112559.t failing; as stated earlier, it is a notable observation (keeping in mind point 2 about ports and cleanup).

I have run nearly 30 runs on slave30 and bug-1112559.t failed only once (as stated in point 1). I am continuing to run more runs. The only problem is that the occurrence of the bug-1112559.t failure is spurious and there is no deterministic way of reproducing it. Will keep all posted about the results.

Regards,
Joe

- Original Message -
From: Avra Sengupta aseng...@redhat.com
To: Joseph Fernandes josfe...@redhat.com, Pranith Kumar Karampuri pkara...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, Varun Shastry vshas...@redhat.com, Justin Clift jus...@gluster.org
Sent: Wednesday, July 16, 2014 1:03:21 PM
Subject: Re: [Gluster-devel] spurious regression failures again!

Joseph,

I am not sure I understand how this is affecting the spurious failure of bug-1112559.t. As per the mail you have attached, and according to your analysis, bug-1112559.t fails because a cleanup hasn't happened properly after a previous test case failed, and in your case there was a crash as well. Now, out of all the times bug-1112559.t has failed, most of the time it's the only test case failing and there isn't any crash. Below are the regression runs that Pranith had sent for the same:

http://build.gluster.org/job/rackspace-regression-2GB/541/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/173/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/172/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB/543/console

In all of the above, bug-1112559.t is the only test case that fails, and there is no crash. So what I fail to understand here is: if this particular test case fails independently as well as with other test cases, how can we conclude that some other test case failing is somehow not doing a cleanup properly and that this is the reason for bug-1112559.t failing?

mgmt_v3-locks.t fails because glusterd takes more time to register a node going down, and hence the peer status doesn't return what the test case expects; it's a race. The test case ends with a cleanup routine like every other test case, which kills all gluster and glusterfsd processes that might be using any brick ports. So could you please explain how or which process still uses the brick ports that the snap bricks are trying to use, leading to the failure of bug-1112559.t?

Regards,
Avra

On 07/15/2014 09:57 PM, Joseph Fernandes wrote:

Just pointing out:

2) tests/basic/mgmt_v3-locks.t - Author: Avra
http://build.gluster.org/job/rackspace-regression-2GB-triggered/375/consoleFull

This is a similar kind of error to what I saw in my testing of the spurious failure tests/bugs/bug-1112559.t. Please refer to the attached mail.

Regards,
Joe

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Joseph Fernandes josfe...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, Varun Shastry vshas...@redhat.com
Sent: Tuesday, July 15, 2014 9:34:26 PM
Subject: Re: [Gluster-devel] spurious regression failures again!

On 07/15/2014 09:24 PM, Joseph Fernandes wrote:

Hi Pranith,
Could you please share the link of the console output of the failures.

Added them inline. Thanks for reminding :-)
Pranith

Regards,
Joe

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Gluster Devel gluster-devel@gluster.org, Varun Shastry vshas...@redhat.com
Sent: Tuesday, July 15, 2014 8:52:44 PM
Subject: [Gluster-devel] spurious regression failures again!

hi,
We have 4 tests failing once in a while causing problems:
1) tests/bugs/bug-1087198.t - Author: Varun
http://build.gluster.org/job/rackspace-regression
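The port-unavailability condition debated above (some process, gluster or not, still holding the brick port after cleanup) can be probed with a small bind check. This is a debugging sketch under assumptions: the helper names are hypothetical, and the port range to scan depends on the glusterd configuration.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    # Try to bind the port a brick would use; failure means some
    # process (gluster or non-gluster) still holds it, which is
    # exactly the condition that keeps the snap brick from starting.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

def occupied_ports(start, end):
    # Scan a brick-port range after the cleanup script has run; a
    # non-empty result names the ports a leftover process occupies.
    return [p for p in range(start, end + 1) if not port_is_free(p)]
```

Run between test cases, a scan like this would distinguish "cleanup missed a gluster process" from "an unrelated process grabbed the port".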
Re: [Gluster-devel] spurious regression failures again!
Hi Pranith,

Could you please share the link of the console output of the failures?

Regards,
Joe

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Gluster Devel gluster-devel@gluster.org, Varun Shastry vshas...@redhat.com
Sent: Tuesday, July 15, 2014 8:52:44 PM
Subject: [Gluster-devel] spurious regression failures again!

hi,
We have 4 tests failing once in a while causing problems:
1) tests/bugs/bug-1087198.t - Author: Varun
2) tests/basic/mgmt_v3-locks.t - Author: Avra
3) tests/basic/fops-sanity.t - Author: Pranith
Please take a look at them and post updates.

Pranith
Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
Hi All,

Thanks Justin for the setup (slave30). I executed the whole regression suite on slave30 multiple times. Once there was a failure of ./tests/basic/mgmt_v3-locks.t with a core:
http://build.gluster.org/job/rackspace-regression-2GB-joe/12/console

Test Summary Report
---
./tests/basic/mgmt_v3-locks.t (Wstat: 0 Tests: 14 Failed: 3)
Failed tests: 11-13
Files=250, Tests=4897, 3968 wallclock secs ( 1.91 usr 1.41 sys + 330.06 cusr 457.27 csys = 790.65 CPU)
Result: FAIL
+ RET=1
++ ls -l /core.20215
++ wc -l

There is a glusterd crash. Log files and core files are available @ http://build.gluster.org/job/rackspace-regression-2GB-joe/12/console

And the very next regression test, bug-1112559.t, failed with the same port unavailability:
http://build.gluster.org/job/rackspace-regression-2GB-joe/13/console

After this I restarted slave30, executed the whole regression test again, and never hit this issue. Looks like the issue does not originate in bug-1112559.t. The failure in bug-1112559.t test 10 is the result of a previous failure or crash.

Regards,
Joe

- Original Message -
From: Justin Clift jus...@gluster.org
To: Vijay Bellur vbel...@redhat.com
Cc: Avra Sengupta aseng...@redhat.com, Joseph Fernandes josfe...@redhat.com, Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org
Sent: Thursday, July 10, 2014 8:26:49 PM
Subject: Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t

On 10/07/2014, at 12:44 PM, Vijay Bellur wrote:
snip
A lot of regression runs are failing because of this test unit. Given feature freeze is around the corner, shall we provide a +1 verified manually for those patchsets that fail this test?

Went through and did this manually, as Gluster Build System. Also got Joe set up so he can debug things on a Rackspace VM to find out what's wrong.

+ Justin

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.
My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
Hi All,

1) Tried reproducing the issue in a local setup by running the regression test multiple times in a for loop, but the issue never hit!
2) As Avra pointed out, the logs suggest that the port (49159) assigned by glusterd (host1) to the snap brick is already in use by some other process.
3) For the time being I can comment out the TEST that is failing, i.e., comment out the check of the snap brick's status, so the regression test doesn't block any check-ins.
4) If we can get the Rackspace system where the regression tests are actually run, we can reproduce and point out the root cause.

Regards,
~Joe

- Original Message -
From: Justin Clift jus...@gluster.org
To: Niels de Vos nde...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org
Sent: Thursday, July 10, 2014 6:25:16 PM
Subject: Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t

On 10/07/2014, at 1:41 PM, Niels de Vos wrote:
On Thu, Jul 10, 2014 at 05:14:08PM +0530, Vijay Bellur wrote:
snip
A lot of regression runs are failing because of this test unit. Given feature freeze is around the corner, shall we provide a +1 verified manually for those patchsets that fail this test?

I don't think that is easily possible. We also need to remove the -1 verified that the Gluster Build System sets. I'm not sure how we should be doing that. Maybe it's better to disable (parts of) the test case?

We can set results manually as the Gluster Build System by using the gerrit command from build.gluster.org. Looking at the failure here:
http://build.gluster.org/job/rackspace-regression-2GB-triggered/276/console

At the bottom, it shows this was the command run to communicate failure:

$ ssh bu...@review.gluster.org gerrit review --message ''\'' http://build.gluster.org/job/rackspace-regression-2GB-triggered/276/consoleFull : FAILED'\''' --project=glusterfs --verified=-1 --code-review=0 d8296086ddaf7ef4a4667f5cec413d64a56fd382

So, we run the same thing from the jenkins user on build.gluster.org, but change the result bits to +1 and SUCCESS, and a better message:

$ sudo su - jenkins
$ ssh bu...@review.gluster.org gerrit review --message ''\'' Ignoring previous spurious failure : SUCCESS'\''' --project=glusterfs --verified=+1 --code-review=0 d8296086ddaf7ef4a4667f5cec413d64a56fd382

Seems to work: http://review.gluster.org/#/c/8285/

+ Justin

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.
My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
Sent a patch that temporarily disables the failing TEST: http://review.gluster.org/#/c/8259/
Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
Hi Pranith,

I am looking into this issue. Will keep you posted on the progress by EOD.

Regards,
~Joe

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: josfe...@redhat.com
Cc: Gluster Devel gluster-devel@gluster.org, Rajesh Joseph rjos...@redhat.com, Sachin Pandit span...@redhat.com, aseng...@redhat.com
Sent: Monday, July 7, 2014 8:42:24 PM
Subject: Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t

On 07/07/2014 06:18 PM, Pranith Kumar Karampuri wrote:

Joseph,
Any updates on this? It failed 5 regressions today.
http://build.gluster.org/job/rackspace-regression-2GB/541/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/175/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/173/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/166/consoleFull
http://build.gluster.org/job/rackspace-regression-2GB-triggered/172/consoleFull

One more: http://build.gluster.org/job/rackspace-regression-2GB/543/console

Pranith

CC some more folks who work on snapshot.

Pranith

On 07/05/2014 11:19 AM, Pranith Kumar Karampuri wrote:

hi Joseph,
The test above failed on a documentation patch, so it has got to be a spurious failure. Check http://build.gluster.org/job/rackspace-regression-2GB-triggered/150/consoleFull for more information

Pranith