Re: [Gluster-devel] Adventures in building GlusterFS
I've updated Vijay's patch with the necessary changes. The smoke and related jobs have passed [1], regression is running currently [2], and netbsd-regression is yet to start.

[1]: http://build.gluster.org/job/smoke/14963/
[2]: http://build.gluster.org/job/rackspace-regression-2GB-triggered/6292

On Thu, Apr 2, 2015 at 10:14 AM, Kaushal M wrote:
> I should have caught this during the review. I've been on the lookout
> for this kind of breakage in all GlusterD changes, but this slipped
> through somehow. Apologies.
>
> I don't think we need to revert KP's change. I'll test Vijay's fix and
> do any further changes required to stop the failures, ASAP.
>
> ~kaushal
>
> On Thu, Apr 2, 2015 at 9:26 AM, Jeff Darcy wrote:
>>> Apologies for breaking the build. I am out of office. Please revert
>>> review #9492.
>>
>> I'm not sure that it was so much one patch breaking the build as
>> some sort of "in-flight collision" merge weirdness. In any case,
>> don't worry about it. Stuff happens. The important thing is to
>> get the build/regression pipeline going again, however we can.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
I should have caught this during the review. I've been on the lookout for this kind of breakage in all GlusterD changes, but this slipped through somehow. Apologies.

I don't think we need to revert KP's change. I'll test Vijay's fix and do any further changes required to stop the failures, ASAP.

~kaushal

On Thu, Apr 2, 2015 at 9:26 AM, Jeff Darcy wrote:
>> Apologies for breaking the build. I am out of office. Please revert
>> review #9492.
>
> I'm not sure that it was so much one patch breaking the build as
> some sort of "in-flight collision" merge weirdness. In any case,
> don't worry about it. Stuff happens. The important thing is to
> get the build/regression pipeline going again, however we can.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Multiple verify in gerrit
Hi

I am now convinced the solution to our multiple-regression problem is to introduce more "Gluster Build System" users: one for CentOS regression, another one for NetBSD regression (and one for each smoke test, as explained below).

I just tested it on http://review.gluster.org/10052, and here is what gerrit displays in the verified column:
- if neither verified=+1 nor verified=-1 is cast: nothing
- if there is at least one verified=+1 and no verified=-1: verified
- if there is at least one verified=-1: failed

Therefore if CentOS regression uses bu...@review.gluster.org to report results and NetBSD regression uses nb7bu...@review.gluster.org (the latter user should be created), we achieve this outcome:
- gerrit will display a change as verified if one regression reported it as verified and the other either also succeeded or failed to report
- gerrit will display a change as failed if one regression reported it as failed, regardless of what the other reported.

There is still one minor problem: if one regression does not report, or reports late, we can get the impression that a change is verified when it should not be, and its status can change later. But this is a minor issue compared to the current status.

Other ideas:
- smoke builds should also report as different gerrit users, so that a verified=+1 regression result does not override a verified=-1 smoke build result
- when we get a regression failure, we could cast the verified vote to gerrit and immediately schedule another regression run. That way we could automatically work around spurious failures without the need for a retrigger in Jenkins.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
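[Editor's note: the display rules observed above amount to a small aggregation function over all cast votes. A minimal sketch in C — the names and types here are illustrative assumptions, not Gerrit's actual implementation:]

```c
#include <stddef.h>

/* Possible verified-column states, matching the observations above. */
typedef enum { COL_NOTHING, COL_VERIFIED, COL_FAILED } verified_col;

/* votes: the verified votes cast by the various "Gluster Build System"
 * users, each +1 or -1; n: how many were cast. Any -1 wins; otherwise
 * any +1 shows "verified"; no votes at all shows nothing. */
static verified_col
verified_column(const int *votes, size_t n)
{
        size_t i;
        int plus = 0;

        for (i = 0; i < n; i++) {
                if (votes[i] < 0)
                        return COL_FAILED;      /* at least one verified=-1 */
                if (votes[i] > 0)
                        plus++;
        }
        return plus ? COL_VERIFIED : COL_NOTHING;
}
```

Note that a CentOS +1 with a missing NetBSD vote still displays "verified" under this rule, which is exactly the late-report caveat mentioned above.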
Re: [Gluster-devel] Adventures in building GlusterFS
> Apologies for breaking the build. I am out of office. Please revert
> review #9492.

I'm not sure that it was so much one patch breaking the build as some sort of "in-flight collision" merge weirdness. In any case, don't worry about it. Stuff happens. The important thing is to get the build/regression pipeline going again, however we can.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Adventures in building GlusterFS
Apologies for breaking the build. I am out of office. Please revert review #9492.

- Original Message -
> As many of you have undoubtedly noticed, we're now in a situation where
> *all* regression builds are failing, with something like this:
>
> -
> cc1: warnings being treated as errors
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: error: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
> /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
> -
>
> The reason is that -Werror was turned on earlier today. I'm not quite
> sure how or where, because the version of build.sh that I thought builds
> would use doesn't seem to have changed since September 8, but then
> there's a lot about this system I don't understand. Vijay (who I
> believe made the change) knows it better than I ever will. In any case,
> this started me on a little journey of exploration.
>
> I actually do builds a bit differently than Jenkins does, and as far as
> I know differently than any other developer. I do a complete RPM build
> for every change, precisely to catch problems in that pipeline, but that
> means some of the things I do might not be applicable or even safe when
> "make install" is issued directly instead.
>
> The first thing I had to do was add a couple of exceptions to -Werror
> because of warnings that have been with us for ages. Specifically:
>
> -Wno-error=cpp because multiple things generate warnings about
> _BSD_SOURCE and _SVID_SOURCE being deprecated
>
> -Wno-error=maybe-uninitialized because some of the qemu code is
> bad that way
>
> That got me to the point where I could see - and hopefully debug -
> today's issue. As far as I can tell, the types changed with this patch:
>
> http://review.gluster.org/#/c/9492/
> glusterd: group server-quorum related code together
>
> There's also a patch to fix the type mismatch that leads to the build
> error:
>
> http://review.gluster.org/#/c/10105/
> mgmt/glusterd: set right definition of does_gd_meet_server_quorum()
>
> Unfortunately, applying the latter patch to my tree didn't solve the
> problem. I got similar errors in another related set of functions,
> indicating that the type mismatch had just been pushed to a different
> level. However, by *reverting* the first patch, along with the flag
> changes mentioned above, I was able to get a successful build.
>
> My recommendations:
>
> (1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized
> changes wherever they need to be applied so that they're
> effective during normal regression builds
>
> (2) Revert patch 9492
>
> (3) Once regressions are running again, figure out how to make
> the necessary code changes so that (1) and (2) are no longer
> necessary
>
> I'm unable to do any of these things myself. Would anyone else like
> to do so, or suggest an alternative remedy?

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
Jeff Darcy wrote:
> > It does. The memory corruption disappeared and the test can complete.
>
> Interesting.

FWIW I made 147 successful runs in a row with the patch, while it always failed without.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
> > I found one issue that local is not allocated using GF_CALLOC and with a
> > mem-type.
> > This is a patch which *might* fix it.
>
> It does. The memory corruption disappeared and the test can complete.

Interesting. I suspect this means that we *are* in the case where the previous comment came from. Mem_get can allocate objects two ways:

* As one of many objects in a slab, tracked internally.
* As a singleton, directly via GF_*ALLOC.

In mem_put, we do some pretty nasty pointer arithmetic to figure out which way an object was allocated. If we get it wrong, and therefore use the wrong *de*allocate method (either way, I believe), then we'll corrupt memory. The symptoms so far suggest that an object was allocated within a slab, then deallocated as a singleton (causing its memory to be poisoned). That sucks.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
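[Editor's note: a rough sketch of the failure mode being described. Suppose the allocator records, in a hidden header in front of each object, whether it came from the slab or from a standalone allocation, and the deallocator trusts what it recovers from that header. The layout and names below are illustrative assumptions, not GlusterFS's actual mem-pool code:]

```c
#include <stdlib.h>

/* Hidden header written in front of every object handed out.
 * from_slab is what mem_put's "nasty pointer arithmetic" has to
 * recover; if it recovers the wrong answer, the wrong deallocation
 * path runs and memory is corrupted. */
struct hdr {
        int from_slab;          /* 1 = slab member, 0 = singleton */
};

static void *
give_object(size_t size, int from_slab)
{
        struct hdr *h = calloc(1, sizeof(*h) + size);
        if (h == NULL)
                return NULL;
        h->from_slab = from_slab;
        return h + 1;           /* caller only ever sees the payload */
}

static void
take_object(void *ptr)
{
        /* Step back from the payload to the header -- this is where a
         * misclassified object sends us down the wrong branch. */
        struct hdr *h = (struct hdr *)ptr - 1;

        if (h->from_slab) {
                /* slab path: would push the chunk back onto the slab's
                 * free list (collapsed to free() in this sketch) */
                free(h);
        } else {
                /* singleton path: plain GF_FREE-style release, which in
                 * the real code also poisons the memory */
                free(h);
        }
}
```

The corruption reported above corresponds to an object handed out via the slab path but released via the singleton path.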
Re: [Gluster-devel] crypt xlator bug
Raghavendra Talur wrote:
> I found one issue that local is not allocated using GF_CALLOC and with a
> mem-type.
> This is a patch which *might* fix it.

It does. The memory corruption disappeared and the test can complete.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
Justin Clift wrote:
> > What about adding another nb7build user in gerrit? That way results will
> > not conflict.
>
> I'm not sure.

What makes you doubt? When multiple users cast votes, don't they override each other?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Adventures in building GlusterFS
As many of you have undoubtedly noticed, we're now in a situation where *all* regression builds are failing, with something like this:

-
cc1: warnings being treated as errors
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: error: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
-

The reason is that -Werror was turned on earlier today. I'm not quite sure how or where, because the version of build.sh that I thought builds would use doesn't seem to have changed since September 8, but then there's a lot about this system I don't understand. Vijay (who I believe made the change) knows it better than I ever will. In any case, this started me on a little journey of exploration.

I actually do builds a bit differently than Jenkins does, and as far as I know differently than any other developer. I do a complete RPM build for every change, precisely to catch problems in that pipeline, but that means some of the things I do might not be applicable or even safe when "make install" is issued directly instead.

The first thing I had to do was add a couple of exceptions to -Werror because of warnings that have been with us for ages. Specifically:

-Wno-error=cpp because multiple things generate warnings about _BSD_SOURCE and _SVID_SOURCE being deprecated

-Wno-error=maybe-uninitialized because some of the qemu code is bad that way

That got me to the point where I could see - and hopefully debug - today's issue. As far as I can tell, the types changed with this patch:

http://review.gluster.org/#/c/9492/
glusterd: group server-quorum related code together

There's also a patch to fix the type mismatch that leads to the build error:

http://review.gluster.org/#/c/10105/
mgmt/glusterd: set right definition of does_gd_meet_server_quorum()

Unfortunately, applying the latter patch to my tree didn't solve the problem. I got similar errors in another related set of functions, indicating that the type mismatch had just been pushed to a different level. However, by *reverting* the first patch, along with the flag changes mentioned above, I was able to get a successful build.

My recommendations:

(1) Apply the -Wno-error=cpp and -Wno-error=maybe-uninitialized changes wherever they need to be applied so that they're effective during normal regression builds

(2) Revert patch 9492

(3) Once regressions are running again, figure out how to make the necessary code changes so that (1) and (2) are no longer necessary

I'm unable to do any of these things myself. Would anyone else like to do so, or suggest an alternative remedy?

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
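[Editor's note: the error itself is mechanical — after patch 9492, callers pass urcu-style `struct cds_list_head *` into prototypes still declared with the kernel-style `struct list_head *`, and GCC under -Werror rejects the incompatible pointer type. A stripped-down illustration with stand-in types, not the real glusterd signatures:]

```c
/* Two structurally identical but *distinct* list-head types, standing
 * in for the kernel-style and userspace-RCU-style linked lists. */
struct list_head     { struct list_head *next,     *prev; };
struct cds_list_head { struct cds_list_head *next, *prev; };

/* If this parameter were declared as struct list_head * while every
 * caller now holds a struct cds_list_head *, GCC emits the
 * "incompatible pointer type" diagnostic seen above; the fix (the
 * approach of review 10105) is to update the declaration to the type
 * the callers actually use, as done here. */
static int
count_entries(struct cds_list_head *head)
{
        int n = 0;
        struct cds_list_head *pos;

        for (pos = head->next; pos != head; pos = pos->next)
                n++;
        return n;
}
```

The follow-on failures described above are what happens when only one layer of declarations is converted: the mismatch just moves to the next caller in the chain.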
Re: [Gluster-devel] crypt xlator bug
> As I understand it, mem_get0 is a valid (and even more
> efficient) way to allocate such objects. The frame cleanup
> code should recognize which method to use when deallocating.
> If that's broken, we're going to have more numerous and
> serious problems than this. I'll look into it further.

I don't see anything obviously wrong, but I did find this gem of a comment in mem_get:

 * I am working around this by performing a regular allocation,
 * just the way the caller would've done when not using the
 * mem-pool. That also means, we're not padding the size with
 * the list_head structure because, this will not be added to
 * the list of chunks that belong to the mem-pool allocated
 * initially.
 *
 * This is the best we can do without adding functionality for
 * managing multiple slabs. That does not interest us at present
 * because it is too much work knowing that a better slab
 * allocator is coming RSN.

Now I'm curious to find out what effect your change will have, but I suspect we'll still be a while figuring this out.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
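[Editor's note: what that comment describes, roughly, is a fallback path — when the pool's preallocated chunks run out, mem_get performs a plain allocation that is not padded with the pool's list_head and never joins the pool's chunk list, which is what makes the later slab-vs-singleton decision in mem_put delicate. A hypothetical sketch; field and function names are illustrative, not GlusterFS's actual code:]

```c
#include <stdlib.h>

struct pool_sketch {
        void  *free_list;   /* head of preallocated chunks; NULL when exhausted */
        size_t obj_size;    /* payload size handed to callers */
};

static void *
mem_get_sketch(struct pool_sketch *pool)
{
        void *chunk;

        if (pool->free_list != NULL) {
                /* normal path: pop a preallocated chunk off the free list
                 * (each free chunk stores the pointer to the next one) */
                chunk = pool->free_list;
                pool->free_list = *(void **)chunk;
                return chunk;
        }

        /* The fallback the quoted comment describes: a regular
         * allocation, with no list_head padding, that will never be
         * linked back into the pool's chunk list. */
        return calloc(1, pool->obj_size);
}
```

An object returned by the fallback branch must later be released with a plain free, while a pooled chunk must go back on the free list; confusing the two is the corruption discussed in this thread.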
Re: [Gluster-devel] crypt xlator bug
> - local = mem_get0(this->local_pool);
> + local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);

As I understand it, mem_get0 is a valid (and even more efficient) way to allocate such objects. The frame cleanup code should recognize which method to use when deallocating. If that's broken, we're going to have more numerous and serious problems than this. I'll look into it further.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:22, Vijay Bellur wrote:
> On 04/02/2015 12:46 AM, Justin Clift wrote:
>> On 1 Apr 2015, at 20:09, Vijay Bellur wrote:
>>
>>> My sanity run got blown due to this as I use -Wall -Werror during
>>> compilation.
>>>
>>> Submitted http://review.gluster.org/10105 to correct this.
>>
>> Should we add -Wall -Werror to the compile options for our CentOS 6.x
>> regression runs?
>
> I would prefer doing that for CentOS 6.x at least.

k, that's been done. All of the regression tests currently queued up for master and release-3.6 are probably going to self destruct now though. (just thought of that. oops)

+ Justin

-- 
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 04/02/2015 12:46 AM, Justin Clift wrote: On 1 Apr 2015, at 20:09, Vijay Bellur wrote: My sanity run got blown due to this as I use -Wall -Werror during compilation. Submitted http://review.gluster.org/10105 to correct this. Should we add -Wall -Werror to the compile options for our CentOS 6.x regression runs? I would prefer doing that for CentOS 6.x at least. -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 20:09, Vijay Bellur wrote:
> My sanity run got blown due to this as I use -Wall -Werror during compilation.
>
> Submitted http://review.gluster.org/10105 to correct this.

Should we add -Wall -Werror to the compile options for our CentOS 6.x regression runs?

+ Justin

-- 
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 04/02/2015 12:28 AM, Justin Clift wrote:

On 1 Apr 2015, at 19:51, Shyam wrote:

On 04/01/2015 02:47 PM, Jeff Darcy wrote:

When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM. (yeah, I'm reusing VM names)

http://build.gluster.org/job/regression-test-burn-in/16/console

Does anyone have time to check the coredump, and see if this is the bug we already know about?

This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption.

This looks like the other core I saw yesterday, which was not the usual connection cleanup stuff. Adding this info here, as this brings this core count up to 2.

One here, and the other in core.16937 : http://ded.ninja/gluster/blk0/

Oh, I just noticed there's a bunch of compile warnings at the top of the regression run:

libtool: install: warning: relinking `server.la'
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2788: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2803: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c: In function ‘glusterd_get_quorum_cluster_counts’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:230: warning: comparison of distinct pointer types lacks a cast
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:236: warning: comparison of distinct pointer types lacks a cast
libtool: install: warning: relinking `glusterd.la'
libtool: install: warning: relinking `posix-acl.la'

My sanity run got blown due to this as I use -Wall -Werror during compilation.

Submitted http://review.gluster.org/10105 to correct this.

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 17:38, Emmanuel Dreyfus wrote:
> Justin Clift wrote:
>
>> We need some kind of solution.
>
> What about adding another nb7build user in gerrit? That way results will
> not conflict.

I'm not sure. However, Vijay's now added me as an admin in our production Gerrit instance, and I have the process for restoring our backups in a local VM (on my desktop) worked out now. So... I can test this tomorrow morning and try it out. Then we'll know for sure. :)

+ Justin

-- 
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
I believe this problem may be due to the old Gerrit version: I run Gerrit 2.10.2 and Jenkins 1.596.1, with build tests that take up to 96 hours in CodeSonar and return the verified +1 or -1 vote to Gerrit.

firemanxbr

On Wed, Apr 1, 2015 at 1:38 PM, Emmanuel Dreyfus wrote:
> Justin Clift wrote:
>
> > We need some kind of solution.
>
> What about adding another nb7build user in gerrit? That way results will
> not conflict.
>
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 1 Apr 2015, at 19:51, Shyam wrote:
> On 04/01/2015 02:47 PM, Jeff Darcy wrote:
>>> When doing an initial burn in test (regression run on master head
>>> of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM.
>>> (yeah, I'm reusing VM names)
>>>
>>> http://build.gluster.org/job/regression-test-burn-in/16/console
>>>
>>> Does anyone have time to check the coredump, and see if this is
>>> the bug we already know about?
>>
>> This is *not* the same as others I've seen. There are no threads in the
>> usual connection-cleanup/list_del code. Rather, it looks like some are
>> in generic malloc code, possibly indicating some sort of arena corruption.
>
> This looks like the other core I saw yesterday, which was not the usual
> connection cleanup stuff. Adding this info here, as this brings this core
> count up to 2.
>
> One here, and the other in core.16937 : http://ded.ninja/gluster/blk0/

Oh, I just noticed there's a bunch of compile warnings at the top of the regression run:

libtool: install: warning: relinking `server.la'
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check_for_create’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2615: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c: In function ‘glusterd_snap_quorum_check’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2788: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:2803: warning: passing argument 2 of ‘does_gd_meet_server_quorum’ from incompatible pointer type
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.h:56: note: expected ‘struct list_head *’ but argument is of type ‘struct cds_list_head *’
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c: In function ‘glusterd_get_quorum_cluster_counts’:
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:230: warning: comparison of distinct pointer types lacks a cast
/home/jenkins/root/workspace/regression-test-burn-in/xlators/mgmt/glusterd/src/glusterd-server-quorum.c:236: warning: comparison of distinct pointer types lacks a cast
libtool: install: warning: relinking `glusterd.la'
libtool: install: warning: relinking `posix-acl.la'

Related / smoking-gun?

+ Justin

-- 
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift wrote:
> On 1 Apr 2015, at 10:57, Emmanuel Dreyfus wrote:
> > Hi
> >
> > crypt.t was recently broken in NetBSD regression. The glusterfs returns
> > a node with file type invalid to FUSE, and that breaks the test.
> >
> > After running a git bisect, I found the offending commit after which
> > this behavior appeared:
> >     8a2e2b88fc21dc7879f838d18cd0413dd88023b7
> >     mem-pool: invalidate memory on GF_FREE to aid debugging
> >
> > This means the bug has always been there, but this debugging aid
> > caused it to be reliable.
>
> Sounds like that commit is a good win then. :)
>
> Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
> any ideas? :)

I found one issue: local is not allocated using GF_CALLOC and with a mem-type. This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h b/xlators/encryption/crypt/src/crypt-mem-types.h
index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
         gf_crypt_mt_key,
         gf_crypt_mt_iovec,
         gf_crypt_mt_char,
+        gf_crypt_mt_local,
         gf_crypt_mt_end,
 };

diff --git a/xlators/encryption/crypt/src/crypt.c b/xlators/encryption/crypt/src/crypt.c
index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t *crypt_alloc_local(call_frame_t *frame, xlator_t *this,
 {
         crypt_local_t *local = NULL;
 
-        local = mem_get0(this->local_pool);
+        local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
         if (!local) {
                 gf_log(this->name, GF_LOG_ERROR, "out of memory");
                 return NULL;

Niels should be able to recognize if this is a sufficient fix or not.

Thanks,
Raghavendra Talur

> + Justin
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift

-- 
Raghavendra Talur

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
On 04/01/2015 02:47 PM, Jeff Darcy wrote:

When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM. (yeah, I'm reusing VM names)

http://build.gluster.org/job/regression-test-burn-in/16/console

Does anyone have time to check the coredump, and see if this is the bug we already know about?

This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption.

This looks like the other core I saw yesterday, which was not the usual connection cleanup stuff. Adding this info here, as this brings this core count up to 2.

One here, and the other in core.16937 : http://ded.ninja/gluster/blk0/

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Coredump in master :/
> When doing an initial burn in test (regression run on master head > of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM. > (yeah, I'm reusing VM names) > > http://build.gluster.org/job/regression-test-burn-in/16/console > > Does anyone have time to check the coredump, and see if this is > the bug we already know about? This is *not* the same as others I've seen. There are no threads in the usual connection-cleanup/list_del code. Rather, it looks like some are in generic malloc code, possibly indicating some sort of arena corruption. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Coredump in master :/
Hi us, Adding some more CentOS 6.x regression testing VM's at the moment, to cope with the current load. When doing an initial burn in test (regression run on master head of GlusterFS git), it coredumped on the new "slave23.cloud.gluster.org" VM. (yeah, I'm reusing VM names) http://build.gluster.org/job/regression-test-burn-in/16/console Does anyone have time to check the coredump, and see if this is the bug we already know about? Regards and best wishes, Justin Clift -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 2015-04-01 at 15:48 +0200, Niels de Vos wrote:
> On Wed, Apr 01, 2015 at 06:59:15PM +0530, Vijay Bellur wrote:
> > On 04/01/2015 05:44 PM, Tom Callaway wrote:
> > >Hello Gluster Ant People!
> > >
> > >Right now, if you go to gluster.org, you see our current slogan in giant
> > >text:
> > >
> > >Write once, read everywhere
> > >
> > >However, no one seems to be super-excited about that slogan. It doesn't
> > >really help differentiate gluster from a portable hard drive or a
> > >paperback book. I am going to work with Red Hat's branding geniuses to
> > >come up with some possibilities, but sometimes, the best ideas come from
> > >the people directly involved with a project.
> > >
> > >What I am saying is that if you have a slogan idea for Gluster, I want
> > >to hear it. You can reply on list or send it to me directly. I will
> > >collect all the proposals (yours and the ones that Red Hat comes up
> > >with) and circle back around for community discussion in about a month
> > >or so.
> > >
> >
> > I also think that we should start calling ourselves Gluster or GlusterDS
> > (Gluster Distributed Storage) instead of GlusterFS by default. We are
> > certainly not file storage only, we have object, api & block interfaces too
> > and the FS in GlusterFS seems to imply a file storage connotation alone.
>
> My preference goes to Gluster, to capture the whole community.

I also would prefer Gluster over GlusterDS.

Regarding the slogan, I am thinking about something that incorporates SOS - [S]cale [O]ut [S]torage ...

Cheers - Michael

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Extra overnight regression test run results
On 1 Apr 2015, at 03:48, Justin Clift wrote: > On 31 Mar 2015, at 14:18, Shyam wrote: > >>> Also, most of the regression runs produced cores. Here are >>> the first two: >>> >>> http://ded.ninja/gluster/blk0/ >> >> There are 4 cores here, 3 pointing to the (by now hopefully) famous bug >> #1195415. One of the cores exhibit a different stack etc. Need more analysis >> to see what the issue could be here, core file: core.16937 >> >>> http://ded.ninja/gluster/blk1/ >> >> There is a single core here, pointing to the above bug again. > > Both the blk0 and blk1 VM's are still online and available, > if that's helpful? > > If not, please let me know and I'll nuke them. :) I'm ok to nuke both those VM's, yeah? + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 1 Apr 2015, at 18:19, Kaleb S. KEITHLEY wrote: > On 04/01/2015 12:24 PM, Ravishankar N wrote: >> >> I found it easier to draw a picture of what I had in mind, especially >> the arrow mark thingy. You can view it here: >> https://github.com/itisravi/image/blob/master/gluster.jpg >> So what the image is trying to convey (hopefully) is "Gluster: Software >> Defined Storage. Redefined" > > That's clever. I like it. Yeah, works for me too. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 12:24 PM, Ravishankar N wrote: I found it easier to draw a picture of what I had in mind, especially the arrow mark thingy. You can view it here: https://github.com/itisravi/image/blob/master/gluster.jpg So what the image is trying to convey (hopefully) is "Gluster: Software Defined Storage. Redefined" That's clever. I like it. -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] crypt xlator bug
On 1 Apr 2015, at 10:57, Emmanuel Dreyfus wrote: > Hi > > crypt.t was recently broken in NetBSD regression. The glusterfs returns > a node with file type invalid to FUSE, and that breaks the test. > > After running a git bisect, I found the offending commit after which > this behavior appeared: >8a2e2b88fc21dc7879f838d18cd0413dd88023b7 >mem-pool: invalidate memory on GF_FREE to aid debugging > > This means the bug has always been there, but this debugging aid > caused it to be reliable. Sounds like that commit is a good win then. :) Harsha/Pranith/Lala, your names are on the git blame for crypt.c... any ideas? :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
Delivering your data in your way 2015-04-01 18:02 GMT+03:00 Justin Clift : > On 1 Apr 2015, at 13:14, Tom Callaway wrote: > > Hello Gluster Ant People! > > > > Right now, if you go to gluster.org, you see our current slogan in > giant text: > > > > Write once, read everywhere > > > > However, no one seems to be super-excited about that slogan. It doesn't > really help differentiate gluster from a portable hard drive or a paperback > book. I am going to work with Red Hat's branding geniuses to come up with > some possibilities, but sometimes, the best ideas come from the people > directly involved with a project. > > > > What I am saying is that if you have a slogan idea for Gluster, I want > to hear it. You can reply on list or send it to me directly. I will collect > all the proposals (yours and the ones that Red Hat comes up with) and > circle back around for community discussion in about a month or so. > > Gluster: "Scale out your data. Safely. :)" > > -- > GlusterFS - http://www.gluster.org > > An open source, distributed file system scaling to several > petabytes, and handling thousands of clients. > > My personal twitter: twitter.com/realjustinclift > > ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users > -- Best regards, Roman. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
Justin Clift wrote: > We need some kind of solution. What about adding another nb7build user in gerrit? That way results will not conflict. -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz m...@netbsd.org ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
Yep. We use the Gerrit Trigger plugin on Jenkins; it triggers on all requests for all repositories and all branches, runs the Auto QA tests, and votes with the "verified" ACL. For example: https://gerrit.ovirt.org/#/c/37886/ My intention is to put together a complete example deployment of this. Coming soon :D firemanxbr On Wed, Apr 1, 2015 at 4:32 AM, Emmanuel Dreyfus wrote: > On Wed, Apr 01, 2015 at 12:50:30PM +0530, Vijay Bellur wrote: > > I think all we need is to create another gerrit user like "NetBSD > > Regression" or so and have verified votes routed through this user. Just > as > > multiple users can provide distinct CR votes, we can have multiple build > > systems provide distinct Verified votes. > > Yes, please do that: a nb7build user like we have a build user. It can have > the same .ssh/authorized_keys as the build user. > > -- > Emmanuel Dreyfus > m...@netbsd.org > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 1 Apr 2015, at 17:20, Marcelo Barbosa wrote: > yep, using on Jenkins the plugin Gerrit Trigger, this plugin trigger all > requests for all repositories and all branches, this function running with > Auto QA tests and vote with ACL "verified", for example: > >https://gerrit.ovirt.org/#/c/37886/ > > my intention is make one deploy with one complete example the use this, > comming soon :D We already use the Gerrit Trigger plugin extensively. :) Now we need to get more advanced with it... In our existing setup, we have a bunch of fast tests that run automatically (triggered), and which vote. We use this for initial smoke testing to locate (fail on) obvious problems quickly. We also have a much more in-depth CentOS 6.x regression test (~2 hours run time) that's triggered. It doesn't vote using the Gerrit trigger method. Instead it calls back via ssh to Gerrit, indicating SUCCESS or FAILURE. That status goes into a column in the Gerrit CR, to show if it passed the regression test or not. Now... we have a NetBSD 7.x regression test (~2 hour run time) that's also triggered. We need to find a way for this to communicate back to Gerrit. We've tried using the same method used for our CentOS 6.x communication back, but the status results for the NetBSD tests conflict with the CentOS status results. We need some kind of solution. Hoping you have ideas? :) We're ok with pretty much anything that works and can be set up ASAP. Like today. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 05:44 PM, Tom Callaway wrote: What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. I found it easier to draw a picture of what I had in mind, especially the arrow mark thingy. You can view it here: https://github.com/itisravi/image/blob/master/gluster.jpg So what the image is trying to convey (hopefully) is "Gluster: Software Defined Storage. Redefined" -Ravi N.B. Please excuse my bad handwriting in the image :) ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 08:01 AM, Jeff Darcy wrote: > GlusterFS: your data, your way Has a nice ring to it. -- Glenn Holmer (Linux registered user #16682) "After the vintage season came the aftermath -- and Cenbe." ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] NetBSD regression broken again
Hi It did not take long: some commit broke NetBSD regression again, less than 24 hours after we fixed it: quota-anon-fd-nfs.t now always fails. I cannot pour infinite time into keeping this in good shape. We need to take NetBSD regression into account and avoid merging patches that break it. Alternatively, you can decide portability is not desirable and we get rid of it, but the current situation is not sustainable. If we want to keep it: can anyone look at what is wrong with quota-anon-fd-nfs.t? -- Emmanuel Dreyfus m...@netbsd.org ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Got a slogan idea?
On 1 Apr 2015, at 13:14, Tom Callaway wrote: > Hello Gluster Ant People! > > Right now, if you go to gluster.org, you see our current slogan in giant text: > > Write once, read everywhere > > However, no one seems to be super-excited about that slogan. It doesn't > really help differentiate gluster from a portable hard drive or a paperback > book. I am going to work with Red Hat's branding geniuses to come up with > some possibilities, but sometimes, the best ideas come from the people > directly involved with a project. > > What I am saying is that if you have a slogan idea for Gluster, I want to > hear it. You can reply on list or send it to me directly. I will collect all > the proposals (yours and the ones that Red Hat comes up with) and circle back > around for community discussion in about a month or so. Gluster: "Scale out your data. Safely. :)" -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 06:33 PM, James Cuff wrote: GlusterFS: It's not DevNull(tm) https://rc.fas.harvard.edu/news-home/feature-stories/fas-research-computing-implements-novel-big-data-storage-system/ April 1 seems like a good day to join the DevNull (tm) development team! Best, j. -- dr. james cuff, assistant dean for research computing, harvard university | division of science | thirty eight oxford street, cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu On Wed, Apr 1, 2015 at 9:01 AM, Jeff Darcy wrote: What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. Personally I don't like any of these all that much, but maybe they'll get someone else thinking. GlusterFS: your data, your way GlusterFS: any data, any servers, any protocol GlusterFS: scale-out storage for everyone GlusterFS: software defined storage for everyone GlusterFS: the Swiss Army Knife of storage ___ Gluster-users mailing list gluster-us...@gluster.org http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
GlusterFS: It's not DevNull(tm) https://rc.fas.harvard.edu/news-home/feature-stories/fas-research-computing-implements-novel-big-data-storage-system/ Best, j. -- dr. james cuff, assistant dean for research computing, harvard university | division of science | thirty eight oxford street, cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu On Wed, Apr 1, 2015 at 9:01 AM, Jeff Darcy wrote: >> What I am saying is that if you have a slogan idea for Gluster, I want >> to hear it. You can reply on list or send it to me directly. I will >> collect all the proposals (yours and the ones that Red Hat comes up >> with) and circle back around for community discussion in about a month >> or so. > > Personally I don't like any of these all that much, but maybe they'll > get someone else thinking. > > GlusterFS: your data, your way > > GlusterFS: any data, any servers, any protocol > > GlusterFS: scale-out storage for everyone > > GlusterFS: software defined storage for everyone > > GlusterFS: the Swiss Army Knife of storage > ___ > Gluster-users mailing list > gluster-us...@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 06:59 PM, Vijay Bellur wrote: On 04/01/2015 05:44 PM, Tom Callaway wrote: Hello Gluster Ant People! Right now, if you go to gluster.org, you see our current slogan in giant text: Write once, read everywhere However, no one seems to be super-excited about that slogan. It doesn't really help differentiate gluster from a portable hard drive or a paperback book. I am going to work with Red Hat's branding geniuses to come up with some possibilities, but sometimes, the best ideas come from the people directly involved with a project. What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. I also think that we should start calling ourselves Gluster or GlusterDS (Gluster Distributed Storage) instead of GlusterFS by default. We are certainly not file storage only, we have object, api & block interfaces too and the FS in GlusterFS seems to imply a file storage connotation alone. -Vijay +1 for Gluster. I like GlusterDS too (in the present context). But Gluster is simpler, more generic, and not oriented towards any particular type of storage, which will be better for the long term IMO (in case Gluster becomes a clustered storage in the distant future). -Lala ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On Wed, Apr 01, 2015 at 06:59:15PM +0530, Vijay Bellur wrote: > On 04/01/2015 05:44 PM, Tom Callaway wrote: > >Hello Gluster Ant People! > > > >Right now, if you go to gluster.org, you see our current slogan in giant > >text: > > > >Write once, read everywhere > > > >However, no one seems to be super-excited about that slogan. It doesn't > >really help differentiate gluster from a portable hard drive or a > >paperback book. I am going to work with Red Hat's branding geniuses to > >come up with some possibilities, but sometimes, the best ideas come from > >the people directly involved with a project. > > > >What I am saying is that if you have a slogan idea for Gluster, I want > >to hear it. You can reply on list or send it to me directly. I will > >collect all the proposals (yours and the ones that Red Hat comes up > >with) and circle back around for community discussion in about a month > >or so. > > > > I also think that we should start calling ourselves Gluster or GlusterDS > (Gluster Distributed Storage) instead of GlusterFS by default. We are > certainly not file storage only, we have object, api & block interfaces too > and the FS in GlusterFS seems to imply a file storage connotation alone. My preference goes to Gluster, to capture the whole community. A rename of the main project (GlusterFS) might be in order, for that GlusterDS sounds suitable to me. Thanks, Niels pgpgQHHN6y0jp.pgp Description: PGP signature ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Got a slogan idea?
GlusterFS: Simple Scale-out Storage GlusterFS: Simplest Scale-out Storage GlusterFS: Storage Made For Scale GlusterFS: Storage For Scale On Wed, Apr 1, 2015 at 6:38 PM, Kaleb S. KEITHLEY wrote: > On 04/01/2015 09:01 AM, Jeff Darcy wrote: > >> What I am saying is that if you have a slogan idea for Gluster, I want >>> to hear it. You can reply on list or send it to me directly. I will >>> collect all the proposals (yours and the ones that Red Hat comes up >>> with) and circle back around for community discussion in about a month >>> or so. >>> >> >> Personally I don't like any of these all that much, but maybe they'll >> get someone else thinking. >> >> GlusterFS: your data, your way >> >> GlusterFS: any data, any servers, any protocol >> >> GlusterFS: scale-out storage for everyone >> >> GlusterFS: software defined storage for everyone >> >> GlusterFS: the Swiss Army Knife of storage >> > > > GlusterFS: Storage Made Simple > > or > > GlusterFS: Scale-out Storage Made Simple > > -- > > Kaleb > > ___ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel > ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Got a slogan idea?
On 04/01/2015 05:44 PM, Tom Callaway wrote: Hello Gluster Ant People! Right now, if you go to gluster.org, you see our current slogan in giant text: Write once, read everywhere However, no one seems to be super-excited about that slogan. It doesn't really help differentiate gluster from a portable hard drive or a paperback book. I am going to work with Red Hat's branding geniuses to come up with some possibilities, but sometimes, the best ideas come from the people directly involved with a project. What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. I also think that we should start calling ourselves Gluster or GlusterDS (Gluster Distributed Storage) instead of GlusterFS by default. We are certainly not file storage only, we have object, api & block interfaces too and the FS in GlusterFS seems to imply a file storage connotation alone. -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Got a slogan idea?
On 04/01/2015 09:01 AM, Jeff Darcy wrote: What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. Personally I don't like any of these all that much, but maybe they'll get someone else thinking. GlusterFS: your data, your way GlusterFS: any data, any servers, any protocol GlusterFS: scale-out storage for everyone GlusterFS: software defined storage for everyone GlusterFS: the Swiss Army Knife of storage GlusterFS: Storage Made Simple or GlusterFS: Scale-out Storage Made Simple -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Got a slogan idea?
> What I am saying is that if you have a slogan idea for Gluster, I want > to hear it. You can reply on list or send it to me directly. I will > collect all the proposals (yours and the ones that Red Hat comes up > with) and circle back around for community discussion in about a month > or so. Personally I don't like any of these all that much, but maybe they'll get someone else thinking. GlusterFS: your data, your way GlusterFS: any data, any servers, any protocol GlusterFS: scale-out storage for everyone GlusterFS: software defined storage for everyone GlusterFS: the Swiss Army Knife of storage ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Got a slogan idea?
Hello Gluster Ant People! Right now, if you go to gluster.org, you see our current slogan in giant text: Write once, read everywhere However, no one seems to be super-excited about that slogan. It doesn't really help differentiate gluster from a portable hard drive or a paperback book. I am going to work with Red Hat's branding geniuses to come up with some possibilities, but sometimes, the best ideas come from the people directly involved with a project. What I am saying is that if you have a slogan idea for Gluster, I want to hear it. You can reply on list or send it to me directly. I will collect all the proposals (yours and the ones that Red Hat comes up with) and circle back around for community discussion in about a month or so. Thanks! ~tom == Red Hat ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Review request for patch: libglusterfs/syncop: Add xdata to all syncop calls
> Is it always ok to consider xdata to be valid, even if op_ret < 0? I would say yes, simply because it's useful. For example, xdata might contain extra information explaining the error, or suggesting a next step. In NSR, when a non-coordinator receives a request it returns EREMOTE. It would be handy for it to return something in xdata to identify which brick *is* the coordinator, so the client doesn't have to guess. It might also be useful to return a transaction ID, even on error, so that a subsequent retry can be detected as such. If xdata is discarded on error, then both the "regardless of error" and "only on error" use cases aren't satisfied. > If yes, I will have to update the syncop_*_cbk calls to ref xdata > if they exist, irrespective of op_ret. > > Also, it can be used in really cool ways, like we can have a > key called glusterfs.error_origin_xlator set to this->name of the xlator > where error originated and master xlators (fuse and gfapi) can log / make > use of it etc. Great minds think alike. ;) There might even be cases where it would be useful to capture tracebacks or statistics to send back with an error. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
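Jeff's suggestion above amounts to dropping the op_ret guard around the dict_ref(), the same shape the call-stub code already uses. A minimal sketch with simplified stand-in types (illustrative only; the real dict_t, syncargs, and callback signatures live in libglusterfs):

```c
#include <stddef.h>

/* Simplified stand-ins for the real glusterfs types. */
typedef struct dict { int refcount; } dict_t;

static dict_t *
dict_ref (dict_t *d)
{
        d->refcount++;
        return d;
}

struct syncargs {
        int     op_ret;
        int     op_errno;
        dict_t *xdata;
};

/* Store the callback results, taking a ref on xdata even when
 * op_ret < 0: error replies may carry useful hints (origin xlator,
 * redirect target, transaction id) in xdata. */
static void
syncop_store_cbk_args (struct syncargs *args, int op_ret,
                       int op_errno, dict_t *xdata)
{
        args->op_ret   = op_ret;
        args->op_errno = op_errno;
        if (xdata)                      /* no "op_ret >= 0" guard */
                args->xdata = dict_ref (xdata);
}
```

With this shape, a caller that sees op_ret < 0 can still inspect args->xdata before unref-ing it, satisfying both the "regardless of error" and "only on error" use cases.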
[Gluster-devel] glusterfs-3.6.3beta2 released
Hi

glusterfs-3.6.3beta2 has been released and can be found here:
http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.3beta2/

This beta release is supposed to fix the bugs listed below, reported since 3.6.3beta1 was made available. Thanks to all who submitted the patches and reviewed the changes.

1187526 - Disperse volume mounted through NFS doesn't list any files/directories
1188471 - When the volume is in stopped state/all the bricks are down mount of the volume hangs
1201484 - glusterfs-3.6.2 fails to build on Ubuntu Precise: 'RDMA_OPTION_ID_REUSEADDR' undeclared
1202212 - Performance enhancement for RDMA
1189023 - Directories not visible anymore after add-brick, new brick dirs not part of old bricks
1202673 - Perf: readdirp in replicated volumes causes performance degrade
1203081 - Entries in indices/xattrop directory not removed appropriately
1203648 - Quota: Build ancestry in the lookup
1199936 - readv on /var/run/6b8f1f2526c6af8a87f1bb611ae5a86f.socket failed when NFS is disabled
1200297 - cli crashes when listing quota limits with xml output
1201622 - Convert quota size from n-to-h order before using it
1194141 - AFR : failure in self-heald.t
1201624 - Spurious failure of tests/bugs/quota/bug-1038598.t
1194306 - Do not count files which did not need index heal in the first place as successfully healed
1200258 - Quota: features.quota-deem-statfs is "on" even after disabling quota.
1165938 - Fix regression test spurious failures
1197598 - NFS logs are filled with system.posix_acl_access messages
1199577 - mount.glusterfs uses /dev/stderr and fails if the device does not exist
1188066 - logging improvements in marker translator
1191537 - With afrv2 + ext4, lookups on directories with large offsets could result in duplicate/missing entries
1165129 - libgfapi: use versioned symbols in libgfapi.so for compatibility
1179136 - glusterd: Gluster rebalance status returns failure
1176756 - glusterd: remote locking failure when multiple synctask transactions are run
1188064 - log files get flooded when removexattr() can't find a specified key or value
1192522 - index heal doesn't continue crawl on self-heal failure
1193970 - Fix spurious ssl-authz.t regression failure (backport)

Regards,
Raghavendra Bhat
___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] REMINDER: Weekly Gluster Community meeting today at 12:00 UTC
Hi All, In about 80 minutes from now we will have the regular weekly Gluster Community meeting. Meeting details: - location: #gluster-meeting on Freenode IRC - date: every Wednesday - time: 8:00 EDT, 12:00 UTC, 13:00 CET, 17:30 IST (in your terminal, run: date -d "12:00 UTC") - agenda: available at [1] Currently the following items are listed: * Roll Call * Status of last week's action items * Gluster 3.6 * Gluster 3.5 * Gluster 3.4 * Gluster Next * Open Floor - docs - Awesum Web Presence - Gluster Summit Barcelona, second week in May - Gluster Winter of Code - Static Analysis results The last topic has space for additions. If you have a suitable topic to discuss, please add it to the agenda. Thanks, Vijay [1] https://public.pad.fsfe.org/p/gluster-community-meetings ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Review request for patch: libglusterfs/syncop: Add xdata to all syncop calls
On Tuesday 31 March 2015 09:36 PM, Raghavendra Talur wrote:

Hi,

I have sent an updated patch which adds xdata support to all syncop calls. It adds xdata in both the request and response paths of syncop. Considering that this patch has changes in many files, I request a quick review and merge to avoid rebase issues.

Patch link: http://review.gluster.org/#/c/9859/
Bug Id: https://bugzilla.redhat.com/show_bug.cgi?id=1158621

Thanks,
Raghavendra Talur

Question regarding the validity of xdata when op_ret < 0. In this patch set I have syncop_*_cbk in this form:

int
syncop_rmdir_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                  int op_ret, int op_errno, struct iatt *preparent,
                  struct iatt *postparent, dict_t *xdata)
{
        struct syncargs *args = NULL;

        args = cookie;

        args->op_ret   = op_ret;
        args->op_errno = op_errno;
        if (op_ret >= 0) {
                if (xdata)
                        args->xdata = dict_ref (xdata);
        }

        __wake (args);

        return 0;
}

whereas the call stub has it like this:

call_stub_t *
fop_rmdir_cbk_stub (call_frame_t *frame, fop_rmdir_cbk_t fn,
                    int32_t op_ret, int32_t op_errno,
                    struct iatt *preparent, struct iatt *postparent,
                    dict_t *xdata)
{
        call_stub_t *stub = NULL;

        GF_VALIDATE_OR_GOTO ("call-stub", frame, out);

        stub = stub_new (frame, 0, GF_FOP_RMDIR);
        GF_VALIDATE_OR_GOTO ("call-stub", stub, out);

        stub->fn_cbk.rmdir = fn;
        stub->args_cbk.op_ret = op_ret;
        stub->args_cbk.op_errno = op_errno;
        if (preparent)
                stub->args_cbk.preparent = *preparent;
        if (postparent)
                stub->args_cbk.postparent = *postparent;
        if (xdata)
                stub->args_cbk.xdata = dict_ref (xdata);
out:
        return stub;
}

The difference is when xdata is considered to be valid: call-stub considers it valid irrespective of the op_ret value. Is it always ok to consider xdata to be valid, even if op_ret < 0? If yes, I will have to update the syncop_*_cbk calls to ref xdata if it exists, irrespective of op_ret.

Also, it can be used in really cool ways: for example, we can have a key called glusterfs.error_origin_xlator set to this->name of the xlator where the error originated, and master xlators (fuse and gfapi) can log / make use of it, etc.

Thanks,
Raghavendra Talur
___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] NFSv4 xattr support IETF working group proposal
FYI. There is a new proposal on NFSv4 xattr support that is being debated upon in the IETF working group. If anyone has any thoughts/comments/concerns, now would be a good time to express them in the IETF NFSv4 working group ML. As I understand it, the authors are also looking for any strong use-cases that can be added to the draft as it moves towards standardization. https://tools.ietf.org/html/draft-naik-nfsv4-xattrs-01 Thanks, Anand ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] crypt xlator bug
Hi

crypt.t was recently broken in NetBSD regression. glusterfs returns a node with file type invalid to FUSE, and that breaks the test.

After running a git bisect, I found the offending commit after which this behavior appeared:

8a2e2b88fc21dc7879f838d18cd0413dd88023b7
mem-pool: invalidate memory on GF_FREE to aid debugging

This means the bug has always been there, but this debugging aid made it reliably reproducible.

With the help of an assertion, I can detect when inode->ia_type gets a corrupted value. It gives me this backtrace, where in frame 4, inode = 0xb9611880 and inode->ia_type = 12475 (which is wrong). The inode value comes from FUSE state->loc->inode, and we get it from frame 20, which is in crypt.c:

#4  0xb9bd2adf in mdc_inode_iatt_get (this=0xbb1df030, inode=0xb9611880, iatt=0xbf7fdfa0) at md-cache.c:471
#5  0xb9bd34e1 in mdc_lookup (frame=0xb9aa82b0, this=0xbb1df030, loc=0xb9608840, xdata=0x0) at md-cache.c:847
#6  0xb9bc216e in io_stats_lookup (frame=0xb9aa8200, this=0xbb1e0030, loc=0xb9608840, xdata=0x0) at io-stats.c:1934
#7  0xbb76755f in default_lookup (frame=0xb9aa8200, this=0xbb1d0030, loc=0xb9608840, xdata=0x0) at defaults.c:2138
#8  0xb9ba69cd in meta_lookup (frame=0xb9aa8200, this=0xbb1d0030, loc=0xb9608840, xdata=0x0) at meta.c:49
#9  0xbb277365 in fuse_lookup_resume (state=0xb9608830) at fuse-bridge.c:607
#10 0xbb276e07 in fuse_fop_resume (state=0xb9608830) at fuse-bridge.c:569
#11 0xbb274969 in fuse_resolve_done (state=0xb9608830) at fuse-resolve.c:644
#12 0xbb274a29 in fuse_resolve_all (state=0xb9608830) at fuse-resolve.c:671
#13 0xbb274941 in fuse_resolve (state=0xb9608830) at fuse-resolve.c:635
#14 0xbb274a06 in fuse_resolve_all (state=0xb9608830) at fuse-resolve.c:667
#15 0xbb274a8e in fuse_resolve_continue (state=0xb9608830) at fuse-resolve.c:687
#16 0xbb2731f4 in fuse_resolve_entry_cbk (frame=0xb9609688, cookie=0xb96140a0, this=0xbb193030, op_ret=0, op_errno=0, inode=0xb9611880, buf=0xb961e558, xattr=0xbb18a1a0, postparent=0xb961e628) at fuse-resolve.c:81
#17 0xb9bbd0c1 in io_stats_lookup_cbk (frame=0xb96140a0, cookie=0xb9614150, this=0xbb1e0030, op_ret=0, op_errno=0, inode=0xb9611880, buf=0xb961e558, xdata=0xbb18a1a0, postparent=0xb961e628) at io-stats.c:1512
#18 0xb9bd33ff in mdc_lookup_cbk (frame=0xb9614150, cookie=0xb9614410, this=0xbb1df030, op_ret=0, op_errno=0, inode=0xb9611880, stbuf=0xb961e558, dict=0xbb18a1a0, postparent=0xb961e628) at md-cache.c:816
#19 0xb9be2b10 in ioc_lookup_cbk (frame=0xb9614410, cookie=0xb96144c0, this=0xbb1de030, op_ret=0, op_errno=0, inode=0xb9611880, stbuf=0xb961e558, xdata=0xbb18a1a0, postparent=0xb961e628) at io-cache.c:260
#20 0xbb227fb5 in load_file_size (frame=0xb96144c0, cookie=0xb9aa8200, this=0xbb1db030, op_ret=0, op_errno=0, dict=0xbb18a470, xdata=0x0) at crypt.c:3830

In frame 20:

        case GF_FOP_LOOKUP:
                STACK_UNWIND_STRICT(lookup, frame, op_ret, op_errno,
                                    op_ret >= 0 ? local->inode : NULL,
                                    op_ret >= 0 ? &local->buf : NULL,
                                    local->xdata,
                                    op_ret >= 0 ? &local->postbuf : NULL);

Here is the problem: local->inode is not the 0xb9611880 value anymore, which means local got corrupted:

(gdb) print local->inode
$2 = (inode_t *) 0x1db030de

I now suspect local has been freed, but I do not find where in crypt.c this operation is done. There is a local = mem_get0(this->local_pool) in crypt_alloc_local, but where is that structure freed? There is no mem_put() call in the crypt xlator.

-- Emmanuel Dreyfus m...@netbsd.org ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints
On 04/01/2015 02:50 PM, Sahina Bose wrote:

On 04/01/2015 02:30 PM, Aravinda wrote:

Hi,

In each node of the Master Cluster there is one Monitor process and one or more worker processes, one for each brick in that node. The Monitor has a status file, which is updated by glusterd. Possible status values in the monitor_status file are Created, Started, Paused, and Stopped.

Geo-rep cannot be paused if the monitor status is not "Started".

Based on monitor_status, we need to hide some of the information from the brick status file when showing it to the user. For example, if the monitor status is "Stopped", it does not make sense to show "Crawl Status" in the Geo-rep Status output. I created a matrix of possible status values based on the status of the Monitor. VALUE represents the actual unchanged value from the brick status file.

Monitor Status --->        Created   Started   Paused    Stopped
----------------------------------------------------------------
session                    VALUE     VALUE     VALUE     VALUE
brick                      VALUE     VALUE     VALUE     VALUE
node                       VALUE     VALUE     VALUE     VALUE
node_uuid                  VALUE     VALUE     VALUE     VALUE
volume                     VALUE     VALUE     VALUE     VALUE
slave_user                 VALUE     VALUE     VALUE     VALUE
slave_node                 N/A       VALUE     VALUE     N/A
status                     Created   VALUE     Paused    Stopped
last_synced                N/A       VALUE     VALUE     VALUE
crawl_status               N/A       VALUE     N/A       N/A
entry                      N/A       VALUE     N/A       N/A
data                       N/A       VALUE     N/A       N/A
meta                       N/A       VALUE     N/A       N/A
failures                   N/A       VALUE     VALUE     VALUE
checkpoint_completed       N/A       VALUE     VALUE     VALUE
checkpoint_time            N/A       VALUE     VALUE     VALUE
checkpoint_completed_time  N/A       VALUE     VALUE     VALUE

Where:
session - only in XML output; the complete session URL used in the Create command
brick - Master Brick
node - Master Node
node_uuid - Master Node UUID, only in XML output
volume - Master Volume
slave_user - Slave User
slave_node - Slave node to which the respective master worker is connected
status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
last_synced - Last synced time
crawl_status - Hybrid/History/Changelog
entry - Number of entry ops pending (per session; counter resets if the worker restarts)
data - Number of data ops pending (per session; counter resets if the worker restarts)
meta - Number of meta ops pending (per session; counter resets if the worker restarts)
failures - Number of failures (if the count is more than 0, it is an action item for the admin to look in the log files)
checkpoint_completed - Checkpoint status: Yes/No/N/A
checkpoint_time - Checkpoint set time, or N/A
checkpoint_completed_time - Checkpoint completed time, or N/A

In addition to the monitor_status masking, if the brick status is Faulty, the following fields will be displayed as N/A:
active, paused, slave_node, crawl_status, entry, data, metadata

Some questions -

* Would monitor status also have an "Initializing" state?

Monitor status is internal to Geo-replication only; the Status output will not include monitor_status.

* What's the difference between brick and master node above?

Brick: brick path as shown in volume info. Node: hostname as shown in volume info.

* Is the last_synced time returned in UTC?

TBD.

* active and paused - are these fields or status values?

Status will be Paused irrespective of Active/Passive.

Let me know your thoughts.

--
regards
Aravinda

On 02/03/2015 11:00 PM, Aravinda wrote:

Today we discussed the Geo-rep Status design; a summary of the discussion:

- No use case for the "Deletes pending" column; should we retain it?
- No separate column for Active/Passive. A worker can be Active/Passive only when it is stable (it can't be Faulty and Active).
- Rename the "Not Started" status to "Created".
- Checkpoint columns will be retained in the Status output until we support multiple checkpoints: three columns instead of a single column (Completed, Checkpoint time, and Completion time).
- We are still unsure what numbers "Files Pending" and "Files Synced" should show; Geo-rep can't map the number to an exact count on disk. Venky suggested showing Entry, Data and Metadata pending as three columns. (Remove "Files Pending" and "Files Synced".)
- Rename "Files Skipped" to "Failures".

Status output proposed:
---
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE USER - Slave user to which geo-rep is established
SLAVE - Slave host and volume name (HOST::VOL format)
STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
LAST SYNCED - Last synced time (based on the stime xattr)
CRAWL STATUS - Hybrid/History/Changelog
CHECKPOINT STATUS - Yes/No/N/A
CHECKPOINT TIME - Checkpoint set time
CHECKPOIN
Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints
On 04/01/2015 02:30 PM, Aravinda wrote:

Hi,

In each node of the Master Cluster there is one Monitor process and one or more worker processes, one for each brick in that node. The Monitor has a status file, which is updated by glusterd. Possible status values in the monitor_status file are Created, Started, Paused, and Stopped.

Geo-rep cannot be paused if the monitor status is not "Started".

Based on monitor_status, we need to hide some of the information from the brick status file when showing it to the user. For example, if the monitor status is "Stopped", it does not make sense to show "Crawl Status" in the Geo-rep Status output. I created a matrix of possible status values based on the status of the Monitor. VALUE represents the actual unchanged value from the brick status file.

Monitor Status --->        Created   Started   Paused    Stopped
----------------------------------------------------------------
session                    VALUE     VALUE     VALUE     VALUE
brick                      VALUE     VALUE     VALUE     VALUE
node                       VALUE     VALUE     VALUE     VALUE
node_uuid                  VALUE     VALUE     VALUE     VALUE
volume                     VALUE     VALUE     VALUE     VALUE
slave_user                 VALUE     VALUE     VALUE     VALUE
slave_node                 N/A       VALUE     VALUE     N/A
status                     Created   VALUE     Paused    Stopped
last_synced                N/A       VALUE     VALUE     VALUE
crawl_status               N/A       VALUE     N/A       N/A
entry                      N/A       VALUE     N/A       N/A
data                       N/A       VALUE     N/A       N/A
meta                       N/A       VALUE     N/A       N/A
failures                   N/A       VALUE     VALUE     VALUE
checkpoint_completed       N/A       VALUE     VALUE     VALUE
checkpoint_time            N/A       VALUE     VALUE     VALUE
checkpoint_completed_time  N/A       VALUE     VALUE     VALUE

Where:
session - only in XML output; the complete session URL used in the Create command
brick - Master Brick
node - Master Node
node_uuid - Master Node UUID, only in XML output
volume - Master Volume
slave_user - Slave User
slave_node - Slave node to which the respective master worker is connected
status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
last_synced - Last synced time
crawl_status - Hybrid/History/Changelog
entry - Number of entry ops pending (per session; counter resets if the worker restarts)
data - Number of data ops pending (per session; counter resets if the worker restarts)
meta - Number of meta ops pending (per session; counter resets if the worker restarts)
failures - Number of failures (if the count is more than 0, it is an action item for the admin to look in the log files)
checkpoint_completed - Checkpoint status: Yes/No/N/A
checkpoint_time - Checkpoint set time, or N/A
checkpoint_completed_time - Checkpoint completed time, or N/A

In addition to the monitor_status masking, if the brick status is Faulty, the following fields will be displayed as N/A:
active, paused, slave_node, crawl_status, entry, data, metadata

Some questions -

* Would monitor status also have an "Initializing" state?
* What's the difference between brick and master node above?
* Is the last_synced time returned in UTC?
* active and paused - are these fields or status values?

Let me know your thoughts.

--
regards
Aravinda

On 02/03/2015 11:00 PM, Aravinda wrote:

Today we discussed the Geo-rep Status design; a summary of the discussion:

- No use case for the "Deletes pending" column; should we retain it?
- No separate column for Active/Passive. A worker can be Active/Passive only when it is stable (it can't be Faulty and Active).
- Rename the "Not Started" status to "Created".
- Checkpoint columns will be retained in the Status output until we support multiple checkpoints: three columns instead of a single column (Completed, Checkpoint time, and Completion time).
- We are still unsure what numbers "Files Pending" and "Files Synced" should show; Geo-rep can't map the number to an exact count on disk. Venky suggested showing Entry, Data and Metadata pending as three columns. (Remove "Files Pending" and "Files Synced".)
- Rename "Files Skipped" to "Failures".

Status output proposed:
---
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE USER - Slave user to which geo-rep is established
SLAVE - Slave host and volume name (HOST::VOL format)
STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
LAST SYNCED - Last synced time (based on the stime xattr)
CRAWL STATUS - Hybrid/History/Changelog
CHECKPOINT STATUS - Yes/No/N/A
CHECKPOINT TIME - Checkpoint set time
CHECKPOINT COMPLETED - Checkpoint completion time

Not yet decided
---
FILES SYNCD - Number of files synced
FILES PENDING - Number of files pending
DELETES PENDING - Number of deletes pending
FILES SKIPPED - Number of files skip
Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints
Hi Aravinda,

Looks good to me.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Aravinda"
> To: "Gluster Devel"
> Sent: Wednesday, April 1, 2015 2:30:49 PM
> Subject: Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints
>
> Hi,
>
> In each node of the Master Cluster there is one Monitor process and
> one or more worker processes, one for each brick in that node.
> The Monitor has a status file, which is updated by glusterd.
> Possible status values in the monitor_status file are Created,
> Started, Paused, and Stopped.
>
> Geo-rep cannot be paused if the monitor status is not "Started".
>
> Based on monitor_status, we need to hide some of the information from
> the brick status file when showing it to the user. For example, if the
> monitor status is "Stopped", it does not make sense to show "Crawl
> Status" in the Geo-rep Status output. I created a matrix of possible
> status values based on the status of the Monitor. VALUE represents
> the actual unchanged value from the brick status file.
>
> Monitor Status --->        Created   Started   Paused    Stopped
> ----------------------------------------------------------------
> session                    VALUE     VALUE     VALUE     VALUE
> brick                      VALUE     VALUE     VALUE     VALUE
> node                       VALUE     VALUE     VALUE     VALUE
> node_uuid                  VALUE     VALUE     VALUE     VALUE
> volume                     VALUE     VALUE     VALUE     VALUE
> slave_user                 VALUE     VALUE     VALUE     VALUE
> slave_node                 N/A       VALUE     VALUE     N/A
> status                     Created   VALUE     Paused    Stopped
> last_synced                N/A       VALUE     VALUE     VALUE
> crawl_status               N/A       VALUE     N/A       N/A
> entry                      N/A       VALUE     N/A       N/A
> data                       N/A       VALUE     N/A       N/A
> meta                       N/A       VALUE     N/A       N/A
> failures                   N/A       VALUE     VALUE     VALUE
> checkpoint_completed       N/A       VALUE     VALUE     VALUE
> checkpoint_time            N/A       VALUE     VALUE     VALUE
> checkpoint_completed_time  N/A       VALUE     VALUE     VALUE
>
> Where:
> session - only in XML output; the complete session URL used in the
> Create command
> brick - Master Brick
> node - Master Node
> node_uuid - Master Node UUID, only in XML output
> volume - Master Volume
> slave_user - Slave User
> slave_node - Slave node to which the respective master worker is
> connected
> status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
> last_synced - Last synced time
> crawl_status - Hybrid/History/Changelog
> entry - Number of entry ops pending (per session; counter resets if
> the worker restarts)
> data - Number of data ops pending (per session; counter resets if the
> worker restarts)
> meta - Number of meta ops pending (per session; counter resets if the
> worker restarts)
> failures - Number of failures (if the count is more than 0, it is an
> action item for the admin to look in the log files)
> checkpoint_completed - Checkpoint status: Yes/No/N/A
> checkpoint_time - Checkpoint set time, or N/A
> checkpoint_completed_time - Checkpoint completed time, or N/A
>
> In addition to the monitor_status masking, if the brick status is
> Faulty, the following fields will be displayed as N/A:
> active, paused, slave_node, crawl_status, entry, data, metadata
>
> Let me know your thoughts.
>
> --
> regards
> Aravinda
>
> On 02/03/2015 11:00 PM, Aravinda wrote:
> > Today we discussed the Geo-rep Status design; a summary of the
> > discussion:
> >
> > - No use case for the "Deletes pending" column; should we retain it?
> > - No separate column for Active/Passive. A worker can be
> >   Active/Passive only when it is stable (it can't be Faulty and
> >   Active).
> > - Rename the "Not Started" status to "Created".
> > - Checkpoint columns will be retained in the Status output until we
> >   support multiple checkpoints: three columns instead of a single
> >   column (Completed, Checkpoint time, and Completion time).
> > - We are still unsure what numbers "Files Pending" and "Files
> >   Synced" should show; Geo-rep can't map the number to an exact
> >   count on disk. Venky suggested showing Entry, Data and Metadata
> >   pending as three columns. (Remove "Files Pending" and "Files
> >   Synced".)
> > - Rename "Files Skipped" to "Failures".
> >
> > Status output proposed:
> > ---
> > MASTER NODE - Master node hostname/IP
> > MASTER VOL - Master volume name
> > MASTER BRICK - Master brick path
> > SLAVE USER - Slave user to which geo-rep is established
> > SLAVE - Slave host and volume name (HOST::VOL format)
> > STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
> > LAST SYNCED - Last synced time (based on the stime xattr)
> > CRAWL STATUS - Hybrid/History/Changelog
> > CHECKPOINT STATUS - Yes/No/N/A
> > CHECKPOINT TIME - Checkpoint set time
> > CHECKPOINT CO
Re: [Gluster-devel] Improving Geo-replication Status and Checkpoints
Hi,

In each node of the Master Cluster there is one Monitor process and one or more worker processes, one for each brick in that node. The Monitor has a status file, which is updated by glusterd. Possible status values in the monitor_status file are Created, Started, Paused, and Stopped.

Geo-rep cannot be paused if the monitor status is not "Started".

Based on monitor_status, we need to hide some of the information from the brick status file when showing it to the user. For example, if the monitor status is "Stopped", it does not make sense to show "Crawl Status" in the Geo-rep Status output. I created a matrix of possible status values based on the status of the Monitor. VALUE represents the actual unchanged value from the brick status file.

Monitor Status --->        Created   Started   Paused    Stopped
----------------------------------------------------------------
session                    VALUE     VALUE     VALUE     VALUE
brick                      VALUE     VALUE     VALUE     VALUE
node                       VALUE     VALUE     VALUE     VALUE
node_uuid                  VALUE     VALUE     VALUE     VALUE
volume                     VALUE     VALUE     VALUE     VALUE
slave_user                 VALUE     VALUE     VALUE     VALUE
slave_node                 N/A       VALUE     VALUE     N/A
status                     Created   VALUE     Paused    Stopped
last_synced                N/A       VALUE     VALUE     VALUE
crawl_status               N/A       VALUE     N/A       N/A
entry                      N/A       VALUE     N/A       N/A
data                       N/A       VALUE     N/A       N/A
meta                       N/A       VALUE     N/A       N/A
failures                   N/A       VALUE     VALUE     VALUE
checkpoint_completed       N/A       VALUE     VALUE     VALUE
checkpoint_time            N/A       VALUE     VALUE     VALUE
checkpoint_completed_time  N/A       VALUE     VALUE     VALUE

Where:
session - only in XML output; the complete session URL used in the Create command
brick - Master Brick
node - Master Node
node_uuid - Master Node UUID, only in XML output
volume - Master Volume
slave_user - Slave User
slave_node - Slave node to which the respective master worker is connected
status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
last_synced - Last synced time
crawl_status - Hybrid/History/Changelog
entry - Number of entry ops pending (per session; counter resets if the worker restarts)
data - Number of data ops pending (per session; counter resets if the worker restarts)
meta - Number of meta ops pending (per session; counter resets if the worker restarts)
failures - Number of failures (if the count is more than 0, it is an action item for the admin to look in the log files)
checkpoint_completed - Checkpoint status: Yes/No/N/A
checkpoint_time - Checkpoint set time, or N/A
checkpoint_completed_time - Checkpoint completed time, or N/A

In addition to the monitor_status masking, if the brick status is Faulty, the following fields will be displayed as N/A:
active, paused, slave_node, crawl_status, entry, data, metadata

Let me know your thoughts.

--
regards
Aravinda

On 02/03/2015 11:00 PM, Aravinda wrote:

Today we discussed the Geo-rep Status design; a summary of the discussion:

- No use case for the "Deletes pending" column; should we retain it?
- No separate column for Active/Passive. A worker can be Active/Passive only when it is stable (it can't be Faulty and Active).
- Rename the "Not Started" status to "Created".
- Checkpoint columns will be retained in the Status output until we support multiple checkpoints: three columns instead of a single column (Completed, Checkpoint time, and Completion time).
- We are still unsure what numbers "Files Pending" and "Files Synced" should show; Geo-rep can't map the number to an exact count on disk. Venky suggested showing Entry, Data and Metadata pending as three columns. (Remove "Files Pending" and "Files Synced".)
- Rename "Files Skipped" to "Failures".

Status output proposed:
---
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE USER - Slave user to which geo-rep is established
SLAVE - Slave host and volume name (HOST::VOL format)
STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
LAST SYNCED - Last synced time (based on the stime xattr)
CRAWL STATUS - Hybrid/History/Changelog
CHECKPOINT STATUS - Yes/No/N/A
CHECKPOINT TIME - Checkpoint set time
CHECKPOINT COMPLETED - Checkpoint completion time

Not yet decided
---
FILES SYNCD - Number of files synced
FILES PENDING - Number of files pending
DELETES PENDING - Number of deletes pending
FILES SKIPPED - Number of files skipped
ENTRIES - Create/Delete/MKDIR/RENAME etc.
DATA - Data operations
METADATA - SETATTR, SETXATTR etc.

Let me know your suggestions.

--
regards
Aravinda

On 02/02/2015 04:51 PM, Aravinda wrote:

Thanks Sahina, replied inline.

--
regards
Aravinda

On 02/02/2015 12:55 PM,
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On Wed, Apr 01, 2015 at 12:50:30PM +0530, Vijay Bellur wrote:
> I think all we need is to create another gerrit user like "NetBSD
> Regression" or so and have verified votes routed through this user. Just as
> multiple users can provide distinct CR votes, we can have multiple build
> systems provide distinct Verified votes.

Yes, please do that: an nb7build user like we have a build user. It can have the same .ssh/authorized_keys as the build user.

--
Emmanuel Dreyfus
m...@netbsd.org
Re: [Gluster-devel] rackspace-netbsd7-regression-triggered has been disabled
On 04/01/2015 09:42 AM, Justin Clift wrote:

On 1 Apr 2015, at 05:04, Emmanuel Dreyfus wrote:

Justin Clift wrote:
That, or perhaps we could have two verified fields?

Sure. Whichever works. :) Personally, I'm not sure how to do either yet.

In http://build.gluster.org/gerrit-trigger/ you have "Verdict categories" with CRVW (code review) and VRIF (verified), and there is an "add verdict category" option, which suggests this is something that can be done. Of course the Gerrit side will need some configuration too, but if Jenkins can deal with more Gerrit fields, there must be a way to add fields in Gerrit.

Interesting. Marcelo, this sounds like something you'd know about. Any ideas? :)

We're trying to add an extra "Verified" column to our Gerrit + Jenkins setup. We have an existing one for "Gluster Build System" (which is our CentOS regression testing). Now we want to add one for our NetBSD regression testing.

I think all we need is to create another gerrit user like "NetBSD Regression" or so and have verified votes routed through this user. Just as multiple users can provide distinct CR votes, we can have multiple build systems provide distinct Verified votes. I don't think we need an extra column here.

-Vijay