Re: [Gluster-devel] Problem with smoke, regression ordering
On 05/07/2014, at 5:23 PM, Pranith Kumar Karampuri wrote:
> hi Justin,
> If the regression results complete before the smoke test, then the
> 'green-tick-mark' is over-written and people don't realize, from a
> simple glance at the list of patches, that the regression succeeded.
> Can we do anything about it?

Yeah. At the moment, it's caused by people manually starting the old
"regression" job on the build.gluster.org server.

The build.gluster.org server can only run one thing at a time;
everything else queues up. When the old regression test job runs,
everything else is blocked until it finishes. If there are a few
regression tests lined up (or one hangs), it can take hours until the
smoke and RPM-building jobs run.

There are a few ways we could address this:

 * Adjust the smoke test job so it runs on the Rackspace slaves.
   Hopefully not hard, but not sure; we can try it out.

 * Change the triggered regression test so it doesn't start
   automatically like this.

 * We may be able to get a successful smoke test to automatically
   trigger the regression run. Ben Turner would probably know how to
   make that work.

 * Niels has suggested we might want to have the regression test run
   when a +1 or +2 vote is given instead. I'm not really sure about
   this, because I wonder if it's more useful to automatically test
   everything, e.g. catching breakage early, before reviews are done.
   I'm not strongly against it either, though. ;)

Personally, I reckon we should have a discussion on gluster-devel about
this. There might be really good points for and against each, so a
clear decision can be made. And there may be other, better ideas too.

What're your thoughts on this stuff?

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.
My personal twitter: twitter.com/realjustinclift

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Problem with smoke, regression ordering
On 06/07/2014, at 8:03 AM, Justin Clift wrote:
> There are a few ways we could address this:
>
> * Adjust the smoke test job so it runs on the Rackspace slaves.
>   Hopefully not hard, but not sure; we can try it out.
>
> * Change the triggered regression test so it doesn't start
>   automatically like this.

* We could also disable the old "regression" job, so it doesn't run on
  build.gluster.org. Would stop the queuing problems. ;)

+ Justin
Re: [Gluster-devel] Problem with smoke, regression ordering
On 07/06/2014 12:33 PM, Justin Clift wrote:
> On 05/07/2014, at 5:23 PM, Pranith Kumar Karampuri wrote:
>> hi Justin,
>> If the regression results complete before the smoke test, then the
>> 'green-tick-mark' is over-written and people don't realize, from a
>> simple glance at the list of patches, that the regression succeeded.
>> Can we do anything about it?
>
> [...]
>
> * We may be able to get a successful smoke test to automatically
>   trigger the regression run. Ben Turner would probably know how to
>   make that work.
>
> * Niels has suggested we might want to have the regression test run
>   when a +1 or +2 vote is given instead.

Like Avati said a while back, it depends on what you want to optimize
for: human review time, or the number of automatic regression job runs.
I would like human review time to be optimized, by automatically
triggering the regression runs and letting the regressions catch some
bugs. As a rule, I don't review patches that are yet to pass
regressions.

Pranith

> I'm not really sure about this, because I wonder if it's more useful
> to automatically test everything, e.g. catching breakage early,
> before reviews are done. I'm not strongly against it either, though. ;)
>
> Personally, I reckon we should have a discussion on gluster-devel
> about this. There might be really good points for and against each,
> so a clear decision can be made. And there may be other, better ideas
> too.
>
> What're your thoughts on this stuff?

I like the present model. The only thing I feel needs a change is the
smoke test resetting the 'regression status'.

Pranith
Re: [Gluster-devel] Problem with smoke, regression ordering
On 07/06/2014 12:36 PM, Justin Clift wrote:
> On 06/07/2014, at 8:03 AM, Justin Clift wrote:
>> There are a few ways we could address this:
>>
>> * Adjust the smoke test job so it runs on the Rackspace slaves.
>>   Hopefully not hard, but not sure; we can try it out.
>>
>> * Change the triggered regression test so it doesn't start
>>   automatically like this.
>
> * We could also disable the old "regression" job, so it doesn't run
>   on build.gluster.org.

This only reduces the race window; it doesn't fix the problem ;-). But
I see your point. It is definitely better to not use build.gluster.org
for regressions.

Pranith

> Would stop the queuing problems. ;)
>
> + Justin
Re: [Gluster-devel] Problem with smoke, regression ordering
On 06/07/2014, at 8:24 AM, Pranith Kumar Karampuri wrote:
> On 07/06/2014 12:36 PM, Justin Clift wrote:
>> * We could also disable the old "regression" job, so it
>> doesn't run on build.gluster.org.
> This only reduces the race window; it doesn't fix the problem ;-).
> But I see your point. It is definitely better to not use
> build.gluster.org for regressions.

OK, just disabled it. I can look into making the smoke jobs run in
Rackspace too (during the week). :)

+ Justin
[Gluster-devel] regarding message for '-1' on gerrit
hi Justin/Vijay,

I always felt that '-1' saying 'I prefer you didn't submit this' is a
bit harsh. Most of the time, all it means is 'Needs some more changes'.
Do you think we can change this message?

Pranith
Re: [Gluster-devel] FS Sanity daily results.
On 07/06/2014 02:53 AM, Benjamin Turner wrote:
> Hi all. I have been running FS sanity on daily builds (glusterfs
> mounts only at this point) for a few days, and I have been hitting a
> couple of problems:
>
> final pass/fail report =
> Test Date: Sat Jul 5 01:53:00 EDT 2014
> Total : [44] Passed: [41] Failed: [3] Abort : [0] Crash : [0]
> -
> [ PASS ] FS Sanity Setup
> [ PASS ] Running tests.
> [ PASS ] FS SANITY TEST - arequal
> [ PASS ] FS SANITY LOG SCAN - arequal
> [ PASS ] FS SANITY LOG SCAN - bonnie
> [ PASS ] FS SANITY TEST - glusterfs_build
> [ PASS ] FS SANITY LOG SCAN - glusterfs_build
> [ PASS ] FS SANITY TEST - compile_kernel
> [ PASS ] FS SANITY LOG SCAN - compile_kernel
> [ PASS ] FS SANITY TEST - dbench
> [ PASS ] FS SANITY LOG SCAN - dbench
> [ PASS ] FS SANITY TEST - dd
> [ PASS ] FS SANITY LOG SCAN - dd
> [ PASS ] FS SANITY TEST - ffsb
> [ PASS ] FS SANITY LOG SCAN - ffsb
> [ PASS ] FS SANITY TEST - fileop
> [ PASS ] FS SANITY LOG SCAN - fileop
> [ PASS ] FS SANITY TEST - fsx
> [ PASS ] FS SANITY LOG SCAN - fsx
> [ PASS ] FS SANITY LOG SCAN - fs_mark
> [ PASS ] FS SANITY TEST - iozone
> [ PASS ] FS SANITY LOG SCAN - iozone
> [ PASS ] FS SANITY TEST - locks
> [ PASS ] FS SANITY LOG SCAN - locks
> [ PASS ] FS SANITY TEST - ltp
> [ PASS ] FS SANITY LOG SCAN - ltp
> [ PASS ] FS SANITY TEST - multiple_files
> [ PASS ] FS SANITY LOG SCAN - multiple_files
> [ PASS ] FS SANITY TEST - posix_compliance
> [ PASS ] FS SANITY LOG SCAN - posix_compliance
> [ PASS ] FS SANITY TEST - postmark
> [ PASS ] FS SANITY LOG SCAN - postmark
> [ PASS ] FS SANITY TEST - read_large
> [ PASS ] FS SANITY LOG SCAN - read_large
> [ PASS ] FS SANITY TEST - rpc
> [ PASS ] FS SANITY LOG SCAN - rpc
> [ PASS ] FS SANITY TEST - syscallbench
> [ PASS ] FS SANITY LOG SCAN - syscallbench
> [ PASS ] FS SANITY TEST - tiobench
> [ PASS ] FS SANITY LOG SCAN - tiobench
> [ PASS ] FS Sanity Cleanup
> [ FAIL ] FS SANITY TEST - bonnie
> [ FAIL ] FS SANITY TEST - fs_mark
> [ FAIL ] /rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2
>
> Bonnie++ is just very slow (running for 10+ hours on one 16 GB file)
> and fs_mark has been failing. The bonnie slowness is in re-read; here
> is the best explanation I can find on it:
> https://blogs.oracle.com/roch/entry/decoding_bonnie
>
> *Rewriting...done*
> This gets a little interesting. It actually reads 8K, lseeks back to
> the start of the block, overwrites the 8K with new data, and loops.
> (See the article for more.)
>
> On fs_mark I am seeing:
>
> # fs_mark -d . -D 4 -t 4 -S 5
> # Version 3.3, 4 thread(s) starting at Sat Jul 5 00:54:00 2014
> # Sync method: POST: Reopen and fsync() each file in order after main write loop.
> # Directories: Time based hash between directories across 4 subdirectories with 180 seconds per subdirectory.
> # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
> # Files info: size 51200 bytes, written with an IO size of 16384 bytes per write
> # App overhead is time in microseconds spent in the test not doing file writing related system calls.
> FSUse% Count Size Files/sec App Overhead
> Error in unlink of ./00/53b784e8SKZ0QS9BO7O2EG1DIFQLRDYY : No such file or directory
> fopen failed to open: fs_log.txt.26676
> fs-mark pass # 5 failed
>
> I am working on reporting, so look for a daily status report email
> from my jenkins server soon. How do we want to handle failures like
> this moving forward? Should I just open a BZ after I triage? Do you
> guys do a new BZ for every failure in the normal regression tests?

Yes, a BZ would be great, with all the logs. For spurious regressions,
at least, I just opened one BZ and fixed all the bugs reported by
Justin against that one.

Pranith

> -b
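The rewrite phase described in that article (read a block, lseek back to its start, overwrite it, loop) can be sketched as below. This is an illustration of the access pattern only, not bonnie++'s actual code; the 8K block size matches the article, and the replacement payload is hypothetical:

```python
import os

def rewrite_pass(path, block_size=8192):
    """Mimic bonnie++'s 'Rewriting' phase: read each block, seek back
    to its start, overwrite it in place, then move to the next block."""
    new_data = b"x" * block_size  # hypothetical replacement payload
    with open(path, "r+b") as f:
        while True:
            block = f.read(block_size)
            if not block:  # end of file
                break
            # lseek back to the start of the block we just read
            f.seek(-len(block), os.SEEK_CUR)
            # overwrite it in place with the same number of bytes
            f.write(new_data[:len(block)])
```

This read/seek/write cycle interleaves small reads with writes across the whole file, which is why a slow re-read path in the filesystem shows up so prominently in this phase.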
Re: [Gluster-devel] regarding message for '-1' on gerrit
On 07/06/2014 07:47 PM, Pranith Kumar Karampuri wrote:
> hi Justin/Vijay,
> I always felt that '-1' saying 'I prefer you didn't submit this' is a
> bit harsh. Most of the time, all it means is 'Needs some more
> changes'. Do you think we can change this message?

The message can be changed. What would everyone like to see as
appropriate messages accompanying the values '-1' and '-2'?

-Vijay
Re: [Gluster-devel] 3.6 Feature Freeze - move to mid next week?
On 07/05/2014 01:54 AM, Justin Clift wrote:
> On 04/07/2014, at 12:30 PM, Vijay Bellur wrote:
>> Hi All,
>> Given the holiday weekend in the US, I feel that it would be
>> appropriate to move the 3.6 feature freeze date to mid next week, so
>> that we can have more reviews done and address review comments too.
>> We can still continue to track other milestones as per our release
>> schedule [1]. What do you folks think?
>
> How about the end of next week? The extra few days could make a
> positive difference as to what gets in. (?)

I think we can give this a shot. Let us move the feature freeze to
July 12th and not alter the other milestones.

Thanks,
Vijay
Re: [Gluster-devel] regarding message for '-1' on gerrit
On 07/06/2014 11:05 PM, Vijay Bellur wrote:
> On 07/06/2014 07:47 PM, Pranith Kumar Karampuri wrote:
>> hi Justin/Vijay,
>> I always felt that '-1' saying 'I prefer you didn't submit this' is
>> a bit harsh. Most of the time, all it means is 'Needs some more
>> changes'. Do you think we can change this message?
>
> The message can be changed. What would everyone like to see as
> appropriate messages accompanying the values '-1' and '-2'?

For '-1': 'Please address the comments and resubmit.' I am not sure
about '-2'.

Pranith
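For reference, Gerrit's per-value label text is normally set in the project's `project.config` (on the `refs/meta/config` branch). A hedged sketch of what a Code-Review label section with the suggested '-1' wording could look like; the other value strings here are Gerrit's common defaults, and the exact values in use on our Gerrit instance are an assumption:

```ini
[label "Code-Review"]
    value = -2 Do not submit
    value = -1 Please address the comments and resubmit
    value =  0 No score
    value = +1 Looks good to me, but someone else must approve
    value = +2 Looks good to me, approved
```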