> On Fri, Sep 01, 2017 at 03:09:34PM -0700, Frank Filz wrote:
> > Lately, we have been plagued by a lot of intermittent test failures.
> >
> > I have seen intermittent failures in pynfs WRT14, WRT15, and WRT16.
> > These have not been resolved by the latest ntirpc pullup.
> >
> > Additionally, we see a lot of intermittent failures in the continuous
> > integration.
> >
> > A big issue with the Centos CI is that it seems to have a fragile
> > setup, and sometimes doesn't even succeed in trying to build Ganesha,
> > and then fires a Verified -1. This makes it hard to evaluate what
> > patches are actually ready for integration.
>
> We can look into this, but it helps if you can provide a link to the patch in
> GerritHub or the job in the CI.
Here's one merged last week with a Gluster CI Verify -1:

https://review.gerrithub.io/#/c/375463/

And just to preserve it in case... here's the log:

Triggered by Gerrit: https://review.gerrithub.io/375463 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs-ganesha_trigger-fsal_gluster
[nfs-ganesha_trigger-fsal_gluster] $ /bin/sh -xe /tmp/jenkins5031649144466335345.sh
+ set +x
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  1735  100  1735    0     0   8723      0 --:--:-- --:--:-- --:--:--  8718
Traceback (most recent call last):
  File "bootstrap.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
https://ci.centos.org/job/nfs-ganesha_trigger-fsal_gluster/3455//console : FAILED
Build step 'Execute shell' marked build as failure
Finished: FAILURE

Which tells me not much about why it failed, though it looks like a failure that has nothing to do with Ganesha...

> > An additional issue with the Centos CI is that the failure logs often
> > aren't preserved long enough to even diagnose the issue.
>
> That is something we can change. Some jobs do not delete the results, but
> others seem to do. How long (in days), or how many results would you like to
> keep?

I'd say they need to be kept at least a week. If we could have time-based retention rather than retention by number of results, I think that would help.
At least after a week, it's reasonable to expect folks to rebase their patches and re-submit, which would trigger a new run.

> > The result is that honestly, I mostly ignore the Centos CI results.
> > They almost might as well not be run...
>
> This is definitely not what we want, so lets fix the problems.

Yea, and thus my rant...

> > Let's talk about CI more on a near time concall (it would help if
> > Niels and Jiffin could join a call to talk about this, our next call
> > might be too soon for that).
>
> Tuesdays tend to be very busy for me, and I am not sure I can join the call
> next week. Arthy did some work on the jobs in the CentOS CI, she could
> probably work with Jiffin to make any changes that improve the experience
> for you. I'm happy to help out where I can too, of course :-)

If we can figure out another time to have a CI call, that would be helpful. It would be good to pull in Patrice from CEA as well as anyone else who cares.

It would really help if we could have someone with better time zone overlap with me who could manage the CI stuff, but that may not be realistic.

Frank

p.s. Here's another patch that had a failure. In this case, it looks like the CI ran again and passed the second time:

https://review.gerrithub.io/#/c/377712/

Log:

Triggered by Gerrit: https://review.gerrithub.io/377712 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs-ganesha_trigger-fsal_gluster
[nfs-ganesha_trigger-fsal_gluster] $ /bin/sh -xe /tmp/jenkins2362831021118510052.sh
+ set +x
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  1735  100  1735    0     0   9161      0 --:--:-- --:--:-- --:--:--  9179
Traceback (most recent call last):
  File "bootstrap.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
https://ci.centos.org/job/nfs-ganesha_trigger-fsal_gluster/3487//console : FAILED
Build step 'Execute shell' marked build as failure
Finished: FAILURE

It also had this log that didn't seem to result in a Verify -1:

Started by upstream project "nfs-ganesha_trigger-cthon04-on-new-patch" build number 1822
originally caused by:
 Triggered by Gerrit: https://review.gerrithub.io/377712 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs_ganesha_cthon04
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Done
[nfs_ganesha_cthon04] $ /bin/sh -xe /tmp/jenkins2858843538932928042.sh
+ curl -o jenkins-job.py https://raw.githubusercontent.com/nfs-ganesha/ci-tests/centos-ci/common-scripts/basic-gluster-duffy.py
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  3191  100  3191    0     0  14753      0 --:--:-- --:--:-- --:--:-- 14773
+ python jenkins-job.py
Traceback (most recent call last):
  File "jenkins-job.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Build step 'Execute shell' marked build as failure
Finished: FAILURE

This really is a prime example of why I'm led to ignore the Centos CI results...

_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
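
All three logs above stop at the same `json.loads(dat)` call on line 33 of the job script, so the traceback never shows what the script actually received. The following is only a rough sketch of how such a call could be wrapped so the console log includes the start of the unparseable input. The helper name `parse_json_or_explain` and its `source` and `max_chars` parameters are hypothetical, and it assumes `dat` is the text the script obtained just before parsing (the logs do not show where `dat` comes from).

```python
import json


def parse_json_or_explain(dat, source="node request", max_chars=200):
    """Parse dat as JSON; on failure, raise an error that shows what was received.

    Hypothetical helper, not part of the real CI scripts.
    """
    try:
        return json.loads(dat)
    except ValueError as err:
        # json.loads raises ValueError on Python 2.7; on Python 3 it raises
        # json.JSONDecodeError, which is a subclass of ValueError.
        text = dat if isinstance(dat, str) else repr(dat)
        snippet = text[:max_chars]
        raise RuntimeError(
            "%s did not return JSON (%s); first %d chars of response: %r"
            % (source, err, max_chars, snippet)
        )
```

With something along these lines in place of the bare `b=json.loads(dat)`, a failed run would at least record whether the input was an error page, an empty string, or something else, which might help separate infrastructure failures from real Ganesha regressions.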