> On Fri, Sep 01, 2017 at 03:09:34PM -0700, Frank Filz wrote:
> > Lately, we have been plagued by a lot of intermittent test failures.
> >
> > I have seen intermittent failures in pynfs WRT14, WRT15, and WRT16.
> > These have not been resolved by the latest ntirpc pullup.
> >
> > Additionally, we see a lot of intermittent failures in the continuous
> > integration.
> >
> > A big issue with the Centos CI is that it seems to have a fragile
> > setup, and sometimes doesn't even succeed in trying to build Ganesha,
> > and then fires a Verified -1. This makes it hard to evaluate what
> > patches are actually ready for integration.
> 
> We can look into this, but it helps if you can provide a link to the patch
> in GerritHub or the job in the CI.

Here's one merged last week with a Gluster CI Verify -1:

https://review.gerrithub.io/#/c/375463/

And just to preserve it in case the job results get deleted, here's the log:

Triggered by Gerrit: https://review.gerrithub.io/375463 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs-ganesha_trigger-fsal_gluster
[nfs-ganesha_trigger-fsal_gluster] $ /bin/sh -xe /tmp/jenkins5031649144466335345.sh
+ set +x
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  1735  100  1735    0     0   8723      0 --:--:-- --:--:-- --:--:--  8718
Traceback (most recent call last):
  File "bootstrap.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
https://ci.centos.org/job/nfs-ganesha_trigger-fsal_gluster/3455//console : FAILED
Build step 'Execute shell' marked build as failure
Finished: FAILURE

This tells me little about why it failed, though it looks like a failure
in the CI bootstrap script that has nothing to do with Ganesha...
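
For what it's worth, the traceback only shows that json.loads() was handed
something that wasn't JSON. I don't know what bootstrap.py actually fetches,
so the following is just a minimal sketch of how such a script could report
what it received and retry, assuming `dat` is the body of a reply from some
service the script talks to. The names parse_reply, fetch_json, and the
fetch callable are all hypothetical, not taken from the real script:

```python
import json
import time


def parse_reply(dat):
    """Decode a reply as JSON, or fail with a message showing what arrived."""
    try:
        return json.loads(dat)
    except ValueError:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError,
        # so this handler works on both Python 2.7 and Python 3.
        raise RuntimeError("reply was not JSON; it started with: %r" % (dat[:200],))


def fetch_json(fetch, attempts=3, delay=10):
    """Call fetch() until its reply decodes as JSON, at most `attempts` times.

    `fetch` is a hypothetical zero-argument callable returning the reply body.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return parse_reply(fetch())
        except RuntimeError as err:
            last_error = err
            print("attempt %d/%d failed: %s" % (attempt, attempts, err))
            if attempt < attempts:
                time.sleep(delay)  # give a transient outage time to clear
    raise last_error
```

With something like this, the console log would at least show the text that
came back instead of a bare ValueError, and a transient hiccup wouldn't
immediately turn into a Verified -1.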

> > An additional issue with the Centos CI is that the failure logs often
> > aren't preserved long enough to even diagnose the issue.
> 
> That is something we can change. Some jobs do not delete the results, but
> others seem to do. How long (in days), or how many results would you like
> to keep?

I'd say they need to be kept for at least a week. If we could have time-based
retention rather than retention by number of results, I think that would help.

After a week, it's reasonable to expect folks to rebase their patches and
re-submit, which would trigger a new run.
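
To illustrate why I'd prefer time-based retention, here is a small sketch
comparing the two policies. It is not how Jenkins implements this, and the
build numbers and timestamps are made up; each build is just a hypothetical
(number, finished) pair:

```python
from datetime import datetime, timedelta


def keep_by_age(builds, now, days=7):
    """Keep every build that finished within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [b for b in builds if b[1] >= cutoff]


def keep_by_count(builds, count):
    """Keep only the `count` most recently finished builds."""
    newest_first = sorted(builds, key=lambda b: b[1], reverse=True)
    return newest_first[:count]


# Made-up example: four builds, three of them in a burst on the same day.
now = datetime(2017, 9, 8, 12, 0)
builds = [
    (1, datetime(2017, 9, 5, 9, 0)),
    (2, datetime(2017, 9, 7, 10, 0)),
    (3, datetime(2017, 9, 7, 11, 0)),
    (4, datetime(2017, 9, 7, 12, 0)),
]
kept_by_age = keep_by_age(builds, now, days=7)
kept_by_count = keep_by_count(builds, count=3)
```

Under the count-based policy, a burst of builds pushes out the log for build
1 even though it is only three days old, while the age-based policy keeps it.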

> > The result is that honestly, I mostly ignore the Centos CI results.
> > They almost might as well not be run...
> 
> This is definitely not what we want, so lets fix the problems.

Yeah, hence my rant...

> > Let's talk about CI more on a near time concall (it would help if
> > Niels and Jiffin could join a call to talk about this, our next call
> > might be too soon for that).
> 
> Tuesdays tend to be very busy for me, and I am not sure I can join the
> call next week. Arthy did some work on the jobs in the CentOS CI, she could
> probably work with Jiffin to make any changes that improve the experience
> for you. I'm happy to help out where I can too, of course :-)

If we can figure out another time to have a CI call, that would be helpful.
It would be good to pull in Patrice from CEA as well as anyone else who
cares.

It would really help if we could have someone with better time zone overlap
with me who could manage the CI stuff, but that may not be realistic.

Frank

P.S. Here's another patch that had a failure. In this case, it looks like
the CI ran again and passed the second time:

https://review.gerrithub.io/#/c/377712/

Log:

Triggered by Gerrit: https://review.gerrithub.io/377712 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs-ganesha_trigger-fsal_gluster
[nfs-ganesha_trigger-fsal_gluster] $ /bin/sh -xe /tmp/jenkins2362831021118510052.sh
+ set +x
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  1735  100  1735    0     0   9161      0 --:--:-- --:--:-- --:--:--  9179
Traceback (most recent call last):
  File "bootstrap.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
https://ci.centos.org/job/nfs-ganesha_trigger-fsal_gluster/3487//console : FAILED
Build step 'Execute shell' marked build as failure
Finished: FAILURE

It also had this failed run, which didn't seem to result in a Verify -1:

Started by upstream project "nfs-ganesha_trigger-cthon04-on-new-patch" build number 1822
originally caused by:
 Triggered by Gerrit: https://review.gerrithub.io/377712 in silent mode.
[EnvInject] - Loading node environment variables.
Building remotely on nfs-ganesha-ci-slave01 (nfs-ganesha) in workspace /home/nfs-ganesha/workspace/nfs_ganesha_cthon04
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Done
[nfs_ganesha_cthon04] $ /bin/sh -xe /tmp/jenkins2858843538932928042.sh
+ curl -o jenkins-job.py https://raw.githubusercontent.com/nfs-ganesha/ci-tests/centos-ci/common-scripts/basic-gluster-duffy.py
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  3191  100  3191    0     0  14753      0 --:--:-- --:--:-- --:--:-- 14773
+ python jenkins-job.py
Traceback (most recent call last):
  File "jenkins-job.py", line 33, in <module>
    b=json.loads(dat)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Build step 'Execute shell' marked build as failure
Finished: FAILURE

This really is a prime example of why I'm led to ignore the CentOS CI
results...


