To avoid blocking other patches on CQ, I've sent [1], which doubles the amount of space on the iSCSI SD (with the patch it will have 40GB).
As a side note, we use the same configuration on the master suite, which may explain why we don't see the issue there.

[1] https://gerrit.ovirt.org/#/c/95922/

On Sun, Dec 2, 2018 at 5:41 PM Gal Ben Haim <gbenh...@redhat.com> wrote:

> Below you can find 2 jobs, one that succeeded and one that failed on the
> iSCSI issue.
> Both were triggered by unrelated patches.
>
> Success -
> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3546/
> Failure -
> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3544/
>
>
> On Sun, Dec 2, 2018 at 2:37 PM Gal Ben Haim <gbenh...@redhat.com> wrote:
>
>> Raz, thanks for the investigation.
>> I'll send a patch to increase the LUN size.
>>
>> On Sun, Dec 2, 2018 at 1:27 PM Nir Soffer <nsof...@redhat.com> wrote:
>>
>>> On Sun, Dec 2, 2018, 10:44 Raz Tamir <rata...@redhat.com wrote:
>>>
>>>> After some analysis, I think the bug we are seeing here is
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1588061
>>>> This applies to suspend/resume and also to a snapshot with memory.
>>>> Following the steps and considering that the iSCSI storage domain is
>>>> only 20GB, this should be the reason for reaching ~4GB free space.
>>>>
>>>
>>>
>>> The OST configuration should change so that it will not fail because of
>>> such bugs.
>>>
>>
>> I disagree. The purpose of OST is to catch bugs, not to cover them up.
>>
>>>
>>> iSCSI storage can be created using sparse files, not consuming any
>>> resources until you write to the LVs, so having a 100G storage domain
>>> costs nothing.
>>>
>>
>> OST uses sparse files.
>>
>>>
>>> Nir
>>>
>>>
>>>> On Fri, Nov 30, 2018 at 10:01 PM Raz Tamir <rata...@redhat.com> wrote:
>>>>
>>>>> On Fri, Nov 30, 2018, 21:57 Ryan Barry <rba...@redhat.com wrote:
>>>>>
>>>>>> On Fri, Nov 30, 2018 at 2:31 PM Raz Tamir <rata...@redhat.com> wrote:
>>>>>>
>>>>>>> On Fri, Nov 30, 2018, 19:33 Dafna Ron <d...@redhat.com wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This mail is to provide the current status of CQ and allow people
>>>>>>>> to review the status before and after the weekend.
>>>>>>>> Please refer to the colour map below for further information on the
>>>>>>>> meaning of the colours.
>>>>>>>>
>>>>>>>> *CQ-4.2*: RED (#1)
>>>>>>>>
>>>>>>>> I checked the last date on which ovirt-engine and vdsm passed and
>>>>>>>> moved packages to tested, as they are the bigger projects, and it
>>>>>>>> was 27-11-2018.
>>>>>>>>
>>>>>>>> We have been having sporadic failures for most of the projects on
>>>>>>>> test check_snapshot_with_memory.
>>>>>>>> We have deduced that this is caused by a code regression in
>>>>>>>> storage, based on the following:
>>>>>>>> 1. Evgheni and Gal helped debug this issue to rule out lago and
>>>>>>>> infra issues as the cause of failure, and both determined the issue
>>>>>>>> is a code regression - most likely in storage.
>>>>>>>> 2. The failure only happens on the 4.2 branch.
>>>>>>>> 3. The failure itself is that a VM cannot run due to low disk space
>>>>>>>> in the storage domain, and we cannot see any failures which would
>>>>>>>> leave any leftovers in the storage domain.
>>>>>>>>
>>>>>>> Can you please share the link to the execution?
>>>>>>>
>>>>>>
>>>>>> Here's an example of one run:
>>>>>> https://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3550/
>>>>>>
>>>>>> The iSCSI storage domain starts emitting warnings about low storage
>>>>>> space immediately after removing the VmPool, but it's possible that the
>>>>>> storage domain is filling up before that, from some earlier call which
>>>>>> is still running, possibly the VM import.
>>>>>>
>>>>> Thanks Ryan, I'll try to help with debugging this issue.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> Dan and Ryan are actively involved in trying to find the regression,
>>>>>>>> but the consensus is that this is a storage-related regression and
>>>>>>>> *we are having a problem getting the storage team to join us in
>>>>>>>> debugging the issue.*
>>>>>>>>
>>>>>>>> I prepared a patch to skip the test in case we cannot get
>>>>>>>> cooperation from the storage team and resolve this regression in the
>>>>>>>> next few days:
>>>>>>>> https://gerrit.ovirt.org/#/c/95889/
>>>>>>>>
>>>>>>>> *CQ-Master:* YELLOW (#1)
>>>>>>>>
>>>>>>>> We have failures which CQ is still bisecting, and until it's done we
>>>>>>>> cannot point to any specific failing projects.
>>>>>>>>
>>>>>>>>
>>>>>>>> Happy week!
>>>>>>>> Dafna
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------
>>>>>>>> COLOUR MAP
>>>>>>>>
>>>>>>>> Green = job has been passing successfully
>>>>>>>>
>>>>>>>> ** green for more than 3 days may suggest we need a review of our
>>>>>>>> test coverage
>>>>>>>>
>>>>>>>> 1. 1-3 days GREEN (#1)
>>>>>>>> 2. 4-7 days GREEN (#2)
>>>>>>>> 3. Over 7 days GREEN (#3)
>>>>>>>>
>>>>>>>> Yellow = intermittent failures for different projects but no
>>>>>>>> lasting or current regressions
>>>>>>>>
>>>>>>>> ** intermittent would be a healthy project as we expect a number of
>>>>>>>> failures during the week
>>>>>>>>
>>>>>>>> ** I will not report any of the solved failures or regressions.
>>>>>>>>
>>>>>>>> 1. Solved job failures YELLOW (#1)
>>>>>>>> 2. Solved regressions YELLOW (#2)
>>>>>>>>
>>>>>>>> Red = job has been failing
>>>>>>>>
>>>>>>>> ** Active Failures. The colour will change based on the amount of
>>>>>>>> time the project/s has been broken. Only active regressions would be
>>>>>>>> reported.
>>>>>>>>
>>>>>>>> 1. 1-3 days RED (#1)
>>>>>>>> 2. 4-7 days RED (#2)
>>>>>>>> 3. Over 7 days RED (#3)
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Ryan Barry
>>>>>>
>>>>>> Associate Manager - RHV Virt/SLA
>>>>>>
>>>>>> rba...@redhat.com M: +16518159306 IM: rbarry
>>>>>> <https://red.ht/sig>
>>>>>>
>>>>>
>>>>
>>>> --
>>>>
>>>> Raz Tamir
>>>> Manager, RHV QE
>>>> _______________________________________________
>>>> Devel mailing list -- devel@ovirt.org
>>>> To unsubscribe send an email to devel-le...@ovirt.org
>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>> oVirt Code of Conduct:
>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>> List Archives:
>>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/6EFAA4LR743GLDGGNVCK2PEOHL7USLB7/
>>>>
>>> _______________________________________________
>>> Devel mailing list -- devel@ovirt.org
>>> To unsubscribe send an email to devel-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/ZNMZS7V2TLRRXTYJ4EQ3R44Z634IL62T/
>>>
>>
>>
>> --
>> *GAL bEN HAIM*
>> RHV DEVOPS
>>
>
>
> --
> *GAL bEN HAIM*
> RHV DEVOPS
>

--
*GAL bEN HAIM*
RHV DEVOPS