Re: Problem with s390x koji builder

2021-01-20 Thread Florian Weimer
* Jakub Jelinek:

> It is definitely not valid OpenMP, because it is racy (that if (rc) part
> with tasks writing that var).
> It would need to use atomic accesses to rc, like:
>   #pragma omp atomic write
>   rc = pkg->rc;
> instead of #pragma omp critical and
>   rpmRC testrc;
>   #pragma omp atomic read
>   testrc = rc;
>   if (testrc)
>   break;
> But, that shouldn't be the reason why it crashed.

Is there anything I can do to help to debug this?

Debuginfo quality around there is very poor; for example, I can't see
whether the value of npkgs is correct at the point where packageBinaries
calls GOMP_parallel.

Is there a way to get the type of the struct that contains the per-task
data?

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org


Re: Problem with s390x koji builder

2021-01-20 Thread Petr Pisar
Forwarded to .

-- Petr


Re: Problem with s390x koji builder

2021-01-20 Thread Jakub Jelinek
On Wed, Jan 20, 2021 at 11:18:16AM +0100, Florian Weimer wrote:
> So it's a null pointer dereference.
> 
> 
> 745   * (largest first) to help achieve an optimal load distribution.
> 746   */
> 747  rpmRC packageBinaries(rpmSpec spec, const char *cookie, int cheating)
> 748  {
> 749      rpmRC rc = RPMRC_OK;
> 750      Package pkg;
> 751      Package *tasks;
> 752      int npkgs = 0;
> 753
> 754      for (pkg = spec->packages; pkg != NULL; pkg = pkg->next)
> 755          npkgs++;
> 756      tasks = xcalloc(npkgs, sizeof(Package));
> 757
> 758      pkg = spec->packages;
> 759      for (int i = 0; i < npkgs; i++) {
> 760          tasks[i] = pkg;
> 761          pkg = pkg->next;
> 762      }
> 763      qsort(tasks, npkgs, sizeof(Package), compareBinaries);
> 764
> 765      #pragma omp parallel
> 766      #pragma omp single
> 767      for (int i = 0; i < npkgs; i++) {
> 768          Package pkg = tasks[i];
> 769          #pragma omp task untied priority(i)
> 770          {
> 771          pkg->rc = packageBinary(spec, pkg, cookie, cheating, &pkg->filename);
> 772          rpmlog(RPMLOG_DEBUG,
> 773                  _("Finished binary package job, result %d, filename %s\n"),
> 774                  pkg->rc, pkg->filename);
> 775          if (pkg->rc) {
> 776              #pragma omp critical
> 777              rc = pkg->rc;
> 778          }
> 779          } /* omp task */
> 780          if (rc)
> 781              break;
> 782      }

It is definitely not valid OpenMP, because it is racy (that if (rc) part
with tasks writing that var).
It would need to use atomic accesses to rc, like:
  #pragma omp atomic write
  rc = pkg->rc;
instead of #pragma omp critical and
  rpmRC testrc;
  #pragma omp atomic read
  testrc = rc;
  if (testrc)
    break;
But, that shouldn't be the reason why it crashed.
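
For reference, here is a minimal, self-contained sketch of the atomic
read/write pattern described above. The names run_job() and run_all() are
made up for illustration and only stand in for packageBinary() and
packageBinaries(); this is not the rpm code, and as noted it is not
expected to have anything to do with the crash. Build with -fopenmp;
without it the pragmas are ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Hypothetical stand-in for packageBinary(): job 3 fails with code 1. */
static int run_job(int i)
{
    return i == 3 ? 1 : 0;
}

/* Hypothetical stand-in for packageBinaries(): returns 0 or the error code. */
int run_all(int njobs)
{
    int rc = 0;

    assert(njobs >= 0);

    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < njobs; i++) {
        #pragma omp task firstprivate(i) shared(rc)
        {
            int job_rc = run_job(i);
            if (job_rc) {
                /* Tasks publish a failure with an atomic write ... */
                #pragma omp atomic write
                rc = job_rc;
            }
        }
        /* ... and the task-generating thread polls it with an atomic read. */
        int testrc;
        #pragma omp atomic read
        testrc = rc;
        if (testrc)
            break;
    }
    /* All tasks have completed by the end of the parallel region. */
    return rc;
}
```

With this stand-in, run_all(8) reports the failure of job 3 and
run_all(3) returns 0, regardless of how many threads are used. The early
break is only a best-effort shortcut: tasks that were already queued
still run.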

Jakub


Re: Problem with s390x koji builder

2021-01-20 Thread Florian Weimer
* Juan Orti Alcaine:

> I also got the same error:
>
> https://koji.fedoraproject.org/koji/taskinfo?taskID=60083125

I can reproduce it locally.  It's a crash in rpmbuild:

Thread 1 "rpmbuild" received signal SIGSEGV, Segmentation fault.
0x03fffde89e60 in packageBinaries._omp_fn.0 () at pack.c:780
780 if (rc)
(gdb) bt
#0  0x03fffde89e60 in packageBinaries._omp_fn.0 () at pack.c:780
#1  0x03fffca94806 in GOMP_parallel (
fn=0x3fffde89d80 , data=0x3fff478, 
num_threads=2, flags=) at ../../../libgomp/parallel.c:178
#2  0x03fffde953fa in packageBinaries (cheating=0, cookie=0x0, 
spec=0x2aa00065570) at pack.c:765
#3  buildSpec (ts=, buildArgs=, 
spec=0x2aa00065570, what=) at build.c:411
#4  0x03fffde98074 in rpmSpecBuild (ts=, 
spec=, buildArgs=) at build.c:452
#5  0x02aa3e74 in buildForTarget (ts=0x2aa00069a80, 
arg=, ba=0x2aa7990 ) at rpmbuild.c:500
#6  0x02aa409a in build (ts=0x2aa00069a80, 
arg=0x3fffe3a "/builddir/build/SPECS/compsize.spec", rcfile=0x0, 
ba=0x2aa7990 ) at rpmbuild.c:552
#7  0x02aa2f84 in main (argc=, argv=)
at rpmbuild.c:690
(gdb) print rc
No symbol "rc" in current context.

As you can see, something is broken with the debugging information.


(gdb) disassemble 
Dump of assembler code for function packageBinaries._omp_fn.0:
   0x03fffde89d80 <+0>:     stmg    %r6,%r15,48(%r15)
   0x03fffde89d86 <+6>:     ear     %r1,%a0
   0x03fffde89d8a <+10>:    lgr     %r14,%r15
   0x03fffde89d8e <+14>:    lay     %r15,-280(%r15)
   0x03fffde89d94 <+20>:    aghi    %r14,-24
   0x03fffde89d98 <+24>:    std     %f8,0(%r14)
   0x03fffde89d9c <+28>:    std     %f10,8(%r14)
   0x03fffde89da0 <+32>:    std     %f14,16(%r14)
   0x03fffde89da4 <+36>:    sllg    %r1,%r1,32
   0x03fffde89daa <+42>:    ear     %r1,%a1
   0x03fffde89dae <+46>:    l       %r11,32(%r2)
   0x03fffde89db2 <+50>:    stg     %r1,200(%r15)
   0x03fffde89db8 <+56>:    lg      %r10,16(%r2)
   0x03fffde89dbe <+62>:    l       %r13,24(%r2)
   0x03fffde89dc2 <+66>:    mvc     248(8,%r15),40(%r1)
   0x03fffde89dc8 <+72>:    ld      %f8,8(%r2)
   0x03fffde89dcc <+76>:    ld      %f10,0(%r2)
   0x03fffde89dd0 <+80>:    lgr     %r8,%r2
   0x03fffde89dd4 <+84>:    brasl   %r14,0x3fffde87530
   0x03fffde89dda <+90>:    cije    %r2,0,0x3fffde89e76
   0x03fffde89de0 <+96>:    cijnh   %r11,0,0x3fffde89e76
   0x03fffde89de6 <+102>:   stg     %r8,192(%r15)
   0x03fffde89dec <+108>:   la      %r7,208(%r15)
   0x03fffde89df0 <+112>:   la      %r1,28(%r8)
   0x03fffde89df4 <+116>:   lgr     %r8,%r7
   0x03fffde89df8 <+120>:   lgdr    %r7,%f8
   0x03fffde89dfc <+124>:   ldgr    %f14,%r1
   0x03fffde89e00 <+128>:   lhi     %r9,0
   0x03fffde89e04 <+132>:   lg      %r1,0(%r10)
   0x03fffde89e0a <+138>:   std     %f10,208(%r15)
   0x03fffde89e0e <+142>:   std     %f14,224(%r15)
   0x03fffde89e12 <+146>:   stg     %r1,232(%r15)
   0x03fffde89e18 <+152>:   st      %r13,240(%r15)
   0x03fffde89e1c <+156>:   stg     %r7,216(%r15)
   0x03fffde89e22 <+162>:   lgfr    %r1,%r9
   0x03fffde89e26 <+166>:   stg     %r1,184(%r15)
   0x03fffde89e2c <+172>:   mvghi   176(%r15),0
   0x03fffde89e32 <+178>:   mvghi   168(%r15),17
   0x03fffde89e38 <+184>:   mvghi   160(%r15),1
   0x03fffde89e3e <+190>:   lgr     %r3,%r8
   0x03fffde89e42 <+194>:   lghi    %r6,8
   0x03fffde89e46 <+198>:   lghi    %r5,40
   0x03fffde89e4a <+202>:   lghi    %r4,0
   0x03fffde89e4e <+206>:   larl    %r2,0x3fffde910c0
   0x03fffde89e54 <+212>:   brasl   %r14,0x3fffde87c70
   0x03fffde89e5a <+218>:   lg      %r1,192(%r15)
=> 0x03fffde89e60 <+224>:   lt      %r1,28(%r1)
   0x03fffde89e66 <+230>:   jne     0x3fffde89e76
   0x03fffde89e6a <+234>:   ahi     %r9,1
   0x03fffde89e6e <+238>:   aghi    %r10,8
   0x03fffde89e72 <+242>:   brct    %r11,0x3fffde89e04
   0x03fffde89e76 <+246>:   lg      %r1,200(%r15)
   0x03fffde89e7c <+252>:   clc     248(8,%r15),40(%r1)
   0x03fffde89e82 <+258>:   jne     0x3fffde89e9e
   0x03fffde89e86 <+262>:   ld      %f8,256(%r15)
   0x03fffde89e8a <+266>:   ld      %f10,264(%r15)
   0x03fffde89e8e <+270>:   ld      %f14,272(%r15)
   0x03fffde89e92 <+274>:   lmg     %r6,%r15,328(%r15)
   0x03fffde89e98 <+280>:   jg      0x3fffde895f0
   0x03fffde89e9e <+286>:   brasl   %r14,0x3fffde88810 <__stack_chk_fail@plt>
(gdb) print/x $r1
$1 = 0x0

So it's a null pointer dereference.
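
To make the faulting instruction easier to follow, here is a rough
reconstruction of the two data blocks the outlined function appears to
use, read off the offsets in the disassembly above. The struct and field
names are invented (the debuginfo does not show the real ones), the
assignment of fields to source variables is a guess, and pointer-sized
fields are written as uint64_t so the offsets do not depend on the
machine this sketch is compiled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the block passed in %r2 to packageBinaries._omp_fn.0. */
struct parallel_data {
    uint64_t arg0;      /* offset 0:  8-byte value (spec or cookie, a guess)  */
    uint64_t arg8;      /* offset 8:  8-byte value (spec or cookie, a guess)  */
    uint64_t tasks;     /* offset 16: pointer advanced by 8 per iteration     */
    int32_t  cheating;  /* offset 24: 32-bit value copied into each task      */
    int32_t  rc;        /* offset 28: the word tested by lt %r1,28(%r1)       */
    int32_t  npkgs;     /* offset 32: loop count used by brct                 */
};

/* Hypothetical layout of the 40-byte per-task block built at 208(%r15). */
struct task_data {
    uint64_t arg0;      /* copied from parallel_data offset 0                 */
    uint64_t arg8;      /* copied from parallel_data offset 8                 */
    uint64_t rc_ptr;    /* address of parallel_data.rc (la %r1,28(%r8))       */
    uint64_t pkg;       /* tasks[i]                                           */
    int32_t  cheating;
};

static_assert(offsetof(struct parallel_data, rc) == 28, "rc offset");
static_assert(offsetof(struct task_data, cheating) == 32, "cheating offset");

/* Mirrors `if (rc)` at pack.c:780: rc is read through the data pointer,
 * so a null data pointer faults on this load. */
int rc_is_set(const struct parallel_data *d)
{
    return d->rc != 0;
}
```

Under this reading, the pointer that is null at the crash is the
parallel_data pointer that was saved at 192(%r15) before the call at
<+212> and reloaded afterwards, not rc itself.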


745   * (largest first) to help achieve an optimal load distribution.
746   */
747  rpmRC packageBinaries(rpmSpec spec, const char *cookie, int cheating)
748  {
749      rpmRC rc = RPMRC_OK;
750      Package pkg;
751      Package *tasks;
752      int npkgs = 0;
753
754      for (pkg = spec->packages; pkg != NULL; pkg = 

Re: Problem with s390x koji builder

2021-01-20 Thread Juan Orti Alcaine
I also got the same error:

https://koji.fedoraproject.org/koji/taskinfo?taskID=60083125

On Wed, 20 Jan 2021 at 9:35, Vascom () wrote:

> What's wrong with s390x koji builder?
>
> I always see this error:
>
> Child return code was: -11
> EXCEPTION: [Error()]
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/site-packages/mockbuild/trace_decorator.py", line 93, in trace
>     result = func(*args, **kw)
>   File "/usr/lib/python3.9/site-packages/mockbuild/util.py", line 600, in do_with_status
>     raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode)
> mockbuild.exception.Error: Command failed:
>
> Other arches are OK.
>
> https://koji.fedoraproject.org/koji/taskinfo?taskID=60085705
>
> Need to fix it.
>
> --
> Best regards,
> Vasiliy Glazov


Problem with s390x koji builder

2021-01-20 Thread Vascom
What's wrong with s390x koji builder?

I always see this error:

Child return code was: -11
EXCEPTION: [Error()]
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/mockbuild/trace_decorator.py", line 93, in trace
    result = func(*args, **kw)
  File "/usr/lib/python3.9/site-packages/mockbuild/util.py", line 600, in do_with_status
    raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode)
mockbuild.exception.Error: Command failed:

Other arches are OK.

https://koji.fedoraproject.org/koji/taskinfo?taskID=60085705

Need to fix it.
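
A note on the "Child return code was: -11" line: the traceback passes
child.returncode along, and Python's subprocess module reports a child
killed by signal N as -N. On Linux signal 11 is SIGSEGV, so -11 would
mean the child process segfaulted. The sketch below shows the same
convention in plain C; the function name is made up for illustration and
the child here crashes on purpose.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that dies from SIGSEGV and report it
 * the way Python's subprocess does: exit status, or minus the signal number.
 * Returns 256 (outside both ranges) if fork or waitpid fails. */
int run_child_that_segfaults(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 256;

    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);  /* make sure the default action applies */
        raise(SIGSEGV);            /* simulate the crash */
        _exit(0);                  /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return 256;

    assert(WIFEXITED(status) || WIFSIGNALED(status));
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);  /* killed by a signal: negative number */
    return WEXITSTATUS(status);    /* normal exit: 0..255 */
}
```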

--
Best regards,
Vasiliy Glazov