inaddy@ocfs2defrag:~$ sudo cat /proc/1452/stack
[<0>] 0x0

inaddy@ocfs2defrag:~$ while true; do sudo cat /proc/1452/stack; done > bleh.txt
inaddy@ocfs2defrag:~$ cat bleh.txt | sort -u
[<0>] 0x0

Stack does not help me at all (not being updated in execution path from
the user<-> kernel context switch). Still, process is consuming100% of
user time:

%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,
0.0 st

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1452 root      20   0    1936    428    360 R 100.0   0.0   6:27.67 
defragfs.ocfs2   

Initial check on time spend scheduling to discard kernel issue like I
suspected before:

inaddy@ocfs2defrag:~$ sudo perf sched timehist -p 1452

           time    cpu  task name                       wait time  sch delay   
run time
                        [tid/pid]                          (msec)     (msec)    
 (msec)
--------------- ------  ------------------------------  ---------  ---------  
---------
     111.555973 [0003]  defragfs.ocfs2[1452]                0.000      0.000    
  0.000
     112.123934 [0003]  defragfs.ocfs2[1452]                0.029      0.000    
567.930
     113.571854 [0003]  defragfs.ocfs2[1452]                0.018      0.000   
1447.901
     115.139790 [0003]  defragfs.ocfs2[1452]                0.027      0.000   
1567.908
     115.555743 [0003]  defragfs.ocfs2[1452]                0.030      0.000    
415.921
     116.123707 [0003]  defragfs.ocfs2[1452]                0.027      0.000    
567.936
     116.867670 [0003]  defragfs.ocfs2[1452]                0.019      0.000    
743.943
     117.571630 [0003]  defragfs.ocfs2[1452]                0.025      0.000    
703.934
     118.243597 [0003]  defragfs.ocfs2[1452]                0.026      0.000    
671.940
     118.443583 [0003]  defragfs.ocfs2[1452]                0.028      0.000    
199.957
     119.587530 [0003]  defragfs.ocfs2[1452]                0.017      0.000   
1143.929
     120.123500 [0003]  defragfs.ocfs2[1452]                0.023      0.000    
535.945
     121.571458 [0003]  defragfs.ocfs2[1452]                0.016      0.000   
1447.941
     123.587358 [0003]  defragfs.ocfs2[1452]                0.027      0.000   
2015.871
     124.123365 [0003]  defragfs.ocfs2[1452]                0.036      0.000    
535.970
     125.123284 [0003]  defragfs.ocfs2[1452]                0.020      0.000    
999.899
     125.603259 [0003]  defragfs.ocfs2[1452]                0.032      0.000    
479.942
     126.883212 [0003]  defragfs.ocfs2[1452]                0.029      0.000   
1279.924
     127.587179 [0003]  defragfs.ocfs2[1452]                0.028      0.000    
703.938
     128.123153 [0003]  defragfs.ocfs2[1452]                0.027      0.000    
535.946 

It spends almost no time in waiting for CPU AND absolutely no time in
scheduling (as its the only task currently really running), so its
something in userland indeed.... debugging it:

(gdb)

#0  0x0000ffffbf618b10 in _getopt_internal_r 
    (argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
    longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 
<getopt_data>, 
    posixly_correct=<optimized out>)
    at getopt.c:508

#1  0x0000ffffbf618f0c in _getopt_internal
    (argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
    longopts=<optimized out>, longind=<optimized out>, long_only=<optimized 
out>, 
    posixly_correct=<optimized out>)
    at getopt.c:711

#2  0x0000aaaaaaaab43c in parse_opt
    (_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585

#3  main (argc=3, argv=0xfffffffff868) at main.c:653

after some time:

(gdb) bt

#0  0x0000ffffbf618950 in _getopt_internal_r 
    (argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
    longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 
<getopt_data>, 
    posixly_correct=<optimized out>)
    at getopt.c:500

#1  0x0000ffffbf618f0c in _getopt_internal 
    (argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
    longopts=<optimized out>, longind=<optimized out>, long_only=<optimized 
out>, 
    posixly_correct=<optimized out>)
    at getopt.c:711

#2  0x0000aaaaaaaab43c in parse_opt
    (_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585

#3  main (argc=3, argv=0xfffffffff868) at main.c:653

after some more time:

(gdb) bt

#0  0x0000ffffbf618cec in _getopt_internal_r 
    (argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
    longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 
<getopt_data>,   
    posixly_correct=<optimized out>)
    at getopt.c:512

#1  0x0000ffffbf618f0c in _getopt_internal
    (argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
    longopts=<optimized out>, longind=<optimized out>, long_only=<optimized 
out>, 
    posixly_correct=<optimized out>)
    at getopt.c:711

#2  0x0000aaaaaaaab43c in parse_opt 
    (_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585

#3  main (argc=3, argv=0xfffffffff868) at main.c:653

Checking glibc getopt.c code (around lines 500 <-> 512) I can see we are
stuck in this loop:

  if (d->__nextchar == NULL || *d->__nextchar == '\0')
    {
      /* Advance to the next ARGV-element.  */
      ...
    }

And I'll try to discover why tomorrow (and report upstream).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840958

Title:
  defragfs.ocfs2 hangs (or takes too long) on arm64, ppc64el

To manage notifications about this bug go to:
https://bugs.launchpad.net/ocfs2-tools/+bug/1840958/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to