I am currently running lftp 4.6.0, built from source, on CentOS 7, but I have
had this problem with earlier versions as well.
The client runs on a virtual machine (VMware).

Every now and then, when sending files over FTP, lftp (in batch mode) hangs
and uses 100% CPU on one core until it is killed.
SIGHUP has no effect; SIGINT terminates it.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  384 username  20   0   25080   2860   1932 R  93.0  0.1 108:11.80 lftp
 2543 username  20   0   25080   2860   1924 R  33.0  0.1  92:00.84 lftp

Typical batch file looks like this:

set dns:fatal-timeout 30
set net:max-retries 4
set cmd:fail-exit true
open hostname
user username password
put  "File1.pdf" -o "/path//File1.pdf"
put  "File2.pdf" -o "/path//File2.pdf"

Invoked with "lftp -f {path_to_batchfile} > {path_to_logfile} 2>&1" as a 
non-root user.
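
Until the cause is found, one possible stopgap is to put a hard time limit on
each invocation. The following is only a sketch: it assumes timeout(1) from GNU
coreutils is available, and the function name run_with_limit, the limit and the
10-second grace period are made up for illustration. It sends SIGINT because
that is the signal reported above to terminate the hung process.

```shell
#!/bin/sh
# Hypothetical helper: run_with_limit SECONDS LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdout and stderr sent to LOGFILE, under a time limit.
run_with_limit() {
    limit=$1
    log=$2
    shift 2
    # timeout(1): send SIGINT when the limit expires, then SIGKILL
    # 10 seconds later if the command is still running.
    timeout -s INT -k 10 "$limit" "$@" > "$log" 2>&1
    status=$?
    # timeout exits with status 124 when the command timed out.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >> "$log"
    fi
    return "$status"
}
```

With the invocation above this would be called as
run_with_limit 600 {path_to_logfile} lftp -f {path_to_batchfile}
where 600 seconds is an arbitrary example value. This does not fix the hang;
it only keeps a stuck process from burning CPU indefinitely.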

The {path_to_logfile} is 0 bytes when the process hangs.

My guess is that it is some kind of timing issue or race condition in which an
error goes undetected, but it is very difficult to provide more information,
since the hang is rare and I cannot reliably reproduce it.
I make a large number of lftp calls daily and the hang occurs only
intermittently, so turning on debugging everywhere would produce a lot of
output that is almost never needed. For that reason I have not enabled it for
most transfers.
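
One way to limit the extra output would be to log every transfer with debugging
on but keep the log only when the transfer does not finish cleanly. This is a
rough sketch, not something taken from lftp itself: the function name
run_keep_log_on_failure is hypothetical, and it assumes debugging is switched on
inside the batch file with lftp's "debug" command and that cmd:fail-exit makes
lftp exit non-zero on errors.

```shell
#!/bin/sh
# Hypothetical helper: run_keep_log_on_failure LOGFILE COMMAND [ARGS...]
# Captures stdout and stderr in LOGFILE and deletes it if COMMAND succeeds.
run_keep_log_on_failure() {
    log=$1
    shift
    "$@" > "$log" 2>&1
    status=$?
    # Exit status 0 means a clean run, so the log is not worth keeping.
    if [ "$status" -eq 0 ]; then
        rm -f -- "$log"
    fi
    return "$status"
}
```

A hung process never reaches the cleanup step, so its log would still be on
disk when the process is inspected or killed.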

Any ideas what could cause these hangs?

I have not compiled with debugging info, but if I attach with gdb and then
repeatedly continue and interrupt, the process is almost always stopped
inside libc (apologies if this is misleading information):

(gdb) where
#0  0x0000000000477c9e in ?? ()
#1  0x0000000000475f5e in ?? ()
#2  0x0000000000473fdf in ?? ()
#3  0x000000000045831d in ?? ()
#4  0x00000000004586a0 in ?? ()
#5  0x0000000000423bc6 in ?? ()
#6  0x0000000000423bfe in ?? ()
#7  0x00000000004115c6 in ?? ()
#8  0x0000000000448721 in ?? ()
#9  0x0000000000448929 in ?? ()
#10 0x000000000040e03d in ?? ()
#11 0x00000000004060b3 in ?? ()
#12 0x00007f2d4d41bb15 in __libc_start_main () from /lib64/libc.so.6
#13 0x0000000000407575 in ?? ()
(gdb) n
Cannot find bounds of current function
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007f2d4d441957 in vfprintf () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function vfprintf,
which has no line number information.
0x00007f2d4d46e589 in vsnprintf () from /lib64/libc.so.6
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007f2d4d442ac0 in vfprintf () from /lib64/libc.so.6
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007f2d4d420489 in __gconv_transform_ascii_internal () from /lib64/libc.so.6
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007f2d4d441967 in vfprintf () from /lib64/libc.so.6
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x0000000000478e40 in ?? ()
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007f2d4d4418a9 in vfprintf () from /lib64/libc.so.6
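
The manual continue/interrupt cycle above could be automated by attaching gdb
in batch mode several times and collecting the backtraces. This is a sketch
only: sample_backtraces is a hypothetical helper, the GDB variable is an
assumption added so that a different gdb binary can be substituted, and frames
inside lftp will still show as ?? unless lftp is rebuilt with debugging info.

```shell
#!/bin/sh
# Hypothetical helper: sample_backtraces PID [COUNT] [INTERVAL_SECONDS]
# Attaches gdb COUNT times and prints one backtrace per attach.
sample_backtraces() {
    pid=$1
    count=${2:-10}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        # -batch makes gdb run the -ex commands, then detach and exit.
        "${GDB:-gdb}" -p "$pid" -batch -ex bt 2>/dev/null
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, sample_backtraces 384 20 1 | grep '^#0' lists the innermost frame
of each sample, which shows whether the process is always spinning in the same
place.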

/proc/{pid}/io of the lftp process doesn't change:

rchar: 612533
wchar: 131756
syscr: 101
syscw: 28
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0

/proc/{pid}/sched of the lftp process changes:

lftp ({pid}, #threads: 1)
-------------------------------------------------------------------
se.exec_start                                :    4812627832.697335 (increases)
se.vruntime                                  :    5842041293.625487 (increases)
se.sum_exec_runtime                          :       6292136.992748 (increases)
nr_switches                                  :               425117 (increases)
nr_voluntary_switches                        :                  303
nr_involuntary_switches                      :               424814 (increases)
se.load.weight                               :                 1024
policy                                       :                    0
prio                                         :                  120
clock-delta                                  :                   34 (increases)
mm->numa_scan_seq                            :                    0
numa_migrations, 0
numa_faults_memory, 0, 0, 1, 0, -1
numa_faults_memory, 1, 0, 0, 0, -1
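
The pattern in the two listings above (I/O counters frozen while CPU time keeps
growing) could be checked automatically to flag a stuck process. This is a
minimal sketch under assumptions: the function names are hypothetical, it
relies on the /proc/{pid}/io and /proc/{pid}/sched layouts shown above, and it
treats "rchar plus wchar unchanged while se.sum_exec_runtime increased" as the
sign of a spin.

```shell
#!/bin/sh
# Hypothetical helper: sample_counters DIR
# DIR is a /proc/{pid} directory. Prints "<rchar+wchar> <se.sum_exec_runtime>".
sample_counters() {
    dir=$1
    io=$(awk '/^(rchar|wchar):/ { s += $2 } END { printf "%d", s }' "$dir/io")
    cpu=$(awk '$1 == "se.sum_exec_runtime" { print $3 }' "$dir/sched")
    echo "$io $cpu"
}

# Hypothetical helper: is_spinning IO_BEFORE CPU_BEFORE IO_AFTER CPU_AFTER
# Succeeds when the I/O total is unchanged but CPU time has increased.
is_spinning() {
    [ "$1" = "$3" ] && awk -v a="$2" -v b="$4" 'BEGIN { exit !(b > a) }'
}
```

A check could take one sample with before=$(sample_counters /proc/{pid}), wait
a few seconds, take a second sample into after, and then call
is_spinning $before $after (unquoted, so that each sample splits into its two
fields). A transfer that is merely slow should still show its I/O counters
moving.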


I've enabled debugging on some transfers now, hoping to catch another one of 
these hangs within a month or so.

Kind regards, and keep up the good work!
LFTP is really good
Mattias Bergvall