On 05/25/2012 04:52 PM, Dan Kenigsberg wrote:
On Fri, May 25, 2012 at 02:59:31AM +0800, ShaoHe Feng wrote:
On 05/23/2012 03:52 AM, Dan Kenigsberg wrote:
On Tue, May 22, 2012 at 05:02:01PM +0800, ShaoHe Feng wrote:
Both mountTests and parted_utils_tests failed.
Failed where? On your own host? Is it reproducible? We had a similar,
but transient, problem in
http://jenkins.ovirt.org/job/vdsm_unit_tests/143/console
Yes, it is reproducible.
But if I stop the sandbox service and restart, this problem
does not occur.
If I start the sandbox service, the problem comes up again.
Pardon, but I am not familiar with the sandbox service. Could you
describe a complete reproducer?

# chkconfig sandbox off
Then restart Fedora.
# ./run_tests.sh mountTests

No matter how many times the test runs, the problem does not come up.



# chkconfig sandbox on
Then restart Fedora.
# ./run_tests.sh mountTests
# ls /dev/loop*
There are 7 loop devices.
When I run the test 8 times, the problem comes up.

losetup: could not find any free loop device

However, after I create one more loop block device node with mknod, I can run
the test successfully one more time.
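
One way to see how close the host is to that limit is to compare the number of loop device nodes with the number of attached devices reported by `losetup -a`. The sketch below is only an illustration: the helper names `count_attached`, `count_nodes` and `report_loop_usage` are made up here and are not part of vdsm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of vdsm) to report loop device usage.

# Count attached loop devices from `losetup -a` style output on stdin.
# Each attached device is one line starting with "/dev/loopN:".
count_attached() {
    grep -c '^/dev/loop[0-9][0-9]*:' || true
}

# Count loop device nodes (loop0, loop1, ...) in a directory, default /dev.
count_nodes() {
    dir=${1:-/dev}
    n=0
    for f in "$dir"/loop[0-9]*; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Print a one-line summary; `losetup -a` may need root on older systems.
report_loop_usage() {
    nodes=$(count_nodes /dev)
    used=$(losetup -a | count_attached)
    echo "loop nodes: $nodes, attached: $used, free: $((nodes - used))"
}
```

Running `report_loop_usage` before and after each test run would show whether the number of attached devices grows by one per run.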

We, or some other suite using the server, may be leaking a loop device.
Eyal Edri, do you know what has made this go away in run #144?

The test executes the 'mount' and 'umount' commands. After the umount
command, the loop device cannot be freed.

Here is the log:
   -------------------- >> begin captured logging << --------------------
   Storage.Misc.excCmd: DEBUG: 'dd if=/dev/zero of=/tmp/tmpH2KSCr bs=100M count=1' (cwd None)
   Storage.Misc.excCmd: DEBUG: SUCCESS: <err> = '1+0 records in\n1+0 records out\n104857600 bytes (105 MB) copied, 0.266024 s, 394 MB/s\n'; <rc> = 0
   Storage.Misc.excCmd: DEBUG: 'losetup -f --show /tmp/tmpH2KSCr' (cwd None)
   Storage.Misc.excCmd: DEBUG: FAILED: <err> = 'losetup: could not find any free loop device\n'; <rc> = 255
   --------------------- >> end captured logging << ---------------------
Does your Linux host have a trace of the generated loop device?
What does
     losetup -a
say?

# losetup  -a
/dev/loop0: [fd03]:918121 (/tmp/tmpeihztM)
/dev/loop1: [fd03]:918122 (/tmp/tmp43EVnb)
/dev/loop2: [fd03]:918123 (/tmp/tmpCoknYi)
/dev/loop3: [fd03]:918124 (/tmp/tmp_PFqBx)
/dev/loop4: [fd03]:918125 (/tmp/tmplVEPQs)
/dev/loop5: [fd03]:918126 (/tmp/tmpQrHVKH)
/dev/loop6: [fd03]:918127 (/tmp/tmpZkZJ7V)
/dev/loop7: [fd03]:918128 (/tmp/tmpmUSR26)

# losetup -d /dev/loop0
loop: can't delete device /dev/loop0: Device or resource busy
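
`losetup -d` reports the device as busy while something still holds it, and a leftover mount is one common holder. As a sketch (the helper name `loop_mounts` is hypothetical), the mount tables can be searched for the device. Passing every `/proc/<pid>/mounts` also covers mounts that live in another mount namespace; that this is where the leak sits is only an assumption, not a confirmed cause.

```shell
# loop_mounts DEVICE [MOUNTS_FILE...]
# Print "table: mountpoint" for every mount-table line whose source is DEVICE.
# With no table given, /proc/mounts is searched.
loop_mounts() {
    dev=$1
    shift
    [ $# -gt 0 ] || set -- /proc/mounts
    for tbl in "$@"; do
        # A process may exit between glob expansion and the read; skip it.
        [ -r "$tbl" ] || continue
        awk -v dev="$dev" -v tbl="$tbl" \
            '$1 == dev { print tbl ": " $2 }' "$tbl" 2>/dev/null
    done
}

# Example: loop_mounts /dev/loop0 /proc/[0-9]*/mounts
```

If the device shows up only in the mount table of some other process, the umount issued by the test did not reach that namespace.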

# lsof -L | grep loop
loop0     11198          root  cwd       DIR              253,3       4096          2 /
Who is process 11198?
It is a kernel thread.
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     11198  0.0  0.0      0     0 ?        S<   02:32   0:00 [loop0]


loop0     11198          root  rtd       DIR              253,3       4096          2 /
loop0     11198          root  txt   unknown                                          /proc/11198/exe
loop1     11287          root  cwd       DIR              253,3       4096          2 /
loop1     11287          root  rtd       DIR              253,3       4096          2 /
loop1     11287          root  txt   unknown                                          /proc/11287/exe
loop2     11309          root  cwd       DIR              253,3       4096          2 /
loop2     11309          root  rtd       DIR              253,3       4096          2 /
loop2     11309          root  txt   unknown                                          /proc/11309/exe
loop3     11327          root  cwd       DIR              253,3       4096          2 /
loop3     11327          root  rtd       DIR              253,3       4096          2 /
loop3     11327          root  txt   unknown                                          /proc/11327/exe
loop4     11350          root  cwd       DIR              253,3       4096          2 /
loop4     11350          root  rtd       DIR              253,3       4096          2 /
loop4     11350          root  txt   unknown                                          /proc/11350/exe
loop5     11372          root  cwd       DIR              253,3       4096          2 /
loop5     11372          root  rtd       DIR              253,3       4096          2 /
loop5     11372          root  txt   unknown                                          /proc/11372/exe
loop6     11391          root  cwd       DIR              253,3       4096          2 /
loop6     11391          root  rtd       DIR              253,3       4096          2 /
loop6     11391          root  txt   unknown                                          /proc/11391/exe
loop7     11408          root  cwd       DIR              253,3       4096          2 /
loop7     11408          root  rtd       DIR              253,3       4096          2 /
loop7     11408          root  txt   unknown                                          /proc/11408/exe
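
The `txt unknown` entries in the lsof output and the bracketed name in `ps` are typical of kernel threads, which have no executable and an empty command line. A small sketch of that check, assuming a Linux `/proc`; the function name `is_kernel_thread` is made up for illustration, and the test is a heuristic because zombie processes also have an empty command line.

```shell
# is_kernel_thread PID
# Succeeds if PID exists and /proc/PID/cmdline is empty, which is how
# kernel threads such as [loop0] appear. Heuristic only: a zombie
# process also has an empty cmdline.
is_kernel_thread() {
    pid=$1
    [ -e "/proc/$pid/cmdline" ] || return 1
    # cmdline is NUL-separated; strip the NULs before testing for emptiness.
    [ -z "$(tr -d '\000' < "/proc/$pid/cmdline" 2>/dev/null)" ]
}

# Example: is_kernel_thread 11198 && echo "kernel thread"
```

Since these threads are created by the loop driver for each attached device, they are a symptom of the device still being attached rather than a separate process holding it.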


Should I use strace to watch the syscalls and see what happens to
/dev/loop?
# strace -f -F -o ./strace.log ./run_tests.sh mountTests
Or is there any other way to get more info?
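
If an strace log is collected, most of it will be unrelated to loop devices. One possible way to narrow it down is to keep only the lines that mention a loop device, a loop ioctl, or a mount/umount call. The function name `loop_syscalls` and the pattern are illustrative assumptions; depending on the strace version, loop ioctls may be printed by name (`LOOP_SET_FD`, `LOOP_CLR_FD`) or as raw request numbers (`0x4c00`, `0x4c01`).

```shell
# loop_syscalls [LOGFILE...]
# Filter an strace log (file arguments or stdin) down to lines that touch
# loop devices: paths under /dev/loop*, LOOP_* ioctls (by name or as
# 0x4c0N request numbers), and mount/umount/umount2 calls.
loop_syscalls() {
    grep -E '/dev/loop|LOOP_[A-Z_]+|0x4c0[0-9a-f]|(^|[^a-z_])u?mount2?\(' "$@"
}

# Example: loop_syscalls ./strace.log
```

Because the log is written with `-f`, each line starts with the PID that made the call, which makes it possible to tell which process attached a device and whether the same process detached it.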

