[ 
https://issues.apache.org/jira/browse/MESOS-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732390#comment-14732390
 ] 

haosdent commented on MESOS-3349:
---------------------------------

The strange thing here is that "Device or resource busy" happens in 
"os::rmdir" even after the umount succeeds. Running "cat /proc/mounts" before 
and after "os::umount" shows that the mount point is removed from the mount 
table. Running "stat" on the mount point also shows different results before 
and after "os::umount". I also tried "lsof +D " and "fuser -vm" to find out 
which process still holds the folder before "os::rmdir", but could not find 
anything. 
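
The mount table check described above can also be done from code. The 
following is only a rough sketch: "isMountPoint" is a hypothetical helper, not 
a Mesos or stout function, and it assumes the usual /proc/mounts layout where 
the second whitespace-separated field of each line is the mount point.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Mesos or stout): returns true if
// `target` appears as a mount point (the second field) in a mount table
// written in /proc/mounts format. `mountsFile` is a parameter so that the
// check can also be run against a saved copy of the table.
// NOTE: /proc/mounts escapes spaces in paths as "\040"; this sketch does
// not decode such escapes.
bool isMountPoint(
    const std::string& target,
    const std::string& mountsFile = "/proc/mounts")
{
  std::ifstream in(mountsFile);
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string device, mountPoint;
    if (fields >> device >> mountPoint && mountPoint == target) {
      return true;
    }
  }
  return false;
}
```

Calling it on the target right before "os::rmdir" would give the same 
information as the manual "cat /proc/mounts" check.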

Adding "os::rmdir" to PersistentVolumeTest.AccessPersistentVolume also returns 
the same error ("Device or resource busy").
{code}
--- a/src/tests/persistent_volume_tests.cpp
+++ b/src/tests/persistent_volume_tests.cpp
@@ -576,6 +576,7 @@ TEST_F(PersistentVolumeTest, AccessPersistentVolume)
       frameworkId.get(),
       executorId);

+  Try<Nothing> rmdir = os::rmdir(path::join(directory, "path1"), true);
   EXPECT_FALSE(os::exists(path::join(directory, "path1")));
{code}

It only works if I add os::sleep() before calling os::rmdir.
{code}
--- a/src/slave/containerizer/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/isolators/filesystem/linux.cpp
@@ -504,6 +504,7 @@ Future<Nothing> LinuxFilesystemIsolatorProcess::update(
     }

     // NOTE: This is a non-recursive rmdir.
+    os::sleep(Seconds(3));
     Try<Nothing> rmdir = os::rmdir(target, false);
     if (rmdir.isError()) {
{code}
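
As a variation on the fixed sleep above, the removal could be retried only 
while it keeps failing with EBUSY. This is just a sketch of that experiment, 
assuming the busy state is transient (which the sleep result suggests but 
does not prove). "rmdirWithRetry" is a hypothetical helper built on plain 
rmdir(2), not a Mesos or stout function, and it does not explain the root 
cause.

```cpp
#include <cerrno>
#include <chrono>
#include <string>
#include <thread>

#include <unistd.h>

// Hypothetical helper (not part of Mesos or stout): call rmdir(2) and,
// while it fails with EBUSY, wait `delay` and try again, for at most
// `attempts` calls in total. Returns 0 on success, otherwise the errno
// of the last failed call.
int rmdirWithRetry(
    const std::string& path,
    int attempts,
    std::chrono::milliseconds delay)
{
  int error = EINVAL; // Returned if `attempts` is not positive.
  for (int i = 0; i < attempts; ++i) {
    if (::rmdir(path.c_str()) == 0) {
      return 0;
    }
    error = errno;
    if (error != EBUSY) {
      break; // Only EBUSY is assumed to be transient here.
    }
    if (i + 1 < attempts) {
      std::this_thread::sleep_for(delay);
    }
  }
  return error;
}
```

Logging how many attempts were needed would also show how long the directory 
stays busy after the umount.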

I also tried to reduce this problem to a minimal reproduction, but my attempts 
have failed so far. The code snippet below works fine.
{code}
fs::mount("/tmp/origin", "/tmp/mount", None(), MS_BIND, NULL);
fs::unmount("/tmp/mount");
os::rmdir("/tmp/mount");
{code}

If I hardcode the target and source variables in 
"LinuxFilesystemIsolatorProcess::update", "os::rmdir" still fails with 
"Device or resource busy".
{code}
diff --git a/src/slave/containerizer/isolators/filesystem/linux.cpp b/src/slave/containerizer/isolators/filesystem/linux.cpp
index a780b45..cdf6e80 100644
--- a/src/slave/containerizer/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/isolators/filesystem/linux.cpp
@@ -490,6 +490,7 @@ Future<Nothing> LinuxFilesystemIsolatorProcess::update(
       target = path::join(info->directory, containerPath);
     }

+    target = "/tmp/mount";
     LOG(INFO) << "Removing mount '" << target << "' for persistent volume "
               << resource << " of container " << containerId;

@@ -578,6 +579,8 @@ Future<Nothing> LinuxFilesystemIsolatorProcess::update(
       target = path::join(info->directory, containerPath);
     }

+    source = "/tmp/origin";
+    target = "/tmp/mount";
     if (os::exists(target)) {
       // NOTE: This is possible because 'info->resources' will be
       // reset when slave restarts and recovers. When the slave calls
{code}

> PersistentVolumeTest.AccessPersistentVolume fails when run as root.
> -------------------------------------------------------------------
>
>                 Key: MESOS-3349
>                 URL: https://issues.apache.org/jira/browse/MESOS-3349
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>         Environment: Ubuntu 14.04, CentOS 5
>            Reporter: Benjamin Mahler
>            Assignee: haosdent
>              Labels: flaky-test
>
> When running the tests as root:
> {noformat}
> [ RUN      ] PersistentVolumeTest.AccessPersistentVolume
> I0901 02:17:26.435140 39432 exec.cpp:133] Version: 0.25.0
> I0901 02:17:26.442129 39461 exec.cpp:207] Executor registered on slave 
> 20150901-021726-1828659978-52102-32604-S0
> Registered executor on hostname
> Starting task d8ff1f00-e720-4a61-b440-e111009dfdc3
> sh -c 'echo abc > path1/file'
> Forked command at 39484
> Command exited with status 0 (pid: 39484)
> ../../src/tests/persistent_volume_tests.cpp:579: Failure
> Value of: os::exists(path::join(directory, "path1"))
>   Actual: true
> Expected: false
> [  FAILED  ] PersistentVolumeTest.AccessPersistentVolume (777 ms)
> {noformat}
> FYI [~jieyu] [~mcypark]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
