[ https://issues.apache.org/jira/browse/SPARK-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662750#comment-16662750 ]

Kevin Grealish edited comment on SPARK-23015 at 10/24/18 7:47 PM:
------------------------------------------------------------------

One workaround is to create a temporary directory under TEMP and set that as the
TEMP directory for the process being launched. This way each process you launch
gets its own temp space. For example, when launching from C#:

{code}
// Workaround for Spark bug https://issues.apache.org/jira/browse/SPARK-23015
// Spark Submit's launching library prints the command to execute the launcher
// (org.apache.spark.launcher.Main) to a temporary text file
// ("%TEMP%\spark-class-launcher-output-%RANDOM%.txt"), reads the result back into
// a variable, and then executes that command. %RANDOM% does not have sufficient
// range to avoid collisions when launching many Spark processes.
// As a result the Spark processes end up running one another's commands (silently)
// or fail with errors like:
//   "The process cannot access the file because it is being used by another process."
//   "The system cannot find the file C:\VsoAgent\_work\_temp\spark-class-launcher-output-654.txt."
// As a workaround, we give each run its own TEMP directory, which we create using a GUID.
string newTemp = null;
if (AppRuntimeEnvironment.IsRunningOnWindows())
{
    var ourTemp = Environment.GetEnvironmentVariable("TEMP");
    var newDirName = "dprep" + Convert.ToBase64String(Guid.NewGuid().ToByteArray()).Substring(0, 22).Replace('/', '-');
    newTemp = Path.Combine(ourTemp, newDirName);
    Directory.CreateDirectory(newTemp);
    start.Environment["TEMP"] = newTemp;
}
{code}
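
For context, here is a minimal, self-contained sketch of how this workaround might plug into the code that actually launches spark-submit. The spark-submit path, the arguments, and the example class are illustrative assumptions, not part of the snippet above:

{code}
// Illustrative sketch only: launch spark-submit with a per-process TEMP so that
// spark-class2.cmd writes its spark-class-launcher-output-%RANDOM%.txt into a
// directory no other launch shares. Paths and arguments below are placeholders.
using System;
using System.Diagnostics;
using System.IO;

var start = new ProcessStartInfo
{
    FileName = @"C:\spark\bin\spark-submit.cmd",                          // placeholder path
    Arguments = "--class org.apache.spark.examples.SparkPi examples.jar", // placeholder args
    UseShellExecute = false  // must be false for the Environment dictionary to apply
};

// Unique TEMP per launch; a GUID has far more entropy than %RANDOM%'s 32768 values.
string newTemp = Path.Combine(Path.GetTempPath(), "spark-launch-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(newTemp);
start.Environment["TEMP"] = newTemp;

using var process = Process.Start(start);
process?.WaitForExit();
{code}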


> spark-submit fails when submitting several jobs in parallel
> -----------------------------------------------------------
>
>                 Key: SPARK-23015
>                 URL: https://issues.apache.org/jira/browse/SPARK-23015
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1
>         Environment: Windows 10 (1709/16299.125)
> Spark 2.3.0
> Java 8, Update 151
>            Reporter: Hugh Zabriskie
>            Priority: Major
>
> Spark Submit's launching library prints the command to execute the launcher
> (org.apache.spark.launcher.Main) to a temporary text file, reads the result
> back into a variable, and then executes that command.
> {code}
> set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
> "%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
> {code}
> [bin/spark-class2.cmd, 
> L67|https://github.com/apache/spark/blob/master/bin/spark-class2.cmd#L66]
> That temporary text file is given a pseudo-random name by the %RANDOM% env 
> variable generator, which generates a number between 0 and 32767.
> This appears to be the cause of an error occurring when several spark-submit 
> jobs are launched simultaneously. The following error is returned from stderr:
> {quote}The process cannot access the file because it is being used by another 
> process. The system cannot find the file
> USER/AppData/Local/Temp/spark-class-launcher-output-RANDOM.txt.
> The process cannot access the file because it is being used by another 
> process.{quote}
> My hypothesis is that %RANDOM% is returning the same value for multiple jobs, 
> causing the launcher library to attempt to write to the same file from 
> multiple processes. Another mechanism is needed for reliably generating the 
> names of the temporary files so that the concurrency issue is resolved.
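
The impact of %RANDOM%'s small range can be sanity-checked with a quick birthday-problem estimate (assuming, for simplicity, that %RANDOM% were uniform over its 32768 possible values):

{code}
// Back-of-the-envelope birthday-problem estimate of how often two simultaneous
// launches would pick the same spark-class-launcher-output-<n>.txt name,
// assuming %RANDOM% were uniformly distributed over 0-32767.
using System;

static double CollisionProbability(int launches)
{
    const int space = 32768;  // %RANDOM% range: 0-32767
    double pNoCollision = 1.0;
    for (int i = 0; i < launches; i++)
        pNoCollision *= (double)(space - i) / space;
    return 1.0 - pNoCollision;
}

foreach (int n in new[] { 10, 50, 100, 300 })
    Console.WriteLine($"{n} parallel launches -> ~{CollisionProbability(n):P1} collision chance");

// Roughly: 10 launches ~0.1%, 100 launches ~14%, 300 launches ~75%, so heavy
// parallel submission on one machine hits this with near certainty over time.
{code}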


