Thomas Krause created BEAM-12771:
------------------------------------
Summary: InvalidPathException on Windows when removing staging directory
Key: BEAM-12771
URL: https://issues.apache.org/jira/browse/BEAM-12771
Project: Beam
Issue Type: Bug
Components: io-java-files, java-fn-execution, sdk-java-core
Affects Versions: 2.31.0
Environment: Windows
Reporter: Thomas Krause
When running the word count example on Windows, e.g. with the Flink runner, an
InvalidPathException is thrown when execution finishes. The log output is:
{code:java}
INFO:apache_beam.utils.subprocess_server:b'WARNUNG: Failed to remove job
staging directory for token job_941a4c92-f66d-4d71-8b68-b16cd5026750.'
INFO:apache_beam.utils.subprocess_server:b'java.nio.file.InvalidPathException:
Illegal char <*> at index 136:
C:\\Users\\Thomas\\AppData\\Local\\Temp\\beam-tempgkbgwplc\\artifactsuu86zi08\\d82c20307c13adfba9486c5a114a464cc3fb072ad60623a58839256a91a6c9f8\\*'
INFO:apache_beam.utils.subprocess_server:b'\tat
sun.nio.fs.WindowsPathParser.normalize(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
sun.nio.fs.WindowsPathParser.parse(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
sun.nio.fs.WindowsPathParser.parse(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
sun.nio.fs.WindowsPath.parse(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
sun.nio.fs.WindowsFileSystem.getPath(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.nio.file.Paths.get(Unknown
Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.sdk.io.LocalResourceId.resolveLocalPathWindowsOS(LocalResourceId.java:103)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.sdk.io.LocalResourceId.resolve(LocalResourceId.java:65)'
INFO:apache_beam.runners.portability.portable_runner:Job state changed to DONE
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.sdk.io.LocalResourceId.resolve(LocalResourceId.java:36)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.fnexecution.artifact.ArtifactStagingService$1.removeStagedArtifacts(ArtifactStagingService.java:182)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.fnexecution.artifact.ArtifactStagingService.removeStagedArtifacts(ArtifactStagingService.java:115)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.JobServerDriver.lambda$createJobService$0(JobServerDriver.java:66)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.InMemoryJobService.lambda$run$0(InMemoryJobService.java:261)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.JobInvocation.setState(JobInvocation.java:249)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.JobInvocation.access$200(JobInvocation.java:51)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:115)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.runners.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:101)'
INFO:apache_beam.utils.subprocess_server:b'\tat
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1058)'
INFO:apache_beam.utils.subprocess_server:b'\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)'
INFO:apache_beam.utils.subprocess_server:b'\tat java.lang.Thread.run(Unknown
Source)'{code}
The error seems to be in this function:
{code:java}
private LocalResourceId resolveLocalPathWindowsOS(
    String other, ResolveOptions resolveOptions) {
  String uuid = UUID.randomUUID().toString();
  Path pathAsterisksReplaced = Paths.get(pathString.replaceAll("\\*", uuid));
  String otherAsterisksReplaced = other.replaceAll("\\*", uuid);
  return new LocalResourceId(
      Paths.get(
          pathAsterisksReplaced
              .resolve(otherAsterisksReplaced)
              .toString()
              .replaceAll(uuid, "\\*")),
      resolveOptions.equals(StandardResolveOptions.RESOLVE_DIRECTORY));
}
{code}
Paths.get throws the exception because '*' is not a legal character in a
Windows path. The function already replaces the wildcard with 'uuid' before the
first call to Paths.get, but in the return statement Paths.get is called again
on a string in which uuid has been replaced with the wildcard again, and that
second call throws the exception.
Unfortunately, I don't fully understand the logic of the function, so I'm not
sure what the best fix would be.
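One possible direction, as a minimal sketch and not the actual Beam fix: keep
the resolved result as a String so that the wildcard never reaches Paths.get.
The class WildcardResolveSketch and the helper resolveKeepingWildcard are
hypothetical names that do not exist in Beam, and how such a string would be
stored inside LocalResourceId is left open here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical sketch, not Beam code: resolve a child against a base path
// while keeping any '*' wildcard out of Paths.get.
class WildcardResolveSketch {

  // Assumption: the caller can work with the resolved path as a String.
  static String resolveKeepingWildcard(String base, String other) {
    // A random UUID contains only hex digits and hyphens, so it is a legal
    // path segment on Windows and can stand in for the wildcard.
    String placeholder = UUID.randomUUID().toString();

    // Only placeholder-substituted strings are ever parsed into a Path.
    Path baseReplaced = Paths.get(base.replace("*", placeholder));
    String otherReplaced = other.replace("*", placeholder);

    // Restore the wildcard on the String form and do NOT parse it again.
    return baseReplaced
        .resolve(otherReplaced)
        .toString()
        .replace(placeholder, "*");
  }

  public static void main(String[] args) {
    // Prints the resolved string using the platform's path separator.
    System.out.println(resolveKeepingWildcard("staging", "*"));
  }
}
```

The sketch differs from the quoted function only in the last step: the
wildcard is restored on the String and the result is not passed to Paths.get
a second time.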