Kyle Weaver created BEAM-9509:
---------------------------------
Summary: Subprocess job server treats missing local file as remote URL
Key: BEAM-9509
URL: https://issues.apache.org/jira/browse/BEAM-9509
Project: Beam
Issue Type: Bug
Components: runner-spark
Reporter: Kyle Weaver
Assignee: Kyle Weaver
When the job server jar requested (e.g. by portableWordCountSparkRunnerBatch)
is missing (such as when it hasn't yet been built), the error message is
misleading: the log reports that the jar is being downloaded, and the run then
fails with "ValueError: unknown url type". Expected behavior is that the path
is recognized as a local file and a message is printed instructing the user to
build the jar.
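The expected behavior could be sketched roughly as follows. This is only an illustration, not the actual Beam fix: classify_jar_location is a hypothetical helper, and the choice to treat only http/https as remote is an assumption.

```python
import os
from urllib.parse import urlparse


def classify_jar_location(path_or_url):
    """Hypothetical helper: decide how to treat a jar location up front.

    Returns ('remote', url) for http(s) URLs and ('local', path) for
    existing files. Anything else is assumed to be a local path whose
    jar has not been built yet, so the error says so explicitly.
    """
    scheme = urlparse(path_or_url).scheme
    if scheme in ('http', 'https'):
        # Assumption: only http and https count as remote URLs here.
        return ('remote', path_or_url)
    if os.path.exists(path_or_url):
        return ('local', path_or_url)
    raise RuntimeError(
        'Job server jar not found at %s. This looks like a local path; '
        'you may need to build the jar first.' % path_or_url)
```

With a check like this, a missing local jar fails with a message about building it, before any download is attempted.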
INFO:apache_beam.utils.subprocess_server:Downloading job server jar from /usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 142, in <module>
    run()
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 121, in run
    result = p.run()
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 495, in run
    self._options).run(False)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/pipeline.py", line 508, in run
    return self.runner.run_pipeline(self, self._options)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/spark_runner.py", line 45, in run_pipeline
    return super(SparkRunner, self).run_pipeline(pipeline, options)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 386, in run_pipeline
    job_service_handle = self.create_job_service(options)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 293, in create_job_service
    return JobServiceHandle(server.start(), options)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 86, in start
    self._endpoint = self._job_server.start()
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 111, in start
    cmd, endpoint = self.subprocess_cmd_and_endpoint()
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 156, in subprocess_cmd_and_endpoint
    jar_path = self.local_jar(self.path_to_jar())
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/runners/portability/job_server.py", line 153, in local_jar
    return subprocess_server.JavaJarServer.local_jar(url)
  File "/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/apache_beam/utils/subprocess_server.py", line 206, in local_jar
    url_read = urlopen(url)
  File "/usr/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.7/urllib/request.py", line 510, in open
    req = Request(fullurl, data)
  File "/usr/lib/python3.7/urllib/request.py", line 328, in __init__
    self.full_url = url
  File "/usr/lib/python3.7/urllib/request.py", line 354, in full_url
    self._parse()
  File "/usr/lib/python3.7/urllib/request.py", line 383, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: '/usr/local/google/home/kcweaver/go/src/github.com/apache/beam/runners/spark/job-server/build/libs/beam-runners-spark-job-server-2.21.0-SNAPSHOT.jar'
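The last frames of the traceback can be reproduced in isolation, which may help confirm that the failure comes from handing a bare filesystem path to urllib rather than from anything Spark-specific. In this minimal sketch, url_type_error is a hypothetical helper name and the paths are placeholders; it only constructs a urllib Request and never opens a connection.

```python
from urllib.request import Request


def url_type_error(location):
    """Hypothetical helper: return urllib's complaint about location, if any.

    Building a Request parses the URL eagerly, so a string without a
    scheme (such as an absolute file path) raises ValueError right away.
    """
    try:
        Request(location)
    except ValueError as e:
        return str(e)
    return None


# A bare path has no scheme, so urllib rejects it.
print(url_type_error('/tmp/missing-job-server.jar'))
# A well-formed URL parses without error (nothing is fetched).
print(url_type_error('https://example.com/job-server.jar'))
```

The first call returns the same "unknown url type" text seen in the traceback, and the second returns None.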