I ran the command below, and it failed:

python -m apache_beam.examples.wordcount --input ./a_file --output ./data_test/ --runner=SparkRunner --spark_submit_uber_jar --spark_master_url=spark://spark:7077 --spark_rest_url=http://spark:6066 --environment_type=LOOPBACK
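For context, my understanding is that this command is roughly equivalent to building the pipeline programmatically with the same flags passed through PipelineOptions. This is only a simplified sketch with placeholder word-count transforms, not the actual apache_beam.examples.wordcount code:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=SparkRunner',
    '--spark_submit_uber_jar',
    '--spark_master_url=spark://spark:7077',
    '--spark_rest_url=http://spark:6066',
    '--environment_type=LOOPBACK',
])

# Simplified word count (placeholder transforms, not the real example code).
with beam.Pipeline(options=options) as p:
    (p
     | 'Read' >> beam.io.ReadFromText('./a_file')
     | 'Split' >> beam.FlatMap(lambda line: line.split())
     | 'PairWithOne' >> beam.Map(lambda word: (word, 1))
     | 'GroupAndSum' >> beam.CombinePerKey(sum)
     | 'Format' >> beam.Map(lambda kv: '%s: %d' % kv)
     | 'Write' >> beam.io.WriteToText('./data_test/'))

Either way, the run produces the output below.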
INFO:apache_beam.runners.worker.worker_pool_main:Listening for workers at localhost:35929
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
INFO:root:Default Python SDK image for environment is apache/beam_python3.7_sdk:2.31.0
INFO:apache_beam.runners.portability.fn_api_runner.translations:==================== <function pack_combiners at 0x7f71fe6e9440> ====================
INFO:apache_beam.runners.portability.fn_api_runner.translations:==================== <function lift_combiners at 0x7f71fe6e94d0> ====================
INFO:apache_beam.runners.portability.fn_api_runner.translations:==================== <function sort_stages at 0x7f71fe6e9c20> ====================
INFO:apache_beam.utils.subprocess_server:Downloading job server jar from https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-spark-job-server/2.31.0/beam-runners-spark-job-server-2.31.0.jar
INFO:apache_beam.runners.portability.abstract_job_service:Artifact server started on port 35775
INFO:apache_beam.runners.portability.abstract_job_service:Running job 'job-21ac5231-a0f7-4e25-8d20-aa2c4c668944'
INFO:apache_beam.runners.portability.spark_uber_jar_job_server:Submitted Spark job with ID driver-20210729024716-0000
INFO:apache_beam.runners.portability.portable_runner:Environment "LOOPBACK" has started a component necessary for the execution. Be sure to run the pipeline using
  with Pipeline() as p:
    p.apply(..)
This ensures that the pipeline finishes before this program exits.
INFO:apache_beam.runners.portability.portable_runner:Job state changed to STOPPED
INFO:apache_beam.runners.portability.portable_runner:Job state changed to RUNNING
ERROR:root:Exception from the cluster:
java.nio.file.NoSuchFileException: /tmp/tmp982ms4ti.jar
    sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:526)
    sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
    java.nio.file.Files.copy(Files.java:1274)
    org.apache.spark.util.Utils$.copyRecursive(Utils.scala:726)
    org.apache.spark.util.Utils$.copyFile(Utils.scala:697)
    org.apache.spark.util.Utils$.doFetchFile(Utils.scala:771)
    org.apache.spark.util.Utils$.fetchFile(Utils.scala:541)
    org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)
    org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)
    org.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)
INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 94, in <module>
    run()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/examples/wordcount.py", line 89, in run
    output | 'Write' >> WriteToText(known_args.output)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 586, in __exit__
    self.result.wait_until_finish()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 600, in wait_until_finish
    raise self._runtime_exception
RuntimeError: Pipeline job-21ac5231-a0f7-4e25-8d20-aa2c4c668944 failed in state FAILED: Exception from the cluster:
java.nio.file.NoSuchFileException: /tmp/tmp982ms4ti.jar
    sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:526)
    sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
    java.nio.file.Files.copy(Files.java:1274)
    org.apache.spark.util.Utils$.copyRecursive(Utils.scala:726)
    org.apache.spark.util.Utils$.copyFile(Utils.scala:697)
    org.apache.spark.util.Utils$.doFetchFile(Utils.scala:771)
    org.apache.spark.util.Utils$.fetchFile(Utils.scala:541)
    org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)
    org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)
    org.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)
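If it helps, my reading of the RuntimeError at the end is that wait_until_finish() simply re-raises whatever exception the cluster reported, so it is not a bug in the example code itself. A rough sketch of the same behaviour with a bare-bones placeholder pipeline (same flags as above, read/write only, for illustration):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Same flags as on the command line above.
options = PipelineOptions([
    '--runner=SparkRunner',
    '--spark_submit_uber_jar',
    '--spark_master_url=spark://spark:7077',
    '--spark_rest_url=http://spark:6066',
    '--environment_type=LOOPBACK',
])

# Placeholder pass-through pipeline, just enough to get a job submitted.
p = beam.Pipeline(options=options)
_ = (p
     | 'Read' >> beam.io.ReadFromText('./a_file')
     | 'Write' >> beam.io.WriteToText('./data_test/'))

result = p.run()
try:
    # With the portable runner this re-raises the exception reported by the cluster,
    # exactly like the wait_until_finish() call in the traceback above.
    print('Final state:', result.wait_until_finish())
except RuntimeError as exc:
    print('Pipeline failed:', exc)

So as far as I can tell, the failure happens on the Spark worker while it prepares the driver (DriverRunner.downloadUserJar), before any of my Python code runs on the cluster.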
However, the jar does exist on the container where I submitted the job:

root@6101ef4db75b:~# ls /tmp/tmp982ms4ti.jar
/tmp/tmp982ms4ti.jar
root@6101ef4db75b:~# ls -l /tmp/tmp982ms4ti.jar
-rw-r--r-- 1 root root 192217430 Jul 29 02:47 /tmp/tmp982ms4ti.jar

These are the corresponding Spark master and worker logs:

spark_1          | 21/07/29 02:47:16 INFO Master: Launching driver driver-20210729024716-0000 on worker worker-20210729024516-192.168.0.6-33991
spark-worker-2_1 | 21/07/29 02:47:16 INFO Worker: Asked to launch driver driver-20210729024716-0000
spark-worker-2_1 | 21/07/29 02:47:16 INFO DriverRunner: Copying user jar file:/tmp/tmp982ms4ti.jar to /opt/bitnami/spark/work/driver-20210729024716-0000/tmp982ms4ti.jar
spark-worker-2_1 | 21/07/29 02:47:16 INFO Utils: Copying /tmp/tmp982ms4ti.jar to /opt/bitnami/spark/work/driver-20210729024716-0000/tmp982ms4ti.jar
spark-worker-2_1 | 21/07/29 02:47:16 INFO DriverRunner: Killing driver process!
spark-worker-2_1 | 21/07/29 02:47:16 WARN Worker: Driver driver-20210729024716-0000 failed with unrecoverable exception: java.nio.file.NoSuchFileException: /tmp/tmp982ms4ti.jar
spark_1          | 21/07/29 02:47:16 INFO Master: Removing driver: driver-20210729024716-0000

Someone else is facing the same issue here: https://stackoverflow.com/questions/66320831/its-possible-to-configure-the-beam-portable-runner-with-the-spark-configuration, but there is no solution there. Can anyone comment on this error message?

Thank you, Teoh
