[ 
https://issues.apache.org/jira/browse/BEAM-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened BEAM-5026:
--------------------------------------
      Assignee: Ahmet Altay  (was: Ankur Goenka)

I see this consistently failing with the latest master, but only in batch mode. 
Reopening.

> Portable flink wordcount fails sometimes due to non-existent source path in 
> FileBasedSink._check_state_for_finalize_write
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-5026
>                 URL: https://issues.apache.org/jira/browse/BEAM-5026
>             Project: Beam
>          Issue Type: Bug
>          Components: examples-python, runner-flink, sdk-py-core
>    Affects Versions: 2.6.0
>            Reporter: Ryan Williams
>            Assignee: Ahmet Altay
>            Priority: Minor
>             Fix For: 2.7.0
>
>
> Running portable flink wordcount locally:
> In one terminal:
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow{code}
> In another:
> {code:java}
> python -m apache_beam.examples.wordcount --harness_docker_image <image> 
> --input /etc/profile --output /tmp/py-wordcount-direct 
> --experiments=beam_fn_api --runner=PortableRunner 
> --job_endpoint=localhost:8099 --sdk_location=container{code}
> Typically, the first time I run this for a given job-server instance, I see a 
> failure like this ([full 
> output|https://gist.github.com/ryan-williams/a96bf259898b6260cd4f00b8a232057c#file-gistfile1-txt-L3460]):
> {code:java}
> File "apache_beam/runners/common.py", line 661, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> def process_outputs(self, windowed_input_element, results):
> File "apache_beam/runners/common.py", line 676, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> for result in results:
> File "/usr/local/lib/python2.7/site-packages/apache_beam/io/iobase.py", line 
> 1074, in <genexpr>
> return (window.TimestampedValue(v, window.MAX_TIMESTAMP) for v in outputs)
> File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/io/filebasedsink.py", 
> line 271, in finalize_write
> self._check_state_for_finalize_write(writer_results, num_shards))
> File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/io/filebasedsink.py", 
> line 249, in _check_state_for_finalize_write
> src, dst))
> BeamIOError: src and dst files do not exist. src: 
> /tmp/beam-temp-py-wordcount-direct-6a0d8862908c11e88de8025000000001/5cfa9f22-9246-41fb-adef-ca04d5a5fe50.py-wordcount-direct,
>  dst: /tmp/py-wordcount-direct-00000-of-00001 with exceptions None [while 
> running 'write/Write/WriteImpl/FinalizeWrite'] with exceptions None
> {code}
> This is after a fix to [a slightly earlier failure in {{FileBasedSink}} 
> documented on 
> BEAM-4742|https://issues.apache.org/jira/browse/BEAM-4742?focusedCommentId=16545622&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16545622]
>  which I've been working on in 
> [#5903|https://github.com/apache/beam/pull/5903].
> It typically occurs only on the first run of wordcount against a given 
> job-server instance.
> I'm curious whether others see this, whether it's some race condition in the 
> FileBasedSink, LocalFileSystem, my macbook's disk, or somewhere else, or 
> whether some temporary directory is getting created on the first run (for 
> each job-server) that explains why subsequent wordcount runs succeed, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to