Hi Nir,
Was this fixed by the PR you submitted?
On Wed, Feb 24, 2021 at 8:55 AM Nir Gazit wrote:
> Hey,
> When I read a file from S3 in a pipeline that includes a combine step, the
> pipeline appears to get stuck. Replacing the S3 source with a GCP source
> works fine. Furthermore, if I comment out the Count.PerElement step, it
> also works.
>
> Does anyone have an idea why that is?
>
> lines = p | beam.io.ReadFromText('s3://...')
> transformed = (
>     lines
>     | 'SplitLine' >>
>     (beam.ParDo(WordExtractingDoFn()).with_output_types(unicode))
>     | 'Count' >> beam.combiners.Count.PerElement()
>     | 'Format' >> beam.MapTuple(lambda w, c: f'{w}: {c}')
> )
>
> transformed | 'Write' >> beam.io.WriteToText('s3://...')
>
> Thanks!
> Nir
>
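
For comparing runs while debugging, here is a minimal pure-Python sketch of what the SplitLine, Count, and Format steps quoted above should produce for a small input. It does not use Beam or S3 at all. The helper name count_words and the tokenizing regex are assumptions made for illustration; the real WordExtractingDoFn may split lines differently.

```python
import re
from collections import Counter


def count_words(lines):
    """Pure-Python stand-in for the SplitLine, Count, and Format steps."""
    counts = Counter()
    for line in lines:
        # Assumed tokenization: runs of word characters and apostrophes.
        # The actual WordExtractingDoFn may use a different rule.
        counts.update(re.findall(r"[\w']+", line))
    # Same 'word: count' shape as the Format step, sorted so that the
    # result is deterministic and easy to diff against pipeline output.
    return [f"{word}: {count}" for word, count in sorted(counts.items())]
```

Running the same few lines through the pipeline with a local file source and comparing against this output would at least confirm that the transforms themselves behave, which narrows the problem to the S3 read or write.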