[ 
https://issues.apache.org/jira/browse/BEAM-13454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-13454:
---------------------------------
    Fix Version/s: 2.36.0
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Dataframe read_fwf fails reading incrementally.
> -----------------------------------------------
>
>                 Key: BEAM-13454
>                 URL: https://issues.apache.org/jira/browse/BEAM-13454
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Robert Bradshaw
>            Assignee: Robert Bradshaw
>            Priority: P2
>             Fix For: 2.36.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Using beam.dataframe.io.read_fwf fails with the following error:
> {code:python}
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1206, in process_with_sized_restriction
>     return self.do_fn_invoker.invoke_process(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 698, in invoke_process
>     residual = self._invoke_process_per_window(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 836, in _invoke_process_per_window
>     self.output_processor.process_outputs(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1334, in process_outputs
>     for result in results:
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/io.py", line 545, in process
>     frames = reader(handle, *self.args, **self.kwargs)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 848, in read_fwf
>     return _read(filepath_or_buffer, kwds)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 454, in _read
>     parser = TextFileReader(fp_or_buf, **kwds)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 942, in __init__
>     self.engine = self._check_file_or_buffer(f, engine)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 1003, in _check_file_or_buffer
>     raise ValueError(msg)
> ValueError: The 'python' engine cannot iterate through this file buffer.
> {code}
> It looks like pandas expects the file handle to be (line) iterable as well
> as supporting read().
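
The diagnosis above can be illustrated with a minimal sketch of a wrapper that adds line iteration to a handle exposing only read(). This is not the fix that went into Beam 2.36.0; it only shows the general idea, assuming the pandas 'python' engine needs the buffer to support next() in addition to read(). The names LineIterableHandle and ReadOnlyHandle are hypothetical and exist only for this illustration.

```python
import io


class ReadOnlyHandle:
    """Hypothetical stand-in for a handle that supports read() only."""

    def __init__(self, text):
        self._stream = io.StringIO(text)

    def read(self, size=-1):
        return self._stream.read(size)


class LineIterableHandle:
    """Hypothetical wrapper adding readline() and line iteration on top of read()."""

    def __init__(self, raw, chunk_size=1 << 16):
        self._raw = raw
        self._chunk_size = chunk_size
        self._buffer = ""
        self._eof = False

    def _fill(self):
        # Pull one more chunk from the underlying handle into the buffer.
        chunk = self._raw.read(self._chunk_size)
        if chunk:
            self._buffer += chunk
        else:
            self._eof = True

    def read(self, size=-1):
        if size is None or size < 0:
            # Read everything that is left, including buffered text.
            data = self._buffer + self._raw.read()
            self._buffer = ""
            self._eof = True
            return data
        while len(self._buffer) < size and not self._eof:
            self._fill()
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def readline(self):
        # Keep reading chunks until a newline shows up or input is exhausted.
        while "\n" not in self._buffer and not self._eof:
            self._fill()
        idx = self._buffer.find("\n")
        if idx < 0:
            line, self._buffer = self._buffer, ""
        else:
            line, self._buffer = self._buffer[:idx + 1], self._buffer[idx + 1:]
        return line

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line
```

A wrapped handle such as LineIterableHandle(ReadOnlyHandle(text)) can then be iterated line by line. Whether this alone is enough for pandas.read_fwf depends on the pandas version and on what else the parser expects from the buffer.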



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
