[ https://issues.apache.org/jira/browse/BEAM-13454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hulette updated BEAM-13454:
---------------------------------
Fix Version/s: 2.36.0
Resolution: Fixed
Status: Resolved (was: Open)
> Dataframe read_fwf fails reading incrementally.
> -----------------------------------------------
>
> Key: BEAM-13454
> URL: https://issues.apache.org/jira/browse/BEAM-13454
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: P2
> Fix For: 2.36.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When trying to use beam.dataframe.io.read_fwf, one gets the following error:
> {code:python}
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1206, in process_with_sized_restriction
>     return self.do_fn_invoker.invoke_process(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 698, in invoke_process
>     residual = self._invoke_process_per_window(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 836, in _invoke_process_per_window
>     self.output_processor.process_outputs(
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/common.py", line 1334, in process_outputs
>     for result in results:
>   File "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/dataframe/io.py", line 545, in process
>     frames = reader(handle, *self.args, **self.kwargs)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 848, in read_fwf
>     return _read(filepath_or_buffer, kwds)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 454, in _read
>     parser = TextFileReader(fp_or_buf, **kwds)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 942, in __init__
>     self.engine = self._check_file_or_buffer(f, engine)
>   File "/Users/robertwb/Work/beam/venv-3.8/lib/python3.8/site-packages/pandas/io/parsers.py", line 1003, in _check_file_or_buffer
>     raise ValueError(msg)
> ValueError: The 'python' engine cannot iterate through this file buffer.
> {code}
> It looks like pandas expects the file handle to be (line-)iterable in
> addition to supporting read().
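
To illustrate the diagnosis above, here is a minimal sketch of how a handle that only supports read() could be made line-iterable. It is not the change that shipped in 2.36.0, which this message does not describe. The class names ReadOnlyHandle and LineIterableHandle and the chunk_size parameter are hypothetical, and the sketch uses only the standard library so that it does not depend on any particular pandas version.

```python
import io


class ReadOnlyHandle:
    """Hypothetical stand-in for a handle that offers read() but no iteration."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


class LineIterableHandle:
    """Hypothetical wrapper that adds readline() and line iteration to read()."""

    def __init__(self, handle, chunk_size=1 << 16):
        self._handle = handle
        self._chunk_size = chunk_size
        self._pending = b""  # bytes fetched from the handle but not yet returned
        self._eof = False

    def _fill(self):
        # Pull one more chunk from the underlying handle into the buffer.
        chunk = self._handle.read(self._chunk_size)
        if chunk:
            self._pending += chunk
        else:
            self._eof = True

    def read(self, size=-1):
        # Serve buffered bytes first so read() and iteration stay consistent.
        if size is None or size < 0:
            data = self._pending + self._handle.read()
            self._pending = b""
            self._eof = True
            return data
        while len(self._pending) < size and not self._eof:
            self._fill()
        data, self._pending = self._pending[:size], self._pending[size:]
        return data

    def readline(self):
        # Keep fetching until a newline is buffered or the input is exhausted.
        while b"\n" not in self._pending and not self._eof:
            self._fill()
        newline = self._pending.find(b"\n")
        end = len(self._pending) if newline < 0 else newline + 1
        line, self._pending = self._pending[:end], self._pending[end:]
        return line

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line
```

A handle wrapped this way can be consumed with a plain for loop or next(), which appears to be what the 'python' engine relies on, while still answering read() calls from other consumers.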
--
This message was sent by Atlassian Jira
(v8.20.1#820001)