Vincent Bernardi created BEAM-14493:
---------------------------------------
Summary: HdfsDownloader gets wrong range
Key: BEAM-14493
URL: https://issues.apache.org/jira/browse/BEAM-14493
Project: Beam
Issue Type: Bug
Components: io-py-hadoop
Affects Versions: 2.31.0
Reporter: Vincent Bernardi
Reading Avro data from HDFS in a Python sidecar worker fails with:
File "python3.7/site-packages/apache_beam/io/filesystemio.py", line 123, in readinto
b[:len(data)] = data
ValueError: memoryview assignment: lvalue and rvalue have different structures
This is the same issue as https://issues.apache.org/jira/browse/BEAM-9152, which
was marked as resolved although the bug is still present.
As Jean-Christophe CARLES noted on
https://issues.apache.org/jira/browse/BEAM-9152, patching hadoopfilesystem.py
by removing " + 1" in HdfsDownloader.get_range fixes it for us.
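
For illustration, here is a minimal sketch of how an extra " + 1" in a range
read can produce this exact ValueError. It does not import Beam. DATA,
get_range_buggy, get_range_fixed and the readinto helper are hypothetical
stand-ins, and the sketch assumes the caller passes an exclusive end offset
and that the " + 1" sits in the length computation. It is not the actual
Beam code.

```python
# Stand-in for the file contents on HDFS (hypothetical data).
DATA = bytes(range(10))


def get_range_buggy(start, end):
    # Assumed shape of the bug: treats `end` as inclusive, so it
    # returns one byte more than the caller asked for.
    length = end - start + 1
    return DATA[start:start + length]


def get_range_fixed(start, end):
    # With " + 1" removed, `end` is treated as exclusive.
    length = end - start
    return DATA[start:start + length]


def readinto(buf, position, get_range):
    # Simplified stand-in for a stream's readinto(): fill `buf`
    # starting at `position` and return the number of bytes copied.
    view = memoryview(buf)
    end = min(position + len(view), len(DATA))
    data = get_range(position, end)
    # If `data` is longer than `view`, the slice on the left is
    # clamped to len(view) and the assignment raises ValueError.
    view[:len(data)] = data
    return len(data)


fixed_buf = bytearray(4)
fixed_count = readinto(fixed_buf, 0, get_range_fixed)

try:
    readinto(bytearray(4), 0, get_range_buggy)
    buggy_error = None
except ValueError as exc:
    buggy_error = str(exc)

print(fixed_count, bytes(fixed_buf))
print(buggy_error)
```

In this sketch the fixed variant fills the 4-byte buffer, while the buggy
variant returns 5 bytes for a 4-byte buffer and fails on the slice assignment.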