Daniel Oliveira created BEAM-12730:
--------------------------------------
Summary: Add custom delimiters to Python TextIO reads
Key: BEAM-12730
URL: https://issues.apache.org/jira/browse/BEAM-12730
Project: Beam
Issue Type: New Feature
Components: io-py-common, io-py-files
Reporter: Daniel Oliveira
A common request from users is the ability to split text files read by TextIO
on delimiters other than newline. The Java SDK already supports this feature.
The current delimiter code is [located
here|https://github.com/apache/beam/blob/v2.31.0/sdks/python/apache_beam/io/textio.py#L236]
and defaults to newlines. This function could easily be modified to also
handle custom delimiters. Changing this would also necessitate changing the API
for the various TextIO.Read methods and adding documentation.
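As a rough illustration of the splitting behavior being requested, here is a minimal standalone sketch in plain Python. It is not the Beam implementation and does not use any Beam API: the helper name `read_records` and its `delimiter` and `chunk_size` parameters are hypothetical, chosen only for this example. It reads a byte stream in chunks and yields records separated by an arbitrary multi-byte delimiter, including delimiters that straddle a chunk boundary.

```python
import io
from typing import BinaryIO, Iterator


def read_records(
    stream: BinaryIO, delimiter: bytes = b"\n", chunk_size: int = 8192
) -> Iterator[bytes]:
    """Yield records from `stream`, split on `delimiter`.

    Hypothetical helper for illustration only; not part of Apache Beam.
    """
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record currently held in the buffer.
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break
            yield buffer[:index]
            buffer = buffer[index + len(delimiter):]
    # Anything left over is a final record with no trailing delimiter.
    if buffer:
        yield buffer


# Usage: a two-byte delimiter, with a tiny chunk size so the delimiter
# is split across reads.
records = list(read_records(io.BytesIO(b"a||b||c"), delimiter=b"||", chunk_size=2))
```

Because unmatched bytes stay in the buffer until the next read, a delimiter split across two chunks is still found. A real change to TextIO would also need to account for details this sketch ignores, such as splittable sources and compressed files.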
This seems like a good starter bug for making more in-depth contributions to
Beam Python.