[ https://issues.apache.org/jira/browse/BEAM-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143916#comment-16143916 ]
Eugene Kirpichov commented on BEAM-2802:
----------------------------------------
Hmm, I have a hard time seeing why somebody would think it's a good idea to
store their data in a file format like this, which is not human-readable
and has no library support, especially the 1-line text file. But oh well.
This feature feels very exotic to me (I've never heard of file formats like
this) and I'm still hesitant to add it to TextIO directly, because this is no
longer a text file in the conventional sense. Would you consider creating a
separate IO for this? Or - are such file formats sufficiently common to merit
being included in the Beam SDK at all? (The other option would be for you to
develop it and ship it to your clients separately from the Beam SDK.)
> TextIO should allow specifying a custom delimiter
> -------------------------------------------------
>
> Key: BEAM-2802
> URL: https://issues.apache.org/jira/browse/BEAM-2802
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-extensions
> Reporter: Etienne Chauchot
> Assignee: Etienne Chauchot
> Priority: Minor
>
> Currently TextIO uses {{\r}}, {{\n}}, {{\r\n}}, or a mix of these to split a
> text file into PCollection elements. It might happen that a record is spread
> across more than one line. In that case we should be able to specify a custom
> record delimiter to be used in place of the default ones.
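
To make the request above concrete, here is a minimal sketch in plain Java of
splitting text into records on a caller-supplied delimiter instead of on line
breaks. It is not Beam code and does not reflect how TextIO is implemented;
the class name CustomDelimiterSplit, the method splitRecords, and the "||"
delimiter are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Beam SDK.
class CustomDelimiterSplit {

    // Splits text into records on the given delimiter. Line breaks inside a
    // record are kept, so one record may span several lines.
    static List<String> splitRecords(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = text.indexOf(delimiter, start)) >= 0) {
            records.add(text.substring(start, idx));
            start = idx + delimiter.length();
        }
        // Keep a trailing record that is not terminated by the delimiter.
        if (start < text.length()) {
            records.add(text.substring(start));
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records, each spread across two lines, separated by "||".
        String text = "id=1\nname=a||id=2\nname=b||";
        for (String record : splitRecords(text, "||")) {
            System.out.println("[" + record.replace("\n", "\\n") + "]");
        }
    }
}
```

A real reader would also have to handle a delimiter that straddles the boundary
between two buffers or two file splits, which this in-memory sketch ignores.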