[ 
https://issues.apache.org/jira/browse/BEAM-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985581#comment-15985581
 ] 

Eugene Kirpichov commented on BEAM-2081:
----------------------------------------

I agree with this guidance. The only downside is that it doesn't tell the 
service the size of the files, but I'm leaning towards thinking that this is 
fine for new connectors.

> I/O Authoring overview - better clarify how to read from files
> --------------------------------------------------------------
>
>                 Key: BEAM-2081
>                 URL: https://issues.apache.org/jira/browse/BEAM-2081
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Stephen Sisk
>            Priority: Minor
>
> The I/O authoring doc is a little bit confusing - it has an example of 
> reading from file globs and says to use ParDos, but then mentions "A class 
> derived from FileBasedSource is often the best option when reading from files".
> It'd be nice to better clarify this and provide guidance as to when to use 
> which.
> I *think* the right answer here is that if your file is splittable you use FBS 
> (and let it handle the glob splitting), and if it's not splittable you use 
> ParDos.
> I believe SDF (Splittable DoFn) will make all this easier.
> cc [~kirpichov] [[email protected]]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
