damccorm opened a new issue, #20729:
URL: https://github.com/apache/beam/issues/20729

   People still write Sources even though, 90% of the time, they shouldn't. We 
tell them [not to](https://beam.apache.org/documentation/io/developing-io-overview/), 
but we should do so more effectively. In particular, the instructions for the 
ParDo alternative never name Reshuffle explicitly, even though it is exactly 
what should be used here. They should also mention that the ParDo needs to be 
seeded by a Create step or similar.
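
   As a rough illustration of the ParDo alternative described above, here is a 
minimal sketch of the Create, ParDo, Reshuffle, ParDo shape. The class name, 
the `shardsFor` helper, the shard count, and the `my_table` seed are all 
hypothetical placeholders, not anything taken from the Beam docs; the sketch 
also assumes a runner (for example the direct runner) is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.Reshuffle;
import org.apache.beam.sdk.values.PCollection;

public class ParDoRead {
  // Hypothetical helper: list the units of work behind one seed element.
  static List<String> shardsFor(String table, int numShards) {
    List<String> shards = new ArrayList<>();
    for (int i = 0; i < numShards; i++) {
      shards.add(table + "/shard-" + i);
    }
    return shards;
  }

  // Step 1: expand a seed element into many small work descriptions.
  static class SplitFn extends DoFn<String, String> {
    @ProcessElement
    public void processElement(@Element String table, OutputReceiver<String> out) {
      for (String shard : shardsFor(table, 4)) {
        out.output(shard);
      }
    }
  }

  // Step 2: read one unit of work. A real connector would call the
  // external system here; this placeholder only emits a dummy record.
  static class ReadShardFn extends DoFn<String, String> {
    @ProcessElement
    public void processElement(@Element String shard, OutputReceiver<String> out) {
      out.output("record-from-" + shard);
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    PCollection<String> records =
        p.apply("Seed", Create.of("my_table")) // the Create step that seeds the ParDo
            .apply("Split", ParDo.of(new SplitFn()))
            .apply("Redistribute", Reshuffle.viaRandomKey()) // spread the work descriptions
            .apply("Read", ParDo.of(new ReadShardFn()));
    p.run().waitUntilFinish();
  }
}
```

   The Reshuffle sits between the two ParDos because a runner may otherwise 
fuse them, in which case every shard produced from one seed element would be 
read by the same worker.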
   
   A big issue here is that Sources are called "Sources". When a new developer 
is looking to author a pipeline, this is the first place they will look, 
especially if they're just scanning or searching through documentation. We need 
to aggressively counteract the gravity of the current naming scheme.
   
   Suggestion: Improve the documentation mentioned above, and update the 
Javadoc for BoundedSource, etc., to steer people away from it. If they are part 
of the small collection of power users who need a source, they'll be okay.
   
   Suggestions for future work:
    - Consider deprecating the Source framework in favor of SDF (Splittable DoFn).
    - Point to the SDF docs (and simplify them).
    - Many users can simply use FileIO.matchAll followed by a ParDo. Recommend 
these kinds of alternatives.
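
   For the file-based case in the last bullet, a sketch along these lines may 
help show what "FileIO.matchAll followed by a ParDo" looks like. The class 
name, the `linesOf` helper, and the `/tmp/input/*.txt` filepattern are 
hypothetical; adding `FileIO.readMatches()` between the match and the ParDo is 
a choice made here so the DoFn receives readable files rather than raw match 
metadata.

```java
import java.io.IOException;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class MatchAllThenParDo {
  // Hypothetical parsing step: split a file's contents into lines.
  static String[] linesOf(String contents) {
    return contents.split("\\r?\\n");
  }

  // The ParDo that does the actual reading, one matched file at a time.
  static class ReadFileFn extends DoFn<FileIO.ReadableFile, String> {
    @ProcessElement
    public void processElement(@Element FileIO.ReadableFile file, OutputReceiver<String> out)
        throws IOException {
      for (String line : linesOf(file.readFullyAsUTF8String())) {
        out.output(line);
      }
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    PCollection<String> lines =
        p.apply("Filepatterns", Create.of("/tmp/input/*.txt")) // hypothetical path
            .apply("Match", FileIO.matchAll()) // filepatterns -> match metadata
            .apply("Open", FileIO.readMatches()) // metadata -> readable files
            .apply("Parse", ParDo.of(new ReadFileFn()));
    p.run().waitUntilFinish();
  }
}
```

   Reading each file fully into memory keeps the sketch short; it would only 
suit files small enough to fit in a worker's memory.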
   
   Assigning this to [~boyuanz], but anyone could help here.
    /cc [~kenn] [~chamikara] [~reuvenlax] [~robertwb] [~rtnguyen] [~dcavazos]
   
   Imported from Jira 
[BEAM-11633](https://issues.apache.org/jira/browse/BEAM-11633). Original Jira 
may contain additional context.
   Reported by: altay.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.