[ 
https://issues.apache.org/jira/browse/GSOC-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny McCormick updated GSOC-258:
---------------------------------
    Issue Type: New Feature  (was: Task)

> [GSOC][Beam] Build out Beam Yaml features
> -----------------------------------------
>
>                 Key: GSOC-258
>                 URL: https://issues.apache.org/jira/browse/GSOC-258
>             Project: Comdev GSOC
>          Issue Type: New Feature
>            Reporter: Danny McCormick
>            Priority: Major
>              Labels: beam, gsoc, gsoc2024
>
> Apache Beam is a unified model for defining both batch and streaming 
> data-parallel processing pipelines, as well as a set of language-specific 
> SDKs for constructing pipelines and Runners for executing them on distributed 
> processing backends. Beam recently added support for launching jobs using 
> Yaml on top of its other SDKs. This project would focus on adding more 
> features and transforms to the Yaml SDK so that it can be the easiest way to 
> define your data pipelines.
>
> Objectives:
> 1. Add support for existing Beam transforms (IOs, Machine Learning 
> transforms, and others) to the Yaml SDK
> 2. Add end-to-end pipeline use cases using the Yaml SDK
> 3. (stretch) Add Yaml SDK support to the Beam playground
>
> Useful links:
> Apache Beam repo - [https://github.com/apache/beam]
> Yaml SDK code + docs - 
> [https://github.com/apache/beam/tree/master/sdks/python/apache_beam/yaml]
> Open issues for the Yaml SDK - 
> [https://github.com/apache/beam/issues?q=is%3Aopen+is%3Aissue+label%3Ayaml]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: gsoc-unsubscr...@community.apache.org
For additional commands, e-mail: gsoc-h...@community.apache.org
