[
https://issues.apache.org/jira/browse/BEAM-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024704#comment-17024704
]
Brian Hulette commented on BEAM-9189:
-------------------------------------
In a [dev@beam
message|https://lists.apache.org/thread.html/r6cd849b35f8a3cc86a1806bdca0d3b7b99c20f7a946fa63f8ccdfcf4%40%3Cdev.beam.apache.org%3E]
Pablo said we just need to tag it as a potential project. I think the Feb
5th deadline is for the entire ASF to apply. More details here:
https://community.apache.org/gsoc.html#prospective-asf-mentors-read-this
> Add Daffodil IO for Apache Beam
> -------------------------------
>
> Key: BEAM-9189
> URL: https://issues.apache.org/jira/browse/BEAM-9189
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-core
> Reporter: Brian Hulette
> Priority: Major
>
> From https://daffodil.apache.org/:
> {quote}Daffodil is an open source implementation of the DFDL specification
> that uses these DFDL schemas to parse fixed format data into an infoset,
> which is most commonly represented as either XML or JSON. This allows the use
> of well-established XML or JSON technologies and libraries to consume,
> inspect, and manipulate fixed format data in existing solutions. Daffodil is
> also capable of the reverse by serializing or “unparsing” an XML or JSON
> infoset back to the original data format.
> {quote}
> We should create a Beam IO that accepts a DFDL schema as an argument and can
> then produce and consume data in the specified format. I think it would be
> most natural for Beam users if this IO could produce Beam Rows, but an
> initial version that just operates with Infosets could be useful as well.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)