[ https://issues.apache.org/jira/browse/BEAM-5918?focusedWorklogId=160685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-160685 ]
ASF GitHub Bot logged work on BEAM-5918:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Oct/18 15:47
            Start Date: 30/Oct/18 15:47
    Worklog Time Spent: 10m
      Work Description: kanterov opened a new pull request #6888: [BEAM-5918] Add Cast transform for Rows
URL: https://github.com/apache/beam/pull/6888

Casts rows from one schema into another.

Implements:

- widening values (e.g., int -> long), to be extended with more conversions
- narrowing (e.g., int -> short), to be extended with more conversions
- ignoring nullability (nullable=true -> nullable=false)
- weakening nullability (nullable=false -> nullable=true)
- projection (Schema(a: Int32, b: Int32) -> Schema(a: Int32))

It would be very useful for Row-based IOs. For instance, BeamBigQueryTable can be implemented with org.apache.beam.sdk.schemas.utils.AvroUtils and Cast, which would make it more flexible; at the moment it is very restrictive about the schema. Another example is reading Avro GenericRecord as a user-provided POJO, [BEAM-5807](https://issues.apache.org/jira/browse/BEAM-5807).

I want to get an initial round of feedback before polishing Javadoc, API, etc.

------------------------

Follow this checklist to help us incorporate your contribution quickly and easily:

 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it.
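The projection and nullability items in the list above can be illustrated with a minimal plain-Java sketch. It does not use the Beam API: `RowProjectionSketch`, `Field`, and `project` are hypothetical names, and a row is modelled as a `Map` from field name to value purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Row-to-Row cast; this is not the Beam API.
class RowProjectionSketch {

    /** Hypothetical schema field: a name plus a nullability flag. */
    static final class Field {
        final String name;
        final boolean nullable;

        Field(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    /**
     * Copies only the target fields from the input row (projection).
     * A null value is accepted for a nullable target field and rejected
     * at runtime for a non-nullable one.
     */
    static Map<String, Object> project(Map<String, Object> row, List<Field> target) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field field : target) {
            if (!row.containsKey(field.name)) {
                throw new IllegalArgumentException("missing field: " + field.name);
            }
            Object value = row.get(field.name);
            if (value == null && !field.nullable) {
                throw new IllegalStateException("null in non-nullable field: " + field.name);
            }
            out.put(field.name, value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1);
        row.put("b", 2);
        // Schema(a: Int32, b: Int32) -> Schema(a: Int32)
        System.out.println(project(row, Arrays.asList(new Field("a", false))));
    }
}
```

In this sketch, weakening nullability (nullable=false -> nullable=true) never fails, while ignoring nullability (nullable=true -> nullable=false) is only safe if no null actually reaches the field, so it is checked per row.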
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

            Worklog Id:     (was: 160685)
            Time Spent: 10m
    Remaining Estimate: 0h

> Add Cast transform for Rows
> ---------------------------
>
>                 Key: BEAM-5918
>                 URL: https://issues.apache.org/jira/browse/BEAM-5918
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Gleb Kanterov
>            Assignee: Kenneth Knowles
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a need for a generic transform that, given two Row schemas, will
> convert rows between them. It must be possible to opt out of certain kinds
> of conversions; for instance, converting ints to shorts can cause overflow.
> As another example, a schema could have a nullable field that never holds
> a NULL value in practice, because NULLs were filtered out.
> What is needed:
> - widening values (e.g., int -> long)
> - narrowing (e.g., int -> short)
> - runtime check for overflow while narrowing
> - ignoring nullability (nullable=true -> nullable=false)
> - weakening nullability (nullable=false -> nullable=true)
> - projection (Schema(a: Int32, b: Int32) -> Schema(a: Int32))

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
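As a rough illustration of the "runtime check for overflow while narrowing" requirement in the issue above, here is a minimal plain-Java sketch. It is not taken from the pull request; `NarrowingSketch` and its method names are hypothetical, and only standard JDK calls are used.

```java
// Hypothetical helper class; it illustrates the checks, not Beam's implementation.
class NarrowingSketch {

    /** Widening int -> long is always lossless, so no check is needed. */
    static long widen(int value) {
        return value;
    }

    /** Narrowing int -> short with an explicit range check before the cast. */
    static short narrowToShort(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("short overflow: " + value);
        }
        return (short) value;
    }

    /** Narrowing long -> int; Math.toIntExact throws ArithmeticException on overflow. */
    static int narrowToInt(long value) {
        return Math.toIntExact(value);
    }

    public static void main(String[] args) {
        System.out.println(widen(42));
        System.out.println(narrowToShort(1234));
        try {
            narrowToShort(40000); // outside the short range, so the check fires
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the range check, the plain cast `(short) 40000` would silently wrap around, which is the kind of conversion the issue says callers should be able to opt out of.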