[ 
https://issues.apache.org/jira/browse/BEAM-5918?focusedWorklogId=160730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-160730
 ]

ASF GitHub Bot logged work on BEAM-5918:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Oct/18 16:55
            Start Date: 30/Oct/18 16:55
    Worklog Time Spent: 10m 
      Work Description: kanterov edited a comment on issue #6888: [BEAM-5918] 
[WIP] Add Cast transform for Rows
URL: https://github.com/apache/beam/pull/6888#issuecomment-434380987
 
 
   @kennknowles yes, I agree, it's very controversial, but there are cases 
where it makes a lot of sense, for instance, BigQuery exports:
   - BQ:`DATE`, AVRO: `string`
   - BQ:`DATETIME`, AVRO: `string`
   - BQ: `TIMESTAMP`, AVRO: `long`, `logicalType=timestamp-micros`
   
   These values need to be converted to `Row`. The idea is not to have a global 
registry of conversions, but to override them per transform. For instance:
   
   ```java
   Cast
     .to(...)
     .with(StringToDateConversion.of())
     .with(...)
     .build()
     .apply(...)
   ```
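
A minimal sketch of the per-transform override idea, written in plain Java with no Beam dependency. Everything here is hypothetical and for illustration only: `PerTransformConversions`, its `with`/`apply` methods, and the two helper conversions are not part of the PR or of the Beam API. It assumes `DATE` strings arrive in ISO `yyyy-MM-dd` form and that `timestamp-micros` values are microseconds since the epoch.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical holder for conversions that belong to ONE transform instance.
// There is no static state, so nothing here acts as a global registry.
public class PerTransformConversions {
    // Conversions keyed by field name; fields without an entry pass through.
    private final Map<String, Function<Object, Object>> byField = new HashMap<>();

    // Registers a conversion for a single field and returns this for chaining.
    public PerTransformConversions with(String field, Function<Object, Object> conversion) {
        byField.put(field, conversion);
        return this;
    }

    // Applies the registered conversions to a record modelled as a plain map.
    public Map<String, Object> apply(Map<String, Object> record) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            Function<Object, Object> f =
                byField.getOrDefault(e.getKey(), Function.identity());
            out.put(e.getKey(), f.apply(e.getValue()));
        }
        return out;
    }

    // Assumed format: ISO-8601 date string such as "2018-10-30".
    static Object stringToDate(Object value) {
        return LocalDate.parse((String) value);
    }

    // Assumed encoding: microseconds since the Unix epoch (Avro timestamp-micros).
    static Object microsToInstant(Object value) {
        long micros = (Long) value;
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }
}
```

A caller would build one instance per transform, for example `new PerTransformConversions().with("d", PerTransformConversions::stringToDate)`, so two transforms in the same pipeline can disagree about how a string field is interpreted.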

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 160730)
    Time Spent: 2h 50m  (was: 2h 40m)

> Add Cast transform for Rows
> ---------------------------
>
>                 Key: BEAM-5918
>                 URL: https://issues.apache.org/jira/browse/BEAM-5918
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Gleb Kanterov
>            Assignee: Kenneth Knowles
>            Priority: Major
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> There is a need for a generic transform that, given two Row schemas, will 
> convert rows between them. It must be possible to opt out of certain kinds 
> of conversions; for instance, converting ints to shorts can cause overflow. 
> As another example, a schema could have a nullable field that never holds a 
> NULL value in practice, because NULLs were filtered out.
> What is needed:
> - widening values (e.g., int -> long)
> - narrowing (e.g., int -> short)
> - runtime check for overflow while narrowing
> - ignoring nullability (nullable=true -> nullable=false)
> - weakening nullability (nullable=false -> nullable=true)
> - projection (Schema(a: Int32, b: Int32) -> Schema(a: Int32))
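
The requirements listed above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `RowCastSketch` and its methods are hypothetical names, rows are modelled as `Map<String, Object>` rather than Beam `Row`, and none of this reflects how the actual transform is implemented.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helpers, one per kind of conversion named in the issue.
public class RowCastSketch {
    // Widening (int -> long) is always lossless.
    static long widen(int value) {
        return value;
    }

    // Narrowing (int -> short) with a runtime overflow check.
    static short narrowChecked(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("int value " + value + " overflows short");
        }
        return (short) value;
    }

    // Ignoring nullability (nullable=true -> nullable=false): fail on an actual null.
    // Weakening (nullable=false -> nullable=true) needs no check at all.
    static <T> T dropNullability(T value, String field) {
        if (value == null) {
            throw new IllegalArgumentException("null in non-nullable field " + field);
        }
        return value;
    }

    // Projection: keep only the requested fields, in the requested order.
    static Map<String, Object> project(Map<String, Object> row, List<String> fields) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String f : fields) {
            if (!row.containsKey(f)) {
                throw new IllegalArgumentException("missing field " + f);
            }
            out.put(f, row.get(f));
        }
        return out;
    }
}
```

Opting out of a conversion kind would then amount to not wiring the corresponding helper into a given transform, so that an attempted int -> short cast is rejected up front instead of being checked per value.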



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
