[ 
https://issues.apache.org/jira/browse/BEAM-12464?focusedWorklogId=636844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-636844
 ]

ASF GitHub Bot logged work on BEAM-12464:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Aug/21 10:10
            Start Date: 11/Aug/21 10:10
    Worklog Time Spent: 10m 
      Work Description: reubenvanammers commented on pull request #14974:
URL: https://github.com/apache/beam/pull/14974#issuecomment-896694002


   > My understanding of the problem here is that [OneOfType.create(List, 
Map)](https://beam.apache.org/releases/javadoc/2.31.0/org/apache/beam/sdk/schemas/logicaltypes/OneOfType.html#create-java.util.List-java.util.Map-)
 lets the user set arbitrary positions for each of the component types of the 
OneOfType. So there's no obvious place to put the oneof type in the schema. In 
the example in the test, the oneof component types are all consecutive (numbers 
2-5), so it makes sense to put the oneof at position 2 in the schema. But if 
they're nonconsecutive, say 2 and 4, it seems kind of arbitrary to put it at 
position 2 instead of position 4. So we may want to just keep the current 
behavior and document it, including helpful error messages if we can.
   
   Hi @ibzib , thanks for having a look!
   
   My understanding is that descriptor.getFields() returns the fields in order 
that they are in the .proto file, not in order of the field number. This means 
that in the previous as well as the proposed change, the order of the resultant 
schema is based on the ordering of the fields in the .proto file, and not the 
field descriptor number. Because of this, all of the oneof fields are 
necessarily going to be "clumped", as oneof blocks are contiguous. Therefore, 
the position of the first field in the oneofblock is a useful and natural 
location for the entirety of the oneof fields for that particular oneof.
   
   Does that make sense? Happy to write some unit tests for non monotonically 
increasing proto field descriptor numbers when I've got time if thats the main 
objection.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 636844)
    Time Spent: 2h  (was: 1h 50m)

> Change ProtoSchemaTranslator beam schema creation to match the order for 
> protobufs containing Oneof fields
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-12464
>                 URL: https://issues.apache.org/jira/browse/BEAM-12464
>             Project: Beam
>          Issue Type: Improvement
>          Components: extensions-java-protobuf
>            Reporter: Reuben van Ammers
>            Priority: P2
>              Labels: stale-P2
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, when ProtoSchemaTranslator creates the beam schema from a protobuf 
> definition it always puts the Oneofs at the start of the beam schema due to 
> Oneofs being created from the code first. This means that the order of the 
> fields doesn't match the order of the protobuf defintion. As the schema 
> generation is used when converting from beam rows to protobufs, it 
> additionally means that it is impossible to convert from a beam row where the 
> oneof fields are not the first fields in the beamrow.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to