damccorm opened a new issue, #20509:
URL: https://github.com/apache/beam/issues/20509

   ### Problem/Status
   The schema proto has an [encoding_position 
field](https://github.com/apache/beam/blob/2c619c81082839e054f16efee9311b9f74b6e436/model/pipeline/src/main/proto/schema.proto#L55)
 that no row coder implementation currently uses. The field is intended to 
specify an alternative order in which 
[beam:coder:row:v1 
implementations](https://github.com/apache/beam/blob/1e60f383fb39b9ff8d44edcbe5357da4c1e52378/model/pipeline/src/main/proto/beam_runner_api.proto#L937-L990)
 encode fields. Today every implementation ignores it and always encodes 
fields in the order that they appear in the schema.
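
To make the intended behavior concrete, here is a minimal sketch in plain Python. It does not use the Beam API: `Field`, `encoding_order`, and `encode_values` are hypothetical stand-ins, and the only thing taken from the issue is the idea that fields are written in `encoding_position` order rather than declaration order.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field; not the Beam proto class."""
    name: str
    encoding_position: int


def encoding_order(fields: List[Field]) -> List[Field]:
    """Return the fields sorted by encoding_position instead of schema order."""
    return sorted(fields, key=lambda f: f.encoding_position)


def encode_values(fields: List[Field], row: Dict[str, Any]) -> List[Any]:
    """List the row's values in the order a position-aware coder would write them."""
    return [row[f.name] for f in encoding_order(fields)]


# Declared order is (a, b), but the encoding positions say b goes first.
schema = [Field("a", encoding_position=1), Field("b", encoding_position=0)]
values = encode_values(schema, {"a": 1, "b": "x"})
```

A coder that ignores `encoding_position`, as the current implementations do, would instead write the values in declared order.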
   
   ### Motivation
   The idea with the encoding position is that it will give runners a way to 
enforce schema compatibility (BEAM-9502) by re-ordering the way fields are 
encoded when the schema changes between two job submissions. Schema changes 
could be due to field re-ordering or to field additions/deletions.
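
One way a runner might derive encoding positions for an updated schema is sketched below. This is an assumption for illustration only, not a description of how any runner does it: the helper `assign_encoding_positions` is hypothetical, fields are identified by name, and deletions are not handled beyond leaving a gap in the positions.

```python
from typing import Dict, List


def assign_encoding_positions(old_names: List[str],
                              new_names: List[str]) -> Dict[str, int]:
    """Hypothetical helper: keep old positions for surviving fields and
    append newly added fields after all previously used positions."""
    old_positions = {name: i for i, name in enumerate(old_names)}
    next_position = len(old_names)
    positions = {}
    for name in new_names:
        if name in old_positions:
            # Field existed before: reuse its original position.
            positions[name] = old_positions[name]
        else:
            # New field: take the next unused position.
            positions[name] = next_position
            next_position += 1
    return positions


# The updated schema re-orders the fields and adds "email".
positions = assign_encoding_positions(["id", "name"], ["name", "email", "id"])
```

With this policy the fields shared by both schemas keep their original relative order in the encoded form, regardless of how the new schema declares them.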
   
   ### Code pointers
   The Python beam:coder:row:v1 implementation lives in 
[row_coder.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py).
   The Java implementation is a little more complicated, distributed between 
[SchemaCoder](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/SchemaCoder.java),
 
[RowCoder](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoder.java),
 and 
[RowCoderGenerator](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java).
 RowCoderGenerator contains the code relevant to this issue: it uses ByteBuddy 
to generate bytecode for the coder. We need it to generate code that encodes 
fields in the order specified by encoding_position.
   
   ### Testing
   Python and Java implementations should be tested with unit tests 
([RowCoderTest](https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/RowCoderTest.java),
 
[row_coder_test](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder_test.py)).
 We should also test them for compatibility by adding test cases that exercise 
the encoding_position in 
[standard_coders.yaml](https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml).
 These tests will be executed by 
[CommonCoderTest](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/CommonCoderTest.java)
 and 
[standard_coders_test](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/standard_coders_test.py).
 There's some example code for generating a new test case 
[here](https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L387-L400).
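
The property a compatibility test would want to check can be sketched with a toy encoder. Everything here is hypothetical (`toy_encode`, `toy_decode`, and the JSON wire format are illustrative only and unrelated to the real beam:coder:row:v1 encoding): two schemas that declare fields in different orders but share encoding positions should produce identical bytes and decode each other's output.

```python
import json
from typing import Any, Dict, List, Tuple

# A toy schema: (field name, encoding_position) pairs in declared order.
ToySchema = List[Tuple[str, int]]


def toy_encode(schema: ToySchema, row: Dict[str, Any]) -> bytes:
    """Write values in encoding_position order using a toy JSON format."""
    ordered = sorted(schema, key=lambda field: field[1])
    return json.dumps([row[name] for name, _ in ordered]).encode("utf-8")


def toy_decode(schema: ToySchema, payload: bytes) -> Dict[str, Any]:
    """Read values back, assuming they were written in encoding_position order."""
    ordered = sorted(schema, key=lambda field: field[1])
    values = json.loads(payload.decode("utf-8"))
    return {name: value for (name, _), value in zip(ordered, values)}


def check_reordered_schemas_are_compatible() -> None:
    row = {"id": 7, "name": "beam"}
    schema_v1 = [("id", 0), ("name", 1)]
    schema_v2 = [("name", 1), ("id", 0)]  # same positions, new declared order
    assert toy_encode(schema_v1, row) == toy_encode(schema_v2, row)
    assert toy_decode(schema_v2, toy_encode(schema_v1, row)) == row


check_reordered_schemas_are_compatible()
```

Real tests would assert the same property against the actual coders and against the expected bytes recorded in standard_coders.yaml.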
   
   Imported from Jira 
[BEAM-10277](https://issues.apache.org/jira/browse/BEAM-10277). Original Jira 
may contain additional context.
   Reported by: bhulette.

