elkhand opened a new issue #2235:
URL: https://github.com/apache/iceberg/issues/2235
Hi,

The Flink schema validation call (`TypeUtil.validateWriteSchema(schema, writeSchema, true, true)`) always passes `true` for both `checkNullability` and `checkOrdering`. Is there a particular reason for hardcoding these, @openinx?

In the `org.apache.iceberg.flink.sink.FlinkSink` class:
```
static RowType toFlinkRowType(Schema schema, TableSchema requestedSchema) {
  if (requestedSchema != null) {
    // Convert the flink schema to iceberg schema firstly, then reassign ids to match the existing iceberg schema.
    Schema writeSchema = TypeUtil.reassignIds(FlinkSchemaUtil.convert(requestedSchema), schema);
    TypeUtil.validateWriteSchema(schema, writeSchema, true, true);
    ...
```
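For comparison, here is a minimal sketch of what a configurable variant could look like. The extra `checkNullability`/`checkOrdering` parameters are hypothetical and not part of the current `FlinkSink` API, and the method body after the validation call is only a paraphrase of the current implementation:

```
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.types.logical.RowType;
import org.apache.iceberg.Schema;
import org.apache.iceberg.flink.FlinkSchemaUtil;
import org.apache.iceberg.types.TypeUtil;

// Hypothetical variant: the validation flags are supplied by the caller
// (e.g. read from sink builder options) instead of being hardcoded to true.
static RowType toFlinkRowType(Schema schema, TableSchema requestedSchema,
                              boolean checkNullability, boolean checkOrdering) {
  if (requestedSchema != null) {
    // Same conversion and id reassignment as the current implementation.
    Schema writeSchema = TypeUtil.reassignIds(FlinkSchemaUtil.convert(requestedSchema), schema);
    TypeUtil.validateWriteSchema(schema, writeSchema, checkNullability, checkOrdering);
    // Keep using the requested flink schema for reads, as the current code does.
    return (RowType) requestedSchema.toRowDataType().getLogicalType();
  } else {
    return FlinkSchemaUtil.convert(schema);
  }
}
```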
In Spark, by contrast, both flags are read from the write options via the `checkNullability` and `checkOrdering` helpers.

In the `org.apache.iceberg.spark.source.SparkWriteBuilder` class, for batch:
```
@Override
public BatchWrite buildForBatch() {
  // Validate
  Schema writeSchema = SparkSchemaUtil.convert(table.schema(), dsSchema);
  TypeUtil.validateWriteSchema(table.schema(), writeSchema,
      checkNullability(spark, options), checkOrdering(spark, options));
  ...
```
and for streaming:
```
@Override
public StreamingWrite buildForStreaming() {
  // Validate
  Schema writeSchema = SparkSchemaUtil.convert(table.schema(), dsSchema);
  TypeUtil.validateWriteSchema(table.schema(), writeSchema,
      checkNullability(spark, options), checkOrdering(spark, options));
  ...
```
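For reference, the Spark-side helpers resolve each flag from a session-level setting combined with a per-write option. A simplified sketch, assuming Iceberg's `check-nullability` write option and a matching `spark.sql.iceberg.check-nullability` session property (treat the exact property names as assumptions):

```
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.util.CaseInsensitiveStringMap;

// Simplified sketch: the check stays enabled only while both the session
// conf and the per-write option leave it on; setting either to false
// disables it. Property names are assumptions.
private static boolean checkNullability(SparkSession spark, CaseInsensitiveStringMap options) {
  boolean sessionLevel = Boolean.parseBoolean(
      spark.conf().get("spark.sql.iceberg.check-nullability", "true"));
  boolean writeLevel = options.getBoolean("check-nullability", true);
  return sessionLevel && writeLevel;
}
```

A Spark job can therefore relax the validation per write, e.g. (hypothetical usage):

```
// Disable both checks for a single append.
data.write()
    .format("iceberg")
    .option("check-nullability", "false")
    .option("check-ordering", "false")
    .mode("append")
    .save(tableLocation);
```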