xiarixiaoyao commented on issue #8020:
URL: https://github.com/apache/hudi/issues/8020#issuecomment-1441644783
@simonjobs
Thank you for your feedback.
Unfortunately, the Spark engine does not allow this. Spark supports widening a decimal's precision, but you must widen the precision and the scale at the same time.
That is to say:
decimal(3, 0) -> decimal(4, 0) // sadly, forbidden by Spark
decimal(3, 0) -> decimal(4, 1) // fine, supported
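As a quick sanity check (a plain-Scala sketch, no Spark needed), you can inspect the precision and scale that `java.math.BigDecimal` reports for the two values; this is what a Spark decimal literal's type is derived from:

```scala
// Sketch only: java.math.BigDecimal exposes the precision/scale that a
// decimal column's type is derived from. The object name is illustrative.
object DecimalShapes extends App {
  val before = new java.math.BigDecimal("123")   // maps to decimal(3, 0)
  val after  = new java.math.BigDecimal("123.4") // maps to decimal(4, 1)

  // precision counts all digits, scale counts digits after the point
  println(s"before: precision=${before.precision}, scale=${before.scale}")
  println(s"after:  precision=${after.precision}, scale=${after.scale}")
}
```

Note that both precision (3 -> 4) and scale (0 -> 1) grow together, which is the only widening Spark accepts here.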
The following test code works:
```scala
test("Test multi change data type3") {
  withRecordType()(withTempDir { tmp =>
    Seq("COPY_ON_WRITE").foreach { tableType =>
      val tableName = generateTableName
      val tablePath = s"${new Path(tmp.getCanonicalPath, tableName).toUri.toString}"
      if (HoodieSparkUtils.gteqSpark3_1) {
        spark.sql("set hoodie.schema.on.read.enable=true")
        // NOTE: This is required since this test uses type coercions which were only
        // permitted in Spark 2.x and are disallowed by default in Spark 3.x
        spark.sql("set spark.sql.storeAssignmentPolicy=legacy")
        val df1 = spark.range(0, 5).toDF("id")
          .withColumn("cc", lit(new java.math.BigDecimal("123"))) // decimal(3, 0)
        df1.write.format("hudi").
          options(getQuickstartWriteConfigs).
          option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, tableType).
          option(PRECOMBINE_FIELD_OPT_KEY, "id").
          option(RECORDKEY_FIELD_OPT_KEY, "id").
          option("hoodie.schema.on.read.enable", "true").
          option(TABLE_NAME.key(), tableName).
          option("hoodie.table.name", tableName).
          mode("overwrite").
          save(tablePath)
        val df2 = spark.range(0, 5).toDF("id")
          .withColumn("cc", lit(new java.math.BigDecimal("123.4"))) // decimal(4, 1)
        df2.write.format("hudi").
          options(getQuickstartWriteConfigs).
          option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, tableType).
          option(PRECOMBINE_FIELD_OPT_KEY, "id").
          option(RECORDKEY_FIELD_OPT_KEY, "id").
          option("hoodie.schema.on.read.enable", "true").
          option("hoodie.datasource.write.reconcile.schema", "true").
          option(TABLE_NAME.key(), tableName).
          option("hoodie.table.name", tableName).
          mode("append").
          save(tablePath)
        spark.read.format("hudi").load(tablePath).show(false)
      }
    }
  })
}
```