Limess opened a new issue #4031: URL: https://github.com/apache/hudi/issues/4031
**Describe the problem you faced**

It is currently possible to set `_hoodie_is_deleted` to a non-null value other than `true`. In this scenario, the column is written to the target table. At that point it is included in the table schema, which seems undesirable in all cases.

**To Reproduce**

Steps to reproduce the behavior:

1. Create a Hudi table.
2. Upsert a record with `_hoodie_is_deleted="some-string"`.
3. Observe that the record is written to the underlying Hudi table.

**Expected behavior**

The string value should be treated as truthy, and the record should be ignored/deleted in the target table. I can't see any scenario where you would want to populate this column.

**Environment Description**

EMR 6.4.0

* Hudi version: 0.9.0
* Spark version: 3.1.2
* Hive version: Hive 3.1.2
* Hadoop version: Amazon 3.2.1
* Storage (HDFS/S3/GCS..): S3
* Running on Docker? (yes/no): no
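To make the expected behavior concrete, here is a minimal plain-Python sketch of the rule I have in mind. It is not Hudi code and does not use any Hudi API: `is_delete_marker` and `split_records` are hypothetical helper names, records are modeled as plain dicts, and the exact truthiness rule (only a missing value, `None`, or `False` keeps the record) is my assumption about what "truthy" should mean here.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Field name taken from the issue; everything else here is illustrative.
DELETE_FIELD = "_hoodie_is_deleted"


def is_delete_marker(value: Any) -> bool:
    """Assumed rule: a missing/None or False value keeps the record;
    any other value (True, a string, a number) marks it as deleted."""
    return value is not None and value is not False


def split_records(
    records: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Partition incoming records into (upserts, deletes) using the rule above."""
    upserts: List[Dict[str, Any]] = []
    deletes: List[Dict[str, Any]] = []
    for record in records:
        if is_delete_marker(record.get(DELETE_FIELD)):
            deletes.append(record)
        else:
            upserts.append(record)
    return upserts, deletes


incoming = [
    {"id": 1, DELETE_FIELD: "some-string"},  # the case from this issue
    {"id": 2, DELETE_FIELD: None},
    {"id": 3},                               # field absent
    {"id": 4, DELETE_FIELD: False},
    {"id": 5, DELETE_FIELD: True},
]
upserts, deletes = split_records(incoming)
```

Under this rule the record with `"some-string"` ends up in `deletes` rather than being written with an extra populated column. Note that the literal string `"false"` would also count as a delete marker in this sketch; whether that is desirable is open for discussion.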
