nsivabalan edited a comment on pull request #2654: URL: https://github.com/apache/hudi/pull/2654#issuecomment-801212605
@n3nash @vinothchandar: It looks like we can't really allow NullSchemaProvider in our delta schema flow. We have [SparkAvroPostProcessor](https://github.com/apache/hudi/blob/968488fa3a3c67962ed5b60e00836e71c730a9a5/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SparkAvroPostProcessor.java#L42), which tries to convert the schema to a Spark SQL schema and back. So if the user passes in a null-type schema (Schema.create(Schema.Type.NULL)), we run into NPE issues.

To recap:
- If the entire schema provider is null, we are good. If the schema provider returns null for the schema (getTargetSchema() == null), we fall back to RowBasedSchemaProvider.
- But if a user passes in a NullSchemaProvider, which returns Schema.create(Schema.Type.NULL) for the target schema, I am not sure we should make fixes to allow this flow.

Let me know if this sounds reasonable. We can close this out and tell the customer not to set any schema provider at all.
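To make the cases in the recap concrete, here is a minimal, self-contained sketch of the decision logic. It deliberately uses stand-in types rather than the real Hudi or Avro classes: `SchemaType`, `SchemaProvider`, `Resolution`, and `resolve` are hypothetical names for illustration only. The only behavior taken from the comment itself is the fallback to RowBasedSchemaProvider when the target schema is null, and the fact that a null-type schema is the problematic case.

```java
// Sketch only: stand-in types, NOT the real Hudi or Avro classes.
final class SchemaProviderFallbackSketch {

    /** Hypothetical stand-in for an Avro schema; only the top-level type matters here. */
    enum SchemaType { RECORD, NULL }

    /** Hypothetical stand-in for a schema provider; getTargetSchema() may return null. */
    interface SchemaProvider {
        SchemaType getTargetSchema();
    }

    /** Hypothetical labels for the outcomes described in the recap. */
    enum Resolution {
        NO_PROVIDER,              // no schema provider configured at all: fine
        FALL_BACK_TO_ROW_BASED,   // provider returned null: use RowBasedSchemaProvider
        UNSUPPORTED_NULL_TYPE,    // provider returned a NULL-type schema: not supported
        USE_PROVIDER              // provider returned a usable schema
    }

    static Resolution resolve(SchemaProvider provider) {
        if (provider == null) {
            return Resolution.NO_PROVIDER;
        }
        SchemaType target = provider.getTargetSchema();
        if (target == null) {
            return Resolution.FALL_BACK_TO_ROW_BASED;
        }
        if (target == SchemaType.NULL) {
            // Analogous to Schema.create(Schema.Type.NULL): a schema object exists,
            // but it cannot be round-tripped through a Spark SQL schema.
            return Resolution.UNSUPPORTED_NULL_TYPE;
        }
        return Resolution.USE_PROVIDER;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve(() -> null));
        System.out.println(resolve(() -> SchemaType.NULL));
        System.out.println(resolve(() -> SchemaType.RECORD));
    }
}
```

The point of the sketch is that a null-type schema is a non-null object, so a plain null check on getTargetSchema() does not catch it. It would need its own explicit check, or users should simply leave the schema provider unset.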
