Hi all,

When reading from a schema-inferable source such as Parquet, if no new data is found then, if I understand correctly, no schema can be inferred, and none needs to be.
The method org.apache.hudi.utilities.sources.InputBatch#getSchemaProvider requires a non-null schemaProvider, and org.apache.hudi.utilities.deltastreamer.DeltaSync#readFromSource calls getSchemaProvider() in all cases, including the no-new-data case. As a result, an exception is thrown asking for a schema provider to be set, even when reading from a schema-inferable Parquet source. I don't think this is ideal.

I have a short draft PR that accepts a null schema provider when there is no new data:
https://github.com/apache/incubator-hudi/pull/1584/files

I would actually prefer another approach: having getSchemaProvider() return Option<SchemaProvider>.

In case I have misunderstood the logic or the use case, I'd appreciate some feedback on this change.

Thank you.

Regards,
Raymond
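
P.S. To make the Option-returning idea concrete, here is a minimal sketch. It is not the actual Hudi code: the SchemaProvider interface and InputBatch class below are simplified stand-ins I wrote for illustration, java.util.Optional is used in place of Hudi's own Option type so the snippet compiles on its own, and the describe() helper is hypothetical. The point is only that the caller decides what to do when no schema provider is present, instead of the getter throwing.

```java
import java.util.Optional;

public class InputBatchSketch {

    // Simplified stand-in for Hudi's SchemaProvider (assumption: one method is enough here).
    interface SchemaProvider {
        String getSourceSchema();
    }

    // Simplified stand-in for InputBatch; field names are illustrative.
    static final class InputBatch<T> {
        private final Optional<T> batch;
        private final String checkpointForNextBatch;
        private final SchemaProvider schemaProvider; // may be null when no new data was read

        InputBatch(Optional<T> batch, String checkpointForNextBatch, SchemaProvider schemaProvider) {
            this.batch = batch;
            this.checkpointForNextBatch = checkpointForNextBatch;
            this.schemaProvider = schemaProvider;
        }

        Optional<T> getBatch() {
            return batch;
        }

        String getCheckpointForNextBatch() {
            return checkpointForNextBatch;
        }

        // Proposed shape: wrap the possibly-null provider instead of throwing.
        Optional<SchemaProvider> getSchemaProvider() {
            return Optional.ofNullable(schemaProvider);
        }
    }

    // Hypothetical caller: only touches the schema when a provider is present.
    static String describe(InputBatch<String> input) {
        return input.getSchemaProvider()
                .map(provider -> "schema: " + provider.getSourceSchema())
                .orElse("no schema provider, skipping schema lookup");
    }

    public static void main(String[] args) {
        InputBatch<String> withData =
                new InputBatch<>(Optional.of("rows"), "ckpt-2", () -> "record");
        InputBatch<String> noNewData =
                new InputBatch<>(Optional.empty(), "ckpt-2", null);

        System.out.println(describe(withData));
        System.out.println(describe(noNewData));
    }
}
```

With this shape, the no-new-data path in the caller can simply skip schema handling rather than requiring users of schema-inferable sources to configure a schema provider.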