Amar1404 commented on issue #8626: URL: https://github.com/apache/hudi/issues/8626#issuecomment-1534104878
hi @danny0405 - What I mean is: if I don't provide any schema class, `spark.read.json()` will infer the schema automatically, but the code currently uses `spark.read.schema().json()`. If no schema is supplied, we don't want to restrict the number of columns coming from DocumentDB, since in NoSQL systems the columns evolve over time. Say we are ingesting 1000 datasets: we would then need either 1000 schema files or a schema registry, and to have a new column included automatically we would have to change the files or the registry, which is a restriction.
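To illustrate the point, here is a minimal plain-Python sketch (not Spark itself, and the function name is illustrative) of the union-of-keys inference that `spark.read.json()` performs when no schema is given: a column that appears only in later documents is still picked up, with no schema file or registry change needed.

```python
import json

def infer_schema(json_lines):
    """Union the top-level fields seen across all records, mimicking
    (in spirit) what spark.read.json() does when no schema is supplied."""
    fields = {}
    for line in json_lines:
        record = json.loads(line)
        for key, value in record.items():
            # Keep the first type seen for each field.
            fields.setdefault(key, type(value).__name__)
    return fields

# Day 1: documents have two columns; later, a new column appears.
docs = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b", "new_col": true}',
]
print(infer_schema(docs))  # the evolved column "new_col" is included automatically
```

With an explicit `spark.read.schema().json()` call, by contrast, `new_col` would be silently dropped unless every schema definition were updated first.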
