[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan updated HUDI-1716:
--------------------------------------
    Description:
It looks like the realtime view of a MOR table fails when the schema present in an existing log file is later evolved to add a new field. Writing works without issues, but reading fails.

More info: [https://github.com/apache/hudi/issues/2675]

Logs from a local run: [https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198]
Diff used to generate the logs above: [https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec]

Steps to reproduce in spark shell:
 # Create a MOR table with schema1.
 # Ingest with schema1 until log files are created (verify via hudi-cli). A single batch of updates did not produce log files for me; if none appear, run more rounds of updates until they do.
 # Create a new schema2 that adds one field to schema1, then ingest a batch with schema2 that updates existing records.
 # Read the entire dataset.

  was:
It looks like the realtime view of a MOR table fails when the schema present in an existing log file is later evolved to add a new field. Writing works without issues, but reading fails.

More info: [https://github.com/apache/hudi/issues/2675]

Logs from a local run: [https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198]
Diff used to generate the logs above: [https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec]

> rt view w/ MOR tables fails after schema evolution
> --------------------------------------------------
>
>                 Key: HUDI-1716
>                 URL: https://issues.apache.org/jira/browse/HUDI-1716
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Storage Management
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: sev:critical, user-support-issues
>             Fix For: 0.9.0
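
The reproduction steps in the description could be scripted roughly as follows. This is a minimal PySpark sketch, not the reporter's actual script: the table name, column names, record values, and the `run_repro` helper are all hypothetical, and it assumes a Spark session with the Hudi Spark bundle on the classpath. It does not check that log files exist; that still has to be verified via hudi-cli as noted in step 2.

```python
# Hypothetical column layout for schema1; the issue does not give the real one.
SCHEMA1_COLUMNS = ["id", "ts", "name", "part"]
# schema2 = schema1 plus one additional field (step 3).
SCHEMA2_COLUMNS = SCHEMA1_COLUMNS + ["new_field"]


def hudi_write_options(table_name):
    """Datasource options for upserting into a MERGE_ON_READ table."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": "id",
        "hoodie.datasource.write.precombine.field": "ts",
        "hoodie.datasource.write.partitionpath.field": "part",
    }


def make_rows(columns, ids, ts):
    """Build one tuple per record id, ordered to match `columns`."""
    rows = []
    for i in ids:
        values = {
            "id": i,
            "ts": ts,
            "name": f"name-{i}-v{ts}",
            "part": "p0",
            "new_field": f"extra-{i}",
        }
        rows.append(tuple(values[c] for c in columns))
    return rows


def run_repro(spark, base_path, update_rounds=3):
    """Run steps 1-4 against `base_path` and return the snapshot DataFrame."""
    opts = hudi_write_options("hudi_1716_repro")  # hypothetical table name
    ids = range(10)

    # Steps 1-2: create the table with schema1, then keep updating the same
    # keys. More rounds may be needed before log files show up.
    for ts in range(1, update_rounds + 2):
        df = spark.createDataFrame(make_rows(SCHEMA1_COLUMNS, ids, ts), SCHEMA1_COLUMNS)
        mode = "overwrite" if ts == 1 else "append"
        df.write.format("hudi").options(**opts).mode(mode).save(base_path)

    # Step 3: update the existing records using schema2.
    ts = update_rounds + 2
    df2 = spark.createDataFrame(make_rows(SCHEMA2_COLUMNS, ids, ts), SCHEMA2_COLUMNS)
    df2.write.format("hudi").options(**opts).mode("append").save(base_path)

    # Step 4: read the entire dataset through the realtime (snapshot) view.
    # Older Hudi releases may need a glob path such as base_path + "/*/*".
    return (
        spark.read.format("hudi")
        .option("hoodie.datasource.query.type", "snapshot")
        .load(base_path)
    )
```

Calling `run_repro(spark, "/tmp/hudi_1716_repro").show()` from a PySpark shell would trigger the read in step 4, which is where the failure described above is expected to surface; the path here is only a placeholder.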