[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan updated HUDI-1716:
--------------------------------------
Description:
Looks like the realtime view of a MOR table fails when the schema present in an
existing log file is later evolved to add a new field. Writing works without
issues, but reading fails.
More info: [https://github.com/apache/hudi/issues/2675]
Logs from a local run:
[https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198]
Diff used to generate the logs above:
[https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec]
Steps to reproduce in spark shell:
# Create a MOR table with schema1.
# Ingest with schema1 until log files are created (verify via hudi-cli). I
didn't see log files after just one batch of updates; if you don't either, run
more rounds until log files appear.
# Create a new schema2 that adds one field, then ingest a batch with schema2
that updates existing records.
# Read the entire dataset.
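The steps above could be sketched in spark-shell roughly as follows. This is a minimal sketch, not the reporter's actual script: the base path, table name, column names (id, name, ts, part, new_field), and number of update rounds are all hypothetical, and the glob-style load path assumes a Hudi release from around the time of this report. The option keys are the standard Hudi Spark datasource configs.

```scala
// Assumes spark-shell was started with the Hudi Spark bundle on the classpath
// and Kryo serialization enabled, as in the Hudi quick start.
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.lit
import spark.implicits._

val basePath = "file:///tmp/hudi_mor_schema_evolution" // hypothetical location
val hudiOptions = Map(
  "hoodie.table.name" -> "mor_schema_evolution",       // hypothetical name
  "hoodie.datasource.write.table.type" -> "MERGE_ON_READ",
  "hoodie.datasource.write.operation" -> "upsert",
  "hoodie.datasource.write.recordkey.field" -> "id",
  "hoodie.datasource.write.precombine.field" -> "ts",
  "hoodie.datasource.write.partitionpath.field" -> "part"
)

// schema1: (id, name, ts, part). Every batch reuses the same keys, so each
// batch after the first is a pure update of existing records.
def batch(ts: Long) =
  (1 to 100).map(i => (i, s"name_$i", ts, "p1")).toDF("id", "name", "ts", "part")

// Step 1: create the MOR table with schema1.
batch(1L).write.format("hudi").options(hudiOptions)
  .mode(SaveMode.Overwrite).save(basePath)

// Step 2: keep updating with schema1. Three rounds is an arbitrary choice;
// check with hudi-cli that log files exist before moving on.
(2L to 4L).foreach { ts =>
  batch(ts).write.format("hudi").options(hudiOptions)
    .mode(SaveMode.Append).save(basePath)
}

// Step 3: schema2 = schema1 plus one new field, updating existing records.
batch(5L).withColumn("new_field", lit("x"))
  .write.format("hudi").options(hudiOptions)
  .mode(SaveMode.Append).save(basePath)

// Step 4: snapshot (realtime) read of the entire dataset. Per this report,
// this is the step expected to fail.
spark.read.format("hudi")
  .option("hoodie.datasource.query.type", "snapshot")
  .load(basePath + "/*/*")
  .show()
```

Whether updates land in log files depends on index and file-sizing settings, so the hudi-cli check in step 2 remains necessary.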
was:
Looks like the realtime view of a MOR table fails when the schema present in an
existing log file is later evolved to add a new field. Writing works without
issues, but reading fails.
More info: [https://github.com/apache/hudi/issues/2675]
Logs from a local run:
[https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198]
Diff used to generate the logs above:
[https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec]
> rt view w/ MOR tables fails after schema evolution
> --------------------------------------------------
>
> Key: HUDI-1716
> URL: https://issues.apache.org/jira/browse/HUDI-1716
> Project: Apache Hudi
> Issue Type: Bug
> Components: Storage Management
> Reporter: sivabalan narayanan
> Priority: Major
> Labels: sev:critical, user-support-issues
> Fix For: 0.9.0
>
>
> Looks like the realtime view of a MOR table fails when the schema present in
> an existing log file is later evolved to add a new field. Writing works
> without issues, but reading fails.
> More info: [https://github.com/apache/hudi/issues/2675]
>
> Logs from a local run:
> [https://gist.github.com/nsivabalan/656956ab313676617d84002ef8942198]
> Diff used to generate the logs above:
> [https://gist.github.com/nsivabalan/84dad29bc1ab567ebb6ee8c63b3969ec]
>
> Steps to reproduce in spark shell:
> # Create a MOR table with schema1.
> # Ingest with schema1 until log files are created (verify via hudi-cli). I
> didn't see log files after just one batch of updates; if you don't either,
> run more rounds until log files appear.
> # Create a new schema2 that adds one field, then ingest a batch with schema2
> that updates existing records.
> # Read the entire dataset.
>
>
>
--
This message was sent by Atlassian Jira (v8.3.4#803005)