nsivabalan opened a new pull request #4509:
URL: https://github.com/apache/hudi/pull/4509
## What is the purpose of the pull request
When a table created via deltastreamer has only one commit and that commit is empty,
the commit metadata may not contain a schema (depending on how the schema provider is
set).
In such cases, an incremental read from this table finds no schema in the commit
metadata and fails with an NPE.
## Brief change log
Fixed the incremental relation to return an empty RDD in such cases.
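
The guard can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `CommitMetadataStub` and `buildScan` are hypothetical names, and a plain `List` stands in for the RDD so the example runs without Spark. It only shows the shape of the check, assuming the schema is absent when the sole commit is empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class IncrementalReadSketch {

    // Hypothetical stand-in for commit metadata; not Hudi's actual class.
    static final class CommitMetadataStub {
        private final String schemaJson; // null or empty when the commit wrote no schema

        CommitMetadataStub(String schemaJson) {
            this.schemaJson = schemaJson;
        }

        // Wrap the possibly-missing schema so callers cannot dereference null.
        Optional<String> schema() {
            return Optional.ofNullable(schemaJson).filter(s -> !s.isEmpty());
        }
    }

    // Hypothetical scan builder: a List stands in for the RDD of rows.
    static List<String> buildScan(CommitMetadataStub latestCommit, List<String> rows) {
        if (!latestCommit.schema().isPresent()) {
            // No schema in the commit metadata: return an empty result
            // instead of failing with a NullPointerException.
            return Collections.emptyList();
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2");
        System.out.println(buildScan(new CommitMetadataStub(null), rows).size());
        System.out.println(buildScan(new CommitMetadataStub("{\"type\":\"record\"}"), rows).size());
    }
}
```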
## Verify this pull request
- I could not reproduce this locally: I tried with parquet DFS and
FileBasedSchemaProvider, so the schema was populated and the incremental query
returned an empty dataframe. Will try to poke around to validate the fix.
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under
an umbrella JIRA.