Manoj Govindassamy created HUDI-3301:
----------------------------------------
Summary: Metadata table inline reading should be stateless and thread safe
Key: HUDI-3301
URL: https://issues.apache.org/jira/browse/HUDI-3301
Project: Apache Hudi
Issue Type: Task
Reporter: Manoj Govindassamy
Assignee: Ethan Guo
Fix For: 0.11.0
Metadata table inline reading (enable.full.scan.log.files = false) currently
alters instance member fields and is not thread safe.
When inline reading is enabled, HoodieMetadataMergedLogRecordReader doesn't
do a full read of the log and base files and doesn't fill in the
ExternalSpillableMap records cache. Each getRecordsByKeys() call therefore
re-reads the log and base files by design. The issue is that this read alters
instance members, even though the records it fills in are relevant only to
that request. Any concurrent getRecordsByKeys() call modifies the same member
variables, leading to an NPE.
To avoid this, a temporary fix that makes getRecordsByKeys() a synchronized
method has been pushed to master. But this fix doesn't solve all use cases. We
need to make the whole class stateless and thread safe for inline reading.
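
A minimal sketch of the stateless direction described above, not the actual
Hudi code: InlineLookupSketch and its in-memory storage map are hypothetical
stand-ins for the reader and for the log and base files, and only the method
name getRecordsByKeys() is borrowed from this issue. Each call builds its
result in a local map, so concurrent callers share no mutable instance state
and no synchronized keyword is needed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for an inline metadata reader; not Hudi's real class. */
final class InlineLookupSketch {
    // Read-only stand-in for the log and base files (assumption for illustration).
    // It is never modified after construction, so threads can share it safely.
    private final Map<String, String> storage;

    InlineLookupSketch(Map<String, String> storage) {
        this.storage = Collections.unmodifiableMap(new HashMap<>(storage));
    }

    /**
     * Looks up the requested keys. The result map is local to this call, so
     * no instance field is written and concurrent calls cannot interfere.
     */
    Map<String, Optional<String>> getRecordsByKeys(List<String> keys) {
        Map<String, Optional<String>> perRequest = new HashMap<>();
        for (String key : keys) {
            // Missing keys map to Optional.empty() instead of null.
            perRequest.put(key, Optional.ofNullable(storage.get(key)));
        }
        return perRequest;
    }
}
```

By contrast, a reader that keeps the per-request records in a member field has
to serialize every caller, which is what the temporary synchronized fix does.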
--
This message was sent by Atlassian Jira
(v8.20.1#820001)