Prashant Wason created HUDI-6092:
------------------------------------
Summary: Reuse schema objects while reading large number of log blocks
Key: HUDI-6092
URL: https://issues.apache.org/jira/browse/HUDI-6092
Project: Apache Hudi
Issue Type: Improvement
Reporter: Prashant Wason
Assignee: Prashant Wason
Some log files contain a large number of log blocks. When such a log file is
read, the schema string is read from each log block's header and parsed into a
Schema object. The schema string is most likely identical across blocks, so
parsing it again for every block adds CPU overhead for the parsing itself and
memory overhead from the duplicate Schema objects.
An optimization is to cache the parsed schema objects.
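The caching idea could be sketched roughly as below. This is only an
illustration, not the actual Hudi change: the class names SchemaCacheSketch and
ParsedSchemaCache are hypothetical, and the parser is passed in as a plain
function so the example runs without the Avro dependency. In Hudi the parser
would presumably wrap Avro's new Schema.Parser().parse(schemaString).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SchemaCacheSketch {

    /**
     * Hypothetical cache keyed by the raw schema string from the block header.
     * S stands in for the parsed schema type (assumed to be Avro's Schema in Hudi).
     */
    static final class ParsedSchemaCache<S> {
        private final ConcurrentHashMap<String, S> cache = new ConcurrentHashMap<>();
        private final Function<String, S> parser;

        ParsedSchemaCache(Function<String, S> parser) {
            this.parser = parser;
        }

        /** Returns the cached object, parsing only on the first request for a given string. */
        S get(String schemaString) {
            return cache.computeIfAbsent(schemaString, parser);
        }

        int size() {
            return cache.size();
        }
    }

    public static void main(String[] args) {
        AtomicInteger parseCalls = new AtomicInteger();

        // Stand-in parser that counts invocations; a real one would build a Schema.
        ParsedSchemaCache<Object> cache = new ParsedSchemaCache<>(s -> {
            parseCalls.incrementAndGet();
            return new Object();
        });

        // Simulate reading many log blocks whose headers carry the same schema string.
        String headerSchema = "{\"type\":\"record\",\"name\":\"r\",\"fields\":[]}";
        int blocks = 1000;
        for (int i = 0; i < blocks; i++) {
            cache.get(headerSchema);
        }

        System.out.println("blocks read: " + blocks + ", parse calls: " + parseCalls.get());
    }
}
```

One design point to weigh: an unbounded map keyed by schema string keeps every
distinct schema alive for the lifetime of the cache, so a size-bounded or
weakly referenced cache may be preferable if many different schemas are
expected.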
--
This message was sent by Atlassian Jira
(v8.20.10#820010)