jonvex commented on code in PR #11935:
URL: https://github.com/apache/hudi/pull/11935#discussion_r1757257568
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -190,18 +193,19 @@ public Schema getSchema() {
@Override
public void close() {
- try {
- if (sharedHFileReader.isPresent()) {
- sharedHFileReader.get().close();
- }
- } catch (IOException e) {
- throw new HoodieIOException("Error closing the HFile reader", e);
- }
+ metaInfoMap.clear();
Review Comment:
How many keys do we expect in the map. Clear nulls out all the entries but
doesn't reduce size. That might be fine if it's only in the thousands. But if
it's going to be a lot then maybe we make this not final and do metaInfoMap =
null so it can be GCd
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -190,18 +193,19 @@ public Schema getSchema() {
@Override
public void close() {
- try {
- if (sharedHFileReader.isPresent()) {
- sharedHFileReader.get().close();
- }
- } catch (IOException e) {
- throw new HoodieIOException("Error closing the HFile reader", e);
- }
+ metaInfoMap.clear();
Review Comment:
Oh nevermind PRELOADED_META_INFO_KEYS is the size of the map.
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -244,14 +248,32 @@ private static GenericRecord
getRecordFromKeyValue(KeyValue keyValue,
readerSchema);
}
- private synchronized HFileReader getSharedHFileReader() {
- try {
- if (!sharedHFileReader.isPresent()) {
- sharedHFileReader = Option.of(newHFileReader());
+ private byte[] getHFileMetaInfoFromCache(String key) throws IOException {
+ if (!PRELOADED_META_INFO_KEYS.contains(key)) {
+ throw new
IllegalStateException("HoodieNativeAvroHFileReader#getHFileMetaInfoFromCache"
+ + " should only be called on supported meta info keys; this key is
not supported: "
+ + key);
+ }
+ byte[] bytes = metaInfoMap.get(key);
Review Comment:
I think it would be better to check if the map exists at all or not. We
might reread a bunch of times if
```
Option<byte[]> metaInfo = reader.getMetaInfo(new UTF8StringKey(metaInfoKey));
```
is ever option.empty
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -244,14 +248,32 @@ private static GenericRecord
getRecordFromKeyValue(KeyValue keyValue,
readerSchema);
}
- private synchronized HFileReader getSharedHFileReader() {
- try {
- if (!sharedHFileReader.isPresent()) {
- sharedHFileReader = Option.of(newHFileReader());
+ private byte[] getHFileMetaInfoFromCache(String key) throws IOException {
+ if (!PRELOADED_META_INFO_KEYS.contains(key)) {
+ throw new
IllegalStateException("HoodieNativeAvroHFileReader#getHFileMetaInfoFromCache"
+ + " should only be called on supported meta info keys; this key is
not supported: "
+ + key);
+ }
+ byte[] bytes = metaInfoMap.get(key);
Review Comment:
Can we do a compute if absent here?
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -244,14 +248,32 @@ private static GenericRecord
getRecordFromKeyValue(KeyValue keyValue,
readerSchema);
}
- private synchronized HFileReader getSharedHFileReader() {
- try {
- if (!sharedHFileReader.isPresent()) {
- sharedHFileReader = Option.of(newHFileReader());
+ private byte[] getHFileMetaInfoFromCache(String key) throws IOException {
Review Comment:
I think we also want to put synchronized on this method:
https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html.
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##########
@@ -244,14 +248,32 @@ private static GenericRecord
getRecordFromKeyValue(KeyValue keyValue,
readerSchema);
}
- private synchronized HFileReader getSharedHFileReader() {
- try {
- if (!sharedHFileReader.isPresent()) {
- sharedHFileReader = Option.of(newHFileReader());
+ private byte[] getHFileMetaInfoFromCache(String key) throws IOException {
+ if (!PRELOADED_META_INFO_KEYS.contains(key)) {
+ throw new
IllegalStateException("HoodieNativeAvroHFileReader#getHFileMetaInfoFromCache"
+ + " should only be called on supported meta info keys; this key is
not supported: "
+ + key);
+ }
+ byte[] bytes = metaInfoMap.get(key);
+ if (bytes != null) {
+ return bytes;
+ }
+ loadAllMetaInfoIntoCache();
+ return metaInfoMap.get(key);
+ }
+
+ private synchronized void loadAllMetaInfoIntoCache() throws IOException {
+ // Load all meta info that are small into cache
Review Comment:
I think there also might need to be a check here to see if the cache is
loaded and to skip reloading
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]