Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/24153 )
Change subject: IMPALA-14597: Initial HBO support ...................................................................... Patch Set 3: (4 comments) Tried to get the big picture, only went to details about the storage part. http://gerrit.cloudera.org:8080/#/c/24153/3/common/thrift/HBO.thrift File common/thrift/HBO.thrift: http://gerrit.cloudera.org:8080/#/c/24153/3/common/thrift/HBO.thrift@34 PS3, Line 34: TPlanNodeRun If this is cached in deserialized format, then it seems better to me to use required fields with some default value to avoid the extra "set" members. Also, if this is not written to files, why not use simple java objects? http://gerrit.cloudera.org:8080/#/c/24153/3/fe/src/main/java/org/apache/impala/service/HistoricalStats.java File fe/src/main/java/org/apache/impala/service/HistoricalStats.java: http://gerrit.cloudera.org:8080/#/c/24153/3/fe/src/main/java/org/apache/impala/service/HistoricalStats.java@124 PS3, Line 124: // TODO: handle races from concurrent writers. This doesn't look hard using asMap().compute() on the map. As the serialization is fast, we don't need to worry much about blocking. http://gerrit.cloudera.org:8080/#/c/24153/3/fe/src/main/java/org/apache/impala/service/InMemoryCacheBackend.java File fe/src/main/java/org/apache/impala/service/InMemoryCacheBackend.java: http://gerrit.cloudera.org:8080/#/c/24153/3/fe/src/main/java/org/apache/impala/service/InMemoryCacheBackend.java@73 PS3, Line 73: weighHistoricalStatsValue It seems simpler to store the stats in serialized format, which would make weighing trivial. Or that would add significant overhead? http://gerrit.cloudera.org:8080/#/c/24153/3/fe/src/main/java/org/apache/impala/service/InMemoryCacheBackend.java@97 PS3, Line 97: this(4, 1024L * 1024 * 1024); This default seems too large to me. The danger I see is that the JVM limit is not large enough, then this will very slowly fill up and lead to hard to diagnose issues. My preference would be to have a unified size limited cache in the coordinator (it doesn't seem hard to move the cache implementation itself out of CatalogMetaProvider). Another way would be to use soft references here. -- To view, visit http://gerrit.cloudera.org:8080/24153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6ff60a8bd22c13c0ecad1198934cc96249b1015e Gerrit-Change-Number: 24153 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 31 Mar 2026 15:05:45 +0000 Gerrit-HasComments: Yes
