yyanyy commented on a change in pull request #1820:
URL: https://github.com/apache/iceberg/pull/1820#discussion_r552306620



##########
File path: core/src/main/java/org/apache/iceberg/ManifestReader.java
##########
@@ -282,14 +283,18 @@ private static boolean requireStatsProjection(Expression rowFilter, Collection<S
 
   static boolean dropStats(Expression rowFilter, Collection<String> columns) {
     // Make sure we only drop all stats if we had projected all stats
-    // We do not drop stats even if we had partially added some stats columns
-    return rowFilter != Expressions.alwaysTrue() &&
-        columns != null &&
-        !columns.containsAll(ManifestReader.ALL_COLUMNS) &&
-        Sets.intersection(Sets.newHashSet(columns), STATS_COLUMNS).isEmpty();
+    // We do not drop stats even if we had partially added some stats columns, except for record_count column.
+    // Since we don't want to keep stats map which could be huge in size just because we select record_count, which
+    // is a primitive type.
+    if (rowFilter != Expressions.alwaysTrue() && columns != null &&
+        !columns.containsAll(ManifestReader.ALL_COLUMNS)) {
+      Set<String> interaction = Sets.intersection(Sets.newHashSet(columns), STATS_COLUMNS);

Review comment:
       Yes, thanks for catching it!

##########
File path: core/src/main/java/org/apache/iceberg/ManifestReader.java
##########
@@ -136,7 +137,7 @@ public PartitionSpec spec() {
     return spec;
   }
 
-  public ManifestReader<F> select(Collection<String> newColumns) {
+  public ManifestReader<F> select(List<String> newColumns) {

Review comment:
       I think I modified this while trying to address [this
comment](https://github.com/apache/iceberg/pull/1820#discussion_r530610375). I
noticed that every use of `columns` could work with a list, so I changed the
parameter type directly to avoid copying the list, without thinking about the
backward compatibility of this method. I guess your original suggestion was to
move the copy from `ManifestGroup` to `ManifestReader`, and I misinterpreted it
as getting rid of the list copy completely?
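
For illustration, here is a minimal sketch of that reading of the suggestion, assuming the goal is to keep the public `select(Collection<String>)` signature and do the copy inside the reader. `ReaderSketch` and its `columns()` accessor are hypothetical stand-ins, not the actual `ManifestReader` code:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ManifestReader; only the select() parameter type
// mirrors the original signature shown in the diff above.
class ReaderSketch {
  private List<String> columns = null;

  // Keeping the wider Collection parameter means callers that pass a Set
  // (or any other Collection) keep compiling against this method.
  public ReaderSketch select(Collection<String> newColumns) {
    // Copy once here, so the caller does not have to build a List itself.
    this.columns = Collections.unmodifiableList(new ArrayList<>(newColumns));
    return this;
  }

  // Hypothetical accessor, added only so the sketch can be inspected.
  public List<String> columns() {
    return columns;
  }
}
```

With this shape the copy still happens once per `select` call, but the method signature is unchanged for existing callers.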



