kasakrisz commented on code in PR #4000:
URL: https://github.com/apache/hive/pull/4000#discussion_r1092068769
##########
ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java:
##########
@@ -277,23 +277,31 @@ private static Statistics collectStatistics(HiveConf
conf, PrunedPartitionList p
boolean metaTable = table.getMetaTable() != null;
if (!table.isPartitioned()) {
+ long ds, nr, fs;
+ if (table.isNonNative() &&
table.getStorageHandler().canProvideBasicStatistics()) {
+ Map<String, String> icebergBasicStatMap =
table.getStorageHandler().getBasicStatistics(Partish.buildFor(table));
+ ds = Long.parseLong(icebergBasicStatMap.get("totalSize"));
+ nr = Long.parseLong(icebergBasicStatMap.get("numRows"));
+ fs = Long.parseLong(icebergBasicStatMap.get("numFiles"));
Review Comment:
1. It seems to me that it check if the currentSnapshot exists or not but
what if summary does not contain for example
`SnapshotSummary.TOTAL_DATA_FILES_PROP`.
2. The line
```
Long.parseLong(icebergBasicStatMap.get("totalSize"))
```
can throw `NumberFormatException` because
* `icebergBasicStatMap.get("totalSize")` returns null because the map does
have an entry with `totalSize`
* or it has such entry but the value is not a valid `Long` value.
3. I think the name `icebergBasicStatMap` is too specific here because at
this point we should not expect an `IcebergStorageHandler` but any
`HiveStorageHandler` which supports providing BasicStatistics. So a name like
`basicStatMap` would be more generic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]