Jibing-Li commented on code in PR #38990:
URL: https://github.com/apache/doris/pull/38990#discussion_r1706485695
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java:
##########
@@ -187,22 +187,17 @@ public BaseAnalysisTask createAnalysisTask(AnalysisInfo info) {
     @Override
     public long fetchRowCount() {
         makeSureInitialized();
-        try {
-            long rowCount = 0;
-            Optional<SchemaCacheValue> schemaCacheValue = getSchemaCacheValue();
-            Table paimonTable = schemaCacheValue.map(value -> ((PaimonSchemaCacheValue) value).getPaimonTable())
-                    .orElse(null);
-            if (paimonTable == null) {
-                return -1;
-            }
-            List<Split> splits = paimonTable.newReadBuilder().newScan().plan().splits();
-            for (Split split : splits) {
-                rowCount += split.rowCount();
-            }
-            return rowCount;
-        } catch (Exception e) {
-            LOG.warn("Fail to collect row count for db {} table {}", dbName, name, e);
+        long rowCount = 0;
+        Optional<SchemaCacheValue> schemaCacheValue = getSchemaCacheValue();
+        Table paimonTable = schemaCacheValue.map(value -> ((PaimonSchemaCacheValue) value).getPaimonTable())
+                .orElse(null);
+        if (paimonTable == null) {
+            return -1;
+        }
+        List<Split> splits = paimonTable.newReadBuilder().newScan().plan().splits();
Review Comment:
Planning a full scan just to sum the split row counts could be expensive when
the table is large. But that's not related to this PR; we can improve it in a
separate PR if needed.
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergUtils.java:
##########
@@ -592,22 +592,17 @@ public static List<Column> getSchema(ExternalCatalog catalog, String dbName, Str
      * @return estimated row count
      */
     public static long getIcebergRowCount(ExternalCatalog catalog, String dbName, String tbName) {
-        try {
-            Table icebergTable = Env.getCurrentEnv()
-                    .getExtMetaCacheMgr()
-                    .getIcebergMetadataCache()
-                    .getIcebergTable(catalog, dbName, tbName);
-            Snapshot snapshot = icebergTable.currentSnapshot();
-            if (snapshot == null) {
-                // empty table
-                return 0;
-            }
-            Map<String, String> summary = snapshot.summary();
-            return Long.parseLong(summary.get(TOTAL_RECORDS)) - Long.parseLong(summary.get(TOTAL_POSITION_DELETES));
-        } catch (Exception e) {
-            LOG.warn("Fail to collect row count for db {} table {}", dbName, tbName, e);
+        Table icebergTable = Env.getCurrentEnv()
+                .getExtMetaCacheMgr()
+                .getIcebergMetadataCache()
+                .getIcebergTable(catalog, dbName, tbName);
+        Snapshot snapshot = icebergTable.currentSnapshot();
Review Comment:
I checked the code: the table may be null when the Iceberg metadata cache has
not loaded it yet. I don't think that's a problem, because the NPE is caught in
the caller, which returns the default value -1. The call also triggers the
Iceberg metadata cache to load the table, so we can get it next time. We can fix
this in a separate PR if needed, but I don't think we need to do anything about
it right now.
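To illustrate the behavior described above, here is a minimal, self-contained
sketch. It is not the Doris code: `CachedTable`, `fetchRowCount`, and
`rowCountOrDefault` are hypothetical names made up for this example, and a null
argument stands in for a table the metadata cache has not loaded yet.

```java
import java.util.function.LongSupplier;

class RowCountFallbackSketch {
    // Sentinel for "row count unknown", matching the -1 mentioned in the comment.
    static final long UNKNOWN_ROW_COUNT = -1;

    // Hypothetical stand-in for a cached table handle.
    static final class CachedTable {
        final long totalRecords;

        CachedTable(long totalRecords) {
            this.totalRecords = totalRecords;
        }
    }

    // Same shape as the method after this change: no local try/catch,
    // so a null table surfaces as a NullPointerException.
    static long fetchRowCount(CachedTable table) {
        return table.totalRecords;
    }

    // Hypothetical caller: catches any exception and falls back to the default.
    static long rowCountOrDefault(LongSupplier fetcher) {
        try {
            return fetcher.getAsLong();
        } catch (Exception e) {
            return UNKNOWN_ROW_COUNT;
        }
    }

    public static void main(String[] args) {
        // Table not loaded yet: the NPE is swallowed by the caller.
        System.out.println(rowCountOrDefault(() -> fetchRowCount(null)));
        // Table available: the real count is returned.
        System.out.println(rowCountOrDefault(() -> fetchRowCount(new CachedTable(42))));
    }
}
```

With a single catch in the caller, the fetch methods can stay free of their own
try/catch blocks, which is what the two hunks above do.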
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.