GitHub user wzhfy commented on the issue:
https://github.com/apache/spark/pull/19831
@cloud-fan Yes, Spark doesn't allow users to set (Spark's) statistics
manually.
This PR handles a row count of 0 in **Hive's stats**; it doesn't affect the logic
for Spark's stats. Besides, Spark currently only uses Hive's `totalSize` and
`rawDataSize` when they are > 0. This PR changes the behavior for `rowCount` to
be consistent with that, so I think it's fine. But the title of the PR should
be more specific, i.e. it deals with incorrect Hive statistics (zero rowCount).
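
To make the rule concrete, here is a minimal Python sketch of the "only trust values > 0" behavior. It is an illustration, not Spark's actual code: the `Stats` type and the `positive` and `stats_from_hive` helpers are hypothetical names, the table parameters are assumed to arrive as a string-to-string map, `numRows` is assumed to be the Hive property behind `rowCount`, and preferring `totalSize` over `rawDataSize` is likewise an assumption.

```python
from typing import Mapping, NamedTuple, Optional


class Stats(NamedTuple):
    """Hypothetical container for the statistics taken over from Hive."""
    size_in_bytes: int
    row_count: Optional[int]


def positive(params: Mapping[str, str], key: str) -> Optional[int]:
    """Return params[key] as an int only if it is present, parses, and is > 0."""
    raw = params.get(key)
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if value > 0 else None


def stats_from_hive(params: Mapping[str, str]) -> Optional[Stats]:
    """Build stats from Hive table parameters, ignoring non-positive values."""
    total_size = positive(params, "totalSize")
    raw_data_size = positive(params, "rawDataSize")
    # Assumed Hive property name for the row count; a value of 0 is treated as unknown.
    row_count = positive(params, "numRows")

    # Assumed preference order: totalSize first, then rawDataSize.
    size = total_size if total_size is not None else raw_data_size
    if size is None:
        return None  # no usable size, so no stats at all
    return Stats(size_in_bytes=size, row_count=row_count)
```

With this sketch, a table reporting `totalSize=1024` and `numRows=0` keeps its size but has no row count, instead of being treated as empty.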