[ 
https://issues.apache.org/jira/browse/HIVE-24928?focusedWorklogId=577489&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-577489
 ]

ASF GitHub Bot logged work on HIVE-24928:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Apr/21 11:20
            Start Date: 06/Apr/21 11:20
    Worklog Time Spent: 10m 
      Work Description: lcspinter commented on a change in pull request #2111:
URL: https://github.com/apache/hive/pull/2111#discussion_r607761504



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java
##########
@@ -119,16 +135,92 @@ public String getName() {
     return "STATS-NO-JOB";
   }
 
-  static class StatItem {
-    Partish partish;
-    Map<String, String> params;
-    Object result;
+  abstract static class StatCollector implements Runnable {
+
+    protected Partish partish;
+    protected Object result;
+    protected LogHelper console;
+
+    public static Function<StatCollector, String> SIMPLE_NAME_FUNCTION =
+        sc -> String.format("%s#%s", sc.partish().getTable().getCompleteName(), sc.partish().getPartishType());
+
+    public static Function<StatCollector, Partition> EXTRACT_RESULT_FUNCTION = input -> (Partition) input.result();
+
+    abstract Partish partish();
+    abstract boolean isValid();
+    abstract Object result();
+    abstract void init(HiveConf conf, LogHelper console) throws IOException;
+
+    protected String toString(Map<String, String> parameters) {
+      StringBuilder builder = new StringBuilder();

Review comment:
       Again, legacy code :), but I changed it. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 577489)
    Time Spent: 4.5h  (was: 4h 20m)

> In case of non-native tables use basic statistics from HiveStorageHandler
> -------------------------------------------------------------------------
>
>                 Key: HIVE-24928
>                 URL: https://issues.apache.org/jira/browse/HIVE-24928
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: László Pintér
>            Assignee: László Pintér
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When we are running `ANALYZE TABLE ... COMPUTE STATISTICS` or `ANALYZE TABLE 
> ... COMPUTE STATISTICS FOR COLUMNS` all the basic statistics are collected by 
> the BasicStatsTask class. This class tries to estimate the statistics by 
> scanning the directory of the table. 
> In the case of non-native tables (Iceberg, HBase), the table directory might 
> contain metadata files as well, which would be counted by the BasicStatsTask 
> when calculating basic stats. 
> Instead of having this logic, the HiveStorageHandler implementation should 
> provide basic statistics.
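
The problem described above can be illustrated with a minimal, self-contained sketch. It is not Hive code: `StatsSource`, `scanDirectory`, `basicStats`, and `createDemoTable` are hypothetical names invented for this illustration, the map keys are illustrative, and the file layout (one data file plus one metadata file) is an assumption. It only shows why a directory scan overcounts when metadata files sit next to the data, and how preferring handler-provided numbers avoids that.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BasicStatsSketch {

  /** Hypothetical stand-in for a storage handler that reports its own stats. */
  interface StatsSource {
    Map<String, Long> basicStats();
  }

  /** Directory-scan estimate: counts every regular file, data or metadata. */
  static Map<String, Long> scanDirectory(Path tableDir) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(tableDir)) {
      files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
    }
    long totalSize = 0;
    for (Path file : files) {
      totalSize += Files.size(file);
    }
    return Map.of("numFiles", (long) files.size(), "totalSize", totalSize);
  }

  /** Prefer the source-provided stats; fall back to scanning when there is no source. */
  static Map<String, Long> basicStats(Path tableDir, StatsSource source) throws IOException {
    return source != null ? source.basicStats() : scanDirectory(tableDir);
  }

  /** Builds a throwaway table directory: a 100-byte data file and a 40-byte metadata file. */
  static Path createDemoTable() throws IOException {
    Path dir = Files.createTempDirectory("demo_table");
    Files.createDirectories(dir.resolve("data"));
    Files.createDirectories(dir.resolve("metadata"));
    Files.write(dir.resolve("data").resolve("part-0.bin"), new byte[100]);
    Files.write(dir.resolve("metadata").resolve("v1.json"), new byte[40]);
    return dir;
  }

  public static void main(String[] args) throws IOException {
    Path dir = createDemoTable();
    // Assumption: the handler-like source only knows about the single data file.
    StatsSource source = () -> Map.of("numFiles", 1L, "totalSize", 100L);

    Map<String, Long> scanned = basicStats(dir, null);
    Map<String, Long> provided = basicStats(dir, source);

    System.out.println("scan:    numFiles=" + scanned.get("numFiles")
        + " totalSize=" + scanned.get("totalSize"));
    System.out.println("handler: numFiles=" + provided.get("numFiles")
        + " totalSize=" + provided.get("totalSize"));
  }
}
```

In this sketch the scan counts the metadata file as if it were table data, while the source-provided numbers cover only the data file. The actual change in the pull request may structure this differently.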



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
