Miklos Szurap created HIVE-28124:
------------------------------------

             Summary: Do not allow non-numeric values in Hive table stats 
during an alter table
                 Key: HIVE-28124
                 URL: https://issues.apache.org/jira/browse/HIVE-28124
             Project: Hive
          Issue Type: Bug
          Components: Statistics
    Affects Versions: 3.1.3
            Reporter: Miklos Szurap


Hive table properties are strings by nature; however, some of them have 
special meaning and should hold numeric values, such as "totalSize", 
"numRows" and "rawDataSize". 
During an "ALTER TABLE" statement Hive currently validates only the "numRows" 
and "rawDataSize" table properties; the other table properties can be set to 
non-numeric values (including an empty string).
Certain applications (for example Spark) then fail with quite obscure 
"NumberFormatException" errors when trying to access such broken tables (see 
SPARK-47444).
For example, after the following statement the table can no longer be read 
from Spark:
{code}
0: jdbc:hive2://hs2host> alter table t1p set tblproperties('totalSize'='', 
'STATS_GENERATED_VIA_STATS_TASK'='true');
{code}
In AbstractAlterTablePropertiesAnalyzer.java, besides "numRows" and 
"rawDataSize", we should validate the other stats-related table properties 
too. The ones currently missing are:
numFiles, numPartitions, totalSize, runTimeNumRows, numFilesErasureCoded.
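
A minimal sketch of the kind of check proposed above, assuming every listed 
stats property must parse as a Java long. The class and method names 
(StatsPropertyValidator, invalidStatsKeys) are hypothetical and are not taken 
from the Hive code base; the real fix would live in the analyzer named above 
and may differ in how it reports the error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not Hive's actual implementation. */
final class StatsPropertyValidator {

    // Stats keys named in this report: the two validated today plus the missing ones.
    static final Set<String> NUMERIC_STATS_KEYS = Set.of(
        "numRows", "rawDataSize",
        "numFiles", "numPartitions", "totalSize",
        "runTimeNumRows", "numFilesErasureCoded");

    /** Returns the stats keys in props whose values do not parse as a long. */
    static List<String> invalidStatsKeys(Map<String, String> props) {
        List<String> invalid = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (NUMERIC_STATS_KEYS.contains(e.getKey()) && !isLong(e.getValue())) {
                invalid.add(e.getKey());
            }
        }
        return invalid;
    }

    // Assumption: a value is acceptable if Long.parseLong accepts it.
    // Null and the empty string are rejected.
    private static boolean isLong(String value) {
        if (value == null) {
            return false;
        }
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException ex) {
            return false;
        }
    }
}
```

With the properties from the example statement ('totalSize'='' and 
'STATS_GENERATED_VIA_STATS_TASK'='true'), invalidStatsKeys would report 
"totalSize", so the ALTER TABLE could be rejected up front instead of leaving 
a table that downstream readers cannot parse.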



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
