[ 
https://issues.apache.org/jira/browse/IMPALA-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849072#comment-17849072
 ] 

ASF subversion and git services commented on IMPALA-13102:
----------------------------------------------------------

Commit e35f8183cb1ba069ae00ee93e71451eccd505d0a in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e35f8183c ]

IMPALA-13102: Normalize invalid column stats from HMS

Column stats like numDVs, numNulls in HMS could have arbitrary values.
Impala expects them to be non-negative or -1 for unknown. So loading
tables with invalid stats values (<-1) will fail.

This patch adds logic to normalize the stats values. If a value is < -1,
it is replaced with -1 and a corresponding warning is logged. Also refactors
some redundant code in ColumnStats.
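
For illustration only, the normalization rule described above could be
sketched roughly as follows. This is not the code from the commit: the class
StatsNormalizer, the method normalize, and the use of java.util.logging are
all assumptions made for this example, and the real change lives in
ColumnStats.

```java
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not the actual Impala ColumnStats code. */
public class StatsNormalizer {
  private static final Logger LOG =
      Logger.getLogger(StatsNormalizer.class.getName());

  /** Sentinel that means "unknown", per the commit message. */
  static final long UNKNOWN = -1;

  /**
   * Returns the value unchanged if it is non-negative or already -1.
   * Any value below -1 is treated as invalid: a warning is logged and
   * -1 (unknown) is returned instead.
   */
  static long normalize(String statName, long value) {
    if (value < UNKNOWN) {
      LOG.warning(String.format(
          "Invalid %s value %d from HMS; using %d (unknown) instead",
          statName, value, UNKNOWN));
      return UNKNOWN;
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(normalize("numDVs", -100)); // invalid, normalized to -1
    System.out.println(normalize("numNulls", -1)); // already "unknown", kept
    System.out.println(normalize("numNulls", 0));  // valid, kept
    System.out.println(normalize("numDVs", 42));   // valid, kept
  }
}
```

With a rule like this applied before validation, a column with numDVs=-100
would be loaded with an unknown distinct-value count instead of failing the
table load.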

Tests:
 - Add e2e test

Change-Id: If6216e3d6e73a529a9b3a8c0ea9d22727ab43f1a
Reviewed-on: http://gerrit.cloudera.org:8080/21445
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Loading tables with illegal stats failed
> ----------------------------------------
>
>                 Key: IMPALA-13102
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13102
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> When the table has illegal stats, e.g. numDVs=-100, Impala can't load the 
> table, so DROP STATS or DROP TABLE can't be performed on it.
> {code:sql}
> [localhost:21050] default> drop stats alltypes_bak;
> Query: drop stats alltypes_bak
> ERROR: AnalysisException: Failed to load metadata for table: 'alltypes_bak'
> CAUSED BY: TableLoadingException: Failed to load metadata for table: default.alltypes_bak
> CAUSED BY: IllegalStateException: ColumnStats{avgSize_=4.0, avgSerializedSize_=4.0, maxSize_=4, numDistinct_=-100, numNulls_=0, numTrues=-1, numFalses=-1, lowValue=-1, highValue=-1}{code}
> We should at least allow dropping the stats or dropping the table, so that 
> users can use Impala to recover the stats.
> Stacktrace in the logs:
> {noformat}
> I0520 08:00:56.661746 17543 jni-util.cc:321] 5343142d1173494f:44dcde8c00000000] org.apache.impala.common.AnalysisException: Failed to load metadata for table: 'alltypes_bak'
>         at org.apache.impala.analysis.Analyzer.resolveTableRef(Analyzer.java:974)
>         at org.apache.impala.analysis.DropStatsStmt.analyze(DropStatsStmt.java:94)
>         at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:551)
>         at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:498)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2542)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2224)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1985)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:175)
> Caused by: org.apache.impala.catalog.TableLoadingException: Failed to load metadata for table: default.alltypes_bak
> CAUSED BY: IllegalStateException: ColumnStats{avgSize_=4.0, avgSerializedSize_=4.0, maxSize_=4, numDistinct_=-100, numNulls_=0, numTrues=-1, numFalses=-1, lowValue=-1, highValue=-1}
>         at org.apache.impala.catalog.IncompleteTable.loadFromThrift(IncompleteTable.java:162)
>         at org.apache.impala.catalog.Table.fromThrift(Table.java:586)
>         at org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:479)
>         at org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:334)
>         at org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:262)
>         at org.apache.impala.service.FeCatalogManager$CatalogdImpl.updateCatalogCache(FeCatalogManager.java:114)
>         at org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:585)
>         at org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:196)
>         at ========.<Remote stack trace on catalogd>: org.apache.impala.catalog.TableLoadingException: Failed to load metadata for table: default.alltypes_bak
>         at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1318)
>         at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1213)
>         at org.apache.impala.catalog.TableLoader.load(TableLoader.java:145)
>         at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:251)
>         at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:247)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.IllegalStateException: ColumnStats{avgSize_=4.0, avgSerializedSize_=4.0, maxSize_=4, numDistinct_=-100, numNulls_=0, numTrues=-1, numFalses=-1, lowValue=-1, highValue=-1}
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:512)
>         at org.apache.impala.catalog.ColumnStats.validate(ColumnStats.java:1034)
>         at org.apache.impala.catalog.ColumnStats.update(ColumnStats.java:676)
>         at org.apache.impala.catalog.Column.updateStats(Column.java:73)
>         at org.apache.impala.catalog.FeCatalogUtils.injectColumnStats(FeCatalogUtils.java:183)
>         at org.apache.impala.catalog.Table.loadAllColumnStats(Table.java:513)
>         at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1269)
>         ... 8 more{noformat}
> CC [~VenuReddy] [~hemanth619] [~ngangam]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
