[
https://issues.apache.org/jira/browse/HIVE-23390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mithun Antony updated HIVE-23390:
---------------------------------
Description:
When the `analyze <table>` command was executed from Presto to update the
stats of a table for the first time from multiple clusters sharing the same
Hive metastore, a duplicate entry for the same table was inserted into the
`TAB_COL_STATS` table.
This led to failures when executing further `analyze <table>` commands.
{code:java}
Query failed: Multiple entries with same key:
dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty,
decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
distinctValuesCount=OptionalLong[1]} and
dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty,
decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
distinctValuesCount=OptionalLong[1]}.
{code}
Duplicate records in the `TAB_COL_STATS` table:
{code:java}
'7','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'
'11','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'{code}
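One way to confirm the condition described above is to group the statistics rows by database, table, and column and look for groups with more than one row. The following is a minimal sketch that runs against an in-memory SQLite stand-in rather than a real metastore database. The column names (`CS_ID`, `DB_NAME`, `TABLE_NAME`, `COLUMN_NAME`, `COLUMN_TYPE`, `TBL_ID`, `LAST_ANALYZED`) and the reduced column set are assumptions for illustration and should be checked against the actual metastore schema; the helper names `find_duplicate_stats` and `demo_connection` are hypothetical.

```python
import sqlite3

# Simplified stand-in for the metastore's TAB_COL_STATS table.
# ASSUMPTION: column names and the reduced column set are illustrative only;
# verify them against the real metastore schema before reusing the query.
SCHEMA = """
CREATE TABLE TAB_COL_STATS (
    CS_ID         INTEGER PRIMARY KEY,
    DB_NAME       TEXT,
    TABLE_NAME    TEXT,
    COLUMN_NAME   TEXT,
    COLUMN_TYPE   TEXT,
    TBL_ID        INTEGER,
    LAST_ANALYZED INTEGER
)
"""

# A column should have at most one stats row per (database, table, column).
# Any group with more than one row is a duplicate like the one reported here.
FIND_DUPLICATES = """
SELECT DB_NAME, TABLE_NAME, COLUMN_NAME,
       COUNT(*)   AS copies,
       MIN(CS_ID) AS first_cs_id,
       MAX(CS_ID) AS last_cs_id
FROM TAB_COL_STATS
GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
HAVING COUNT(*) > 1
ORDER BY DB_NAME, TABLE_NAME, COLUMN_NAME
"""


def find_duplicate_stats(conn):
    """Return one tuple per (db, table, column) that has more than one stats row."""
    return conn.execute(FIND_DUPLICATES).fetchall()


def demo_connection():
    """Build an in-memory database holding the two rows from this report."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        (7, "default", "dual", "dummy", "smallint", 245671, 1588345509),
        (11, "default", "dual", "dummy", "smallint", 245671, 1588345509),
    ]
    conn.executemany(
        "INSERT INTO TAB_COL_STATS VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


if __name__ == "__main__":
    for row in find_duplicate_stats(demo_connection()):
        print(row)
```

With the two rows from this report, the query returns a single group for `default.dual.dummy` with two copies (`CS_ID` 7 and 11). The same `GROUP BY ... HAVING COUNT(*) > 1` pattern should carry over to the metastore's backing database, subject to the schema assumptions noted above.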
was:
When the `analyze <table>` command was executed from Presto to update the
stats of a table for the first time from multiple clusters sharing the same
Hive metastore, a duplicate entry for the same table was inserted into the
`TAB_COL_STATS` table.
This led to failures when executing further `analyze <table>` commands.
Query failed: Multiple entries with same key:
dummy=HiveColumnStatistics\{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty,
decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
distinctValuesCount=OptionalLong[1]} and
dummy=HiveColumnStatistics\{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
max=OptionalLong[1]}], doubleStatistics=Optional.empty,
decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
distinctValuesCount=OptionalLong[1]}.
Duplicate records in the `TAB_COL_STATS` table:
'7','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509',
'11','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'.
> Duplicate entry for a table in TAB_COL_STATS
> ---------------------------------------------
>
> Key: HIVE-23390
> URL: https://issues.apache.org/jira/browse/HIVE-23390
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 2.3.4
> Reporter: Mithun Antony
> Priority: Major
>
> When the `analyze <table>` command was executed from Presto to update the
> stats of a table for the first time from multiple clusters sharing the same
> Hive metastore, a duplicate entry for the same table was inserted into the
> `TAB_COL_STATS` table.
> This led to failures when executing further `analyze <table>` commands.
> {code:java}
> Query failed: Multiple entries with same key:
> dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
> max=OptionalLong[1]}], doubleStatistics=Optional.empty,
> decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
> booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
> totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
> distinctValuesCount=OptionalLong[1]} and
> dummy=HiveColumnStatistics{integerStatistics=Optional[IntegerStatistics{min=OptionalLong[1],
> max=OptionalLong[1]}], doubleStatistics=Optional.empty,
> decimalStatistics=Optional.empty, dateStatistics=Optional.empty,
> booleanStatistics=Optional.empty, maxValueSizeInBytes=OptionalLong.empty,
> totalSizeInBytes=OptionalLong.empty, nullsCount=OptionalLong[0],
> distinctValuesCount=OptionalLong[1]}.
> {code}
> Duplicate records in the `TAB_COL_STATS` table:
> {code:java}
> '7','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'
> '11','default','dual','dummy','smallint','245671','1','1',NULL,NULL,NULL,NULL,'0','1',NULL,NULL,NULL,NULL,'1588345509'{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)