Gabor Kaszab created HIVE-19830:
-----------------------------------

             Summary: Inconsistent behavior when multiple partitions point to the same location
                 Key: HIVE-19830
                 URL: https://issues.apache.org/jira/browse/HIVE-19830
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.4.0
            Reporter: Gabor Kaszab
            Assignee: Adam Szita

// Create a table with 2 partitions where both partitions share the same location, and insert a single row into one of them.
create table test (i int) partitioned by (j int) stored as parquet;
alter table test add partition (j=1) location 'hdfs://localhost:20500/test-warehouse/test/j=1';
alter table test add partition (j=2) location 'hdfs://localhost:20500/test-warehouse/test/j=1';
insert into table test partition (j=1) values (1);

// select * shows this single row in both partitions, as expected.
select * from test;
1 1
1 2

// However, sum() doesn't add up the row for all the partitions. This is +Issue #1+.
select sum(i), sum(j) from test;
1 2

// On the file system there is a common dir for the 2 partitions, which is expected.
hdfs dfs -ls hdfs://localhost:20500/test-warehouse/test/
Found 1 items
drwxr-xr-x - gaborkaszab supergroup 0 2018-06-08 10:54 hdfs://localhost:20500/test-warehouse/test/j=1

// Let's drop one of the partitions now!
alter table test drop partition (j=2);

// Running the same hdfs dfs -ls command shows that the j=1 directory is dropped. I think this is good behavior; we just have to document that it is the expected case.
// select * from test; returns zero rows, which is still as expected.
// Even though the dir is dropped, the j=1 partition is still visible with show partitions. This is +Issue #2+.
show partitions test;
j=1

After the directory is dropped through Hive, Impala reloads its partitions by asking Hive which partitions exist. Apparently, Hive sends down a list that still includes the j=1 partition, so Impala treats it as an existing partition and doesn't drop it from the Catalog's cache. Hive shouldn't send that partition down here. This is +Issue #3+.
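
To make the expected numbers explicit, here is a small toy model of the scenario above. It is only a sketch: the `partitions` and `files` dictionaries and the `scan` and `drop_partition` helpers are hypothetical names made up for illustration and do not reflect how Hive or the metastore is implemented. It assumes that an aggregate should see the same rows that `select *` returns, and that dropping a partition deletes its directory, as observed in the report.

```python
# Toy model only: these structures are hypothetical and are not Hive internals.
LOCATION = "hdfs://localhost:20500/test-warehouse/test/j=1"

# Partition value of j -> storage location (both partitions share one directory).
partitions = {1: LOCATION, 2: LOCATION}

# Storage location -> values of column i stored there (the single inserted row).
files = {LOCATION: [1]}


def scan(partitions, files):
    """Return (i, j) rows by reading every partition's location once per partition."""
    return [(i, j) for j, loc in sorted(partitions.items()) for i in files.get(loc, [])]


def drop_partition(partitions, files, j):
    """Remove partition j from the metadata and delete the directory it points to."""
    loc = partitions.pop(j)
    files.pop(loc, None)


# Issue #1: the sums implied by the rows that select * returns.
rows = scan(partitions, files)
expected_sums = (sum(i for i, _ in rows), sum(j for _, j in rows))
observed_sums = (1, 2)  # sum(i), sum(j) as reported in this issue

print(rows)                            # rows visible through select *
print(expected_sums, observed_sums)    # the two pairs differ

# Issue #2: after dropping j=2 the shared directory is gone, but j=1 is still listed.
drop_partition(partitions, files, 2)
rows_after_drop = scan(partitions, files)
listed_partitions = sorted(partitions)

print(rows_after_drop)      # no rows left to read
print(listed_partitions)    # j=1 still appears, as with show partitions
```

In this model the rows returned by the scan imply sum(i) = 2 and sum(j) = 3, whereas the query in the report returned 1 and 2. After the drop, the model still lists j=1 even though its directory no longer exists, which mirrors the show partitions output above.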