BELUGA BEHR created HIVE-16868:
----------------------------------

             Summary: Query Hint For Primary Key / Foreign Key Joins
                 Key: HIVE-16868
                 URL: https://issues.apache.org/jira/browse/HIVE-16868
             Project: Hive
          Issue Type: New Feature
          Components: Physical Optimizer
    Affects Versions: 2.1.1, 3.0.0
            Reporter: BELUGA BEHR
            Priority: Minor

{code:title=org.apache.hadoop.hive.ql.stats.StatsUtils.java|borderStyle=solid}
  /**
   * Based on the provided column statistics and number of rows, this method infers if the column
   * can be primary key. It checks if the difference between the min and max value is equal to
   * number of rows specified.
   * @param numRows - number of rows
   * @param colStats - column statistics
   */
  public static void inferAndSetPrimaryKey(long numRows, List<ColStatistics> colStats) {
    if (colStats != null) {
      for (ColStatistics cs : colStats) {
        if (cs != null && cs.getCountDistint() >= numRows) {
          cs.setPrimaryKey(true);
        }
        else if (cs != null && cs.getRange() != null && cs.getRange().minValue != null &&
            cs.getRange().maxValue != null) {
          if (numRows ==
              ((cs.getRange().maxValue.longValue() - cs.getRange().minValue.longValue()) + 1)) {
            cs.setPrimaryKey(true);
          }
        }
      }
    }
  }
{code}

This heuristic is likely to miss many primary-key columns. Users delete rows from their tables over time, which leaves gaps in the key values, and once there are gaps the range check (max - min + 1 == numRows) no longer holds.

{code}
PK Values: 1,2,4
Range = (4 - 1) + 1 = 4
Rows = 3
{code}

Since 4 != 3, the column is not inferred to be a primary key even though every value is unique.

Allow a query hint that the user can use to declare a join as a PK-FK relationship.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
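
For reference, the miss described in this issue can be reproduced outside Hive with a minimal sketch. The class {{PkInferenceSketch}} and the method {{looksLikePrimaryKey}} are hypothetical names, not Hive APIs; they restate the two checks from the quoted {{inferAndSetPrimaryKey}} using plain longs instead of {{ColStatistics}}. The sketch also assumes the distinct-count statistic is an estimate that comes in below the row count (the value 2 is made up), because otherwise the first check would already succeed.

```java
import java.util.Arrays;

// Hypothetical standalone sketch; not part of Hive.
class PkInferenceSketch {

    // Restates the two checks from the quoted inferAndSetPrimaryKey,
    // with plain longs standing in for ColStatistics (an assumption
    // made to keep the example self-contained).
    static boolean looksLikePrimaryKey(long numRows, long countDistinct, long min, long max) {
        // Check 1: at least as many distinct values as rows.
        if (countDistinct >= numRows) {
            return true;
        }
        // Check 2: the values form a dense range, i.e. max - min + 1 == numRows.
        return numRows == (max - min) + 1;
    }

    public static void main(String[] args) {
        // Assumed underestimate of the number of distinct values.
        long estimatedNdv = 2;

        // Key column after the row with key 3 was deleted.
        long[] withGap = {1, 2, 4};
        long gapMin = Arrays.stream(withGap).min().getAsLong();
        long gapMax = Arrays.stream(withGap).max().getAsLong();
        // Range is (4 - 1) + 1 = 4, rows = 3, so the range check fails.
        System.out.println("with gap: "
            + looksLikePrimaryKey(withGap.length, estimatedNdv, gapMin, gapMax));

        // Same column before the delete: range is (3 - 1) + 1 = 3, rows = 3.
        long[] dense = {1, 2, 3};
        long denseMin = Arrays.stream(dense).min().getAsLong();
        long denseMax = Arrays.stream(dense).max().getAsLong();
        System.out.println("dense:    "
            + looksLikePrimaryKey(dense.length, estimatedNdv, denseMin, denseMax));
    }
}
```

Under those assumptions the column with a gap is rejected and the dense one is accepted, although both hold only unique values. That is the case a user-supplied PK-FK hint would cover.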