[
https://issues.apache.org/jira/browse/HIVE-22088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907865#comment-16907865
]
Hui An commented on HIVE-22088:
-------------------------------
The issue is caused by the method Warehouse.makeSpecFromName in Warehouse.java (branch-2.3):
{code:java}
// Warehouse.java
public static void makeSpecFromName(Map<String, String> partSpec, Path currPath) {
  List<String[]> kvs = new ArrayList<String[]>();
  do {
    String component = currPath.getName();
    Matcher m = pat.matcher(component);
    if (m.matches()) {
      String k = unescapePathName(m.group(1));
      String v = unescapePathName(m.group(2));
      String[] kv = new String[2];
      kv[0] = k;
      kv[1] = v;
      kvs.add(kv);
    }
    currPath = currPath.getParent();
  } while (currPath != null && !currPath.getName().isEmpty());
  // reverse the list since we checked the part from leaf dir to table's base dir
  for (int i = kvs.size(); i > 0; i--) {
    partSpec.put(kvs.get(i - 1)[0], kvs.get(i - 1)[1]);
  }
}
{code}
This method builds the partition specification by matching currPath and each of its parents against the pattern _pat_ ("([^/]+)=([^/]+)") until it reaches the root, so it also adds "customer=abc" to partSpec even though that component is part of the table location, not a partition. Afterwards, the method loadPartition in Hive.java uses this spec to fetch oldPart, and the error is thrown when validatePartColumnNames checks whether each column in the spec is a valid partition column.
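To make the leak concrete, here is a small standalone sketch (not Hive code; the class and method names are illustrative) that applies the same "key=value" regex to every component of the dynamic-partition target path from the report. Because the walk does not stop at the table's base directory, the location component "customer=abc" ends up in the spec alongside the real partition "type=type1":

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of how makeSpecFromName collects every "key=value"
// path component, including ones that belong to the table LOCATION itself.
public class MakeSpecDemo {
  private static final Pattern PAT = Pattern.compile("([^/]+)=([^/]+)");

  static Map<String, String> specFromPath(String path) {
    Map<String, String> spec = new LinkedHashMap<>();
    // Walk root-to-leaf for simplicity; Hive walks leaf-to-root and
    // reverses the result, which yields the same map.
    for (String component : path.split("/")) {
      Matcher m = PAT.matcher(component);
      if (m.matches()) {
        spec.put(m.group(1), m.group(2));
      }
    }
    return spec;
  }

  public static void main(String[] args) {
    // Target dir produced for the reported table layout:
    Map<String, String> spec =
        specFromPath("/home/destination/customer=abc/type=type1");
    // "customer=abc" is part of the table location, not a partition of
    // temp_dummy_dest_table, yet it appears in the spec:
    System.out.println(spec); // {customer=abc, type=type1}
  }
}
```

This is exactly the spec \{type=type1, customer=abc} reported in the MoveTask failure, which validatePartColumnNames then rejects because customer is not a partition column of the table.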
> Dynamic partition insert problem on external table with "=" in location path spec
> ---------------------------------------------------------------------------------
>
> Key: HIVE-22088
> URL: https://issues.apache.org/jira/browse/HIVE-22088
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.4
> Environment: Hive 2.6.0.10-2 Executing on Tez.
> OS: Ubuntu 16.04.4 LTS
> Config settings used:
> SET hive.exec.dynamic.partition=true;
> SET hive.exec.dynamic.partition.mode=nonstrict;
> Reporter: Puneet Khatod
> Assignee: Hui An
> Priority: Major
>
> If an external table's location spec contains an '=' sign (which
> coincidentally looks like a partition specifier), dynamic partition loading
> fails.
> *Use cases:*
> Quite often the same data is used in different contexts by creating
> different external tables on top of it, and those tables often have
> different partition depths depending on how the data is organized.
> As in the example below, there are individual customer-specific tables with
> queries/jobs that insert data partitioned by type, and another table that
> gives a consolidated view of all customers and therefore has a two-level
> partition (customer and type).
> The job that inserts customer-specific data into the customer-specific
> table fails when dynamic partitioning is used. A static partition insert on
> the same table works fine, though.
> *Replication:*
> The following simple setup reproduces the issue. The statements below were
> executed on Tez.
> *Source table-*
> CREATE EXTERNAL TABLE temp_dummy_table
> (id STRING, type STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
> STORED AS TEXTFILE
> LOCATION '/home/source/';
>
> *Destination Table-*
> CREATE EXTERNAL TABLE temp_dummy_dest_table
> (id STRING)
> PARTITIONED BY (type string)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
> STORED AS TEXTFILE
> LOCATION '/home/destination/{color:#ff0000}customer=abc{color}/';
>
> *Insert into destination-*
> insert overwrite table temp_dummy_dest_table partition (type)
> select i.id as id, i.type as type
> from temp_dummy_table i
> where i.type in ('type1','type2');
>
> *Log and error messages on CLI-*
> Loading data to table temp_dummy_dest_table partition (type=null)
> Failed with exception Partition spec \{type=type1, customer=abc} contains
> non-partition columns
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask
>
> *Possible resolution:*
> Dynamic partitioning should only consider partition specs that lie under
> the table's defined root/base path. If the path itself contains a
> partition-style component (customer=abc in the example above), it should
> not be treated as a partition, since it is outside the scope of the table.
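The resolution suggested above can be sketched as a bounded variant of the spec-building walk. This is a hypothetical illustration, not the committed Hive fix: it assumes the table's base location is available at the call site and only matches path components below it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical bounded variant: only components strictly below the table's
// base location are considered partition candidates.
public class BoundedSpecDemo {
  private static final Pattern PAT = Pattern.compile("([^/]+)=([^/]+)");

  static Map<String, String> specUnderBase(String fullPath, String tableBase) {
    Map<String, String> spec = new LinkedHashMap<>();
    if (!fullPath.startsWith(tableBase)) {
      return spec; // path is not under the table at all
    }
    // Only the path components below the table's base dir are matched, so a
    // "key=value" component inside the table LOCATION is never picked up.
    String relative = fullPath.substring(tableBase.length());
    for (String component : relative.split("/")) {
      Matcher m = PAT.matcher(component);
      if (m.matches()) {
        spec.put(m.group(1), m.group(2));
      }
    }
    return spec;
  }

  public static void main(String[] args) {
    Map<String, String> spec = specUnderBase(
        "/home/destination/customer=abc/type=type1",
        "/home/destination/customer=abc");
    // "customer=abc" sits inside the table location, so it is excluded:
    System.out.println(spec); // {type=type1}
  }
}
```

With this bound in place, the spec for the reported insert would contain only \{type=type1}, and validatePartColumnNames would pass.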
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)