Andrew Wong created HUDI-628:
--------------------------------
Summary: MultiPartKeysValueExtractor does not work with
HoodieHiveClient.getPartitionClause
Key: HUDI-628
URL: https://issues.apache.org/jira/browse/HUDI-628
Project: Apache Hudi (incubating)
Issue Type: Bug
Reporter: Andrew Wong
The example data in the quick-start guide
([https://hudi.apache.org/docs/quick-start-guide.html]) has a column
`partitionpath` which holds values like `americas/brazil/sao_paulo`.
Using the docker environment, you can change the basePath from the quickstart
to save to hdfs://user/hive/warehouse/hudi_trips_cow. Then you can see the
folder in the HDFS browser, similar to the stock_ticks_cow folder created in
the docker demo.
However, if you try to use run_sync_tool.sh to sync the table to Hive, you get
the error: "java.lang.IllegalArgumentException: Partition key parts
[partitionpath] does not match with partition values [americas, brazil,
sao_paulo]. Check partition strategy. "
{quote}{{/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url
jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by
partitionpath --partition-value-extractor
org.apache.hudi.hive.MultiPartKeysValueExtractor --base-path
/user/hive/warehouse/hudi_trips_cow --database default --table
hudi_trips_cow}}
{quote}
This error is thrown in `HoodieHiveClient.getPartitionClause`, which uses
`extractPartitionValuesInPath` to get a list of partitionValues. The problem is
that it compares the number of partitionValues to the number of partition
fields. In this example, there is only 1 partition field, `partitionpath`,
whose value is split into 3 partitionValues (`americas`, `brazil`,
`sao_paulo`). Since 1 does not equal 3, the check fails and throws the
exception.
See
[https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L182]
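The failing comparison can be reproduced outside Hudi with a minimal sketch.
This is not the HoodieHiveClient source: the class `PartitionClauseSketch`, the
simplified `extractPartitionValuesInPath` (a plain split on `/`), and the
three-column names `continent`, `country`, `city` are assumptions made only for
illustration. The exception text is copied from the error quoted above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionClauseSketch {

    // Assumption: stands in for a multi-part extractor by returning one
    // value per "/"-separated segment of the partition path.
    static List<String> extractPartitionValuesInPath(String partitionPath) {
        return Arrays.asList(partitionPath.split("/"));
    }

    // Simplified stand-in for the size check described in this issue.
    static String getPartitionClause(List<String> partitionFields, String partitionPath) {
        List<String> partitionValues = extractPartitionValuesInPath(partitionPath);
        if (partitionFields.size() != partitionValues.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + partitionValues
                + ". Check partition strategy. ");
        }
        // Pair each field with the value at the same position.
        StringBuilder clause = new StringBuilder();
        for (int i = 0; i < partitionFields.size(); i++) {
            if (i > 0) {
                clause.append(",");
            }
            clause.append("`").append(partitionFields.get(i)).append("`='")
                  .append(partitionValues.get(i)).append("'");
        }
        return clause.toString();
    }

    public static void main(String[] args) {
        // Three hypothetical fields, three path segments: the check passes.
        System.out.println(getPartitionClause(
            Arrays.asList("continent", "country", "city"), "americas/brazil/sao_paulo"));

        // One field, three path segments: the check fails as in this report.
        try {
            getPartitionClause(
                Collections.singletonList("partitionpath"), "americas/brazil/sao_paulo");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the check only passes when the number of fields given to
--partitioned-by equals the number of path segments, which is why a single
`partitionpath` field cannot be combined with a three-level path.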