[ 
https://issues.apache.org/jira/browse/HUDI-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wong updated HUDI-628:
-----------------------------
    Summary: MultiPartKeysValueExtractor does not work with run_sync_tool.sh  
(was: MultiPartKeysValueExtractor does not work with 
HoodieHiveClient.getPartitionClause)

> MultiPartKeysValueExtractor does not work with run_sync_tool.sh
> ---------------------------------------------------------------
>
>                 Key: HUDI-628
>                 URL: https://issues.apache.org/jira/browse/HUDI-628
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>            Reporter: Andrew Wong
>            Priority: Major
>         Attachments: stack_trace.txt
>
>
> The [https://hudi.apache.org/docs/quick-start-guide.html] example data has a 
> column `partitionpath` that holds values like `americas/brazil/sao_paulo`. 
> Using the docker environment, you can change the basePath from the quickstart 
> to save to hdfs://user/hive/warehouse/hudi_trips_cow. The folder then appears 
> in the HDFS browser, similar to the stock_ticks_cow folder created in 
> the docker demo.
> However, if you try to use run_sync_tool.sh to sync the table to Hive, you 
> get the error: "java.lang.IllegalArgumentException: Partition key parts 
> [partitionpath] does not match with partition values [americas, brazil, 
> sao_paulo]. Check partition strategy. "
> {quote}{{/var/hoodie/ws/hudi-hive/run_sync_tool.sh --jdbc-url 
> jdbc:hive2://hiveserver:10000 --user hive --pass hive --partitioned-by 
> partitionpath --partition-value-extractor 
> org.apache.hudi.hive.MultiPartKeysValueExtractor --base-path 
> /user/hive/warehouse/hudi_trips_cow --database default --table 
> hudi_trips_cow}}
> {quote}
> This error is thrown in `HoodieHiveClient.getPartitionClause`, which uses 
> `extractPartitionValuesInPath` to get the list of partition values. The 
> problem is that it compares the number of partition values to the number of 
> partition fields. In this example, there is only 1 partition field, 
> `partitionpath`, and its value is split into 3 partition values. Thus the 
> check fails and the exception is thrown. 
> See 
> [https://github.com/apache/incubator-hudi/blob/master/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java#L182]
>  
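
For illustration, here is a minimal sketch of the size check described above. It is not the Hudi source: the class `PartitionCheckSketch`, its method names, and the three column names `continent`, `country`, `city` are hypothetical, and the extractor is assumed to simply split the path on "/". It only shows why one partition field cannot line up with three extracted values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the check in getPartitionClause; not Hudi code.
public class PartitionCheckSketch {

    // Assumed behavior of a multi-part extractor: one value per path segment.
    static List<String> extractPartitionValues(String partitionPath) {
        return Arrays.asList(partitionPath.split("/"));
    }

    // Builds a partition clause, failing when field and value counts differ.
    static String partitionClause(List<String> partitionFields, String partitionPath) {
        List<String> values = extractPartitionValues(partitionPath);
        if (partitionFields.size() != values.size()) {
            throw new IllegalArgumentException("Partition key parts " + partitionFields
                + " does not match with partition values " + values
                + ". Check partition strategy. ");
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < partitionFields.size(); i++) {
            parts.add("`" + partitionFields.get(i) + "`='" + values.get(i) + "'");
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // Three fields against three path segments: the check passes.
        System.out.println(partitionClause(
            Arrays.asList("continent", "country", "city"), "americas/brazil/sao_paulo"));

        // One field against three path segments: the check throws, as in this report.
        try {
            partitionClause(Arrays.asList("partitionpath"), "americas/brazil/sao_paulo");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the check only passes when the number of names given to --partitioned-by equals the number of path segments. Whether that is the intended contract for run_sync_tool.sh is the question this issue raises.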



--
This message was sent by Atlassian Jira
(v8.3.4#803005)