[
https://issues.apache.org/jira/browse/SQOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970334#comment-15970334
]
Eric Lin commented on SQOOP-3150:
-
Hi Ankit,
I reviewed the issue you raised. The --target-dir option does not control
where the Hive table is created or where the partition data is finally
stored. It controls ONLY where the imported data is written on HDFS before it
is loaded into the Hive table.
For example, you specified --target-dir as
"/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES", so the data is first
written to this directory, and the final Hive query that loads it into Hive
will look something like this:
LOAD DATA INPATH
'hdfs://localhost:9000/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES'
OVERWRITE INTO TABLE `employees_p` PARTITION (date='10-03-2017');
LOAD DATA INPATH moves the files out of that directory and into the table's
own storage location, so you have no control over the final directory that the
partition ends up in within Hive.
Hope that makes sense. So this is not a bug; Sqoop is working as expected.
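If you want to confirm where Hive actually placed the partition, you can ask Hive for the partition's metadata. Below is a minimal shell sketch, assuming the table name and partition values from this issue. The variable names are only for illustration, and the script just builds and prints the statement; passing it to the Hive CLI with hive -e is an assumption about your environment.

```shell
#!/bin/sh
# Sketch only: values are taken from the example in this issue.
TABLE="employees_p"
PART_KEY="date"
PART_VALUE="10-03-2017"

# Build a statement that reports the partition's metadata, including its
# storage location. The partition key is wrapped in backticks in case Hive
# treats the word as a keyword.
QUERY="DESCRIBE FORMATTED ${TABLE} PARTITION (\`${PART_KEY}\`='${PART_VALUE}');"

# Print the statement. To run it, pass it to the Hive CLI, for example:
#   hive -e "$QUERY"
printf '%s\n' "$QUERY"
```

The "Location" row in the output shows the directory Hive uses for that partition, which is independent of the --target-dir staging path.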
> issue with sqoop hive import with partitions
>
>
> Key: SQOOP-3150
> URL: https://issues.apache.org/jira/browse/SQOOP-3150
> Project: Sqoop
> Issue Type: Bug
> Components: hive-integration
>Affects Versions: 1.4.6
> Environment: Cent-Os
>Reporter: Ankit Kumar
>Assignee: Eric Lin
> Labels: features
>
> Sqoop Command:
> sqoop import \
> ...
> --hive-import \
> --hive-overwrite \
> --hive-table employees_p \
> --hive-partition-key date \
> --hive-partition-value 10-03-2017 \
> --target-dir ..\
> -m 1
>
> hive-table script:
> employees_p is a partitioned table on date(string) column
>
> Issue:-
> Case 1: With --target-dir
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES \
> the above sqoop command fails with the error "directory already
> exists".
>
> Case 2: With --target-dir
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/anyname
> the above sqoop command creates a Hive partition (date=10-03-2017) and
> the directory
> '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'
>
> Expected Behaviour:- As the sqoop command includes --hive-partition-key and
> --hive-partition-value, it should automatically create the partitioned
> directory inside EMPLOYEES,
> i.e. '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'