[jira] [Commented] (SQOOP-3150) issue with sqoop hive import with partitions

2017-04-16 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970334#comment-15970334
 ] 

Eric Lin commented on SQOOP-3150:
-

Hi Ankit,

I reviewed the issue you raised and noticed that --target-dir does not 
control where the Hive table is created or where the data for the target 
partition is finally stored. Rather, --target-dir controls ONLY where the 
data is generated before it is loaded into the Hive table.

For example, you specified --target-dir as 
"/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES", so the data will be 
written to this directory, and the final Hive query that loads the data into 
Hive will be something like the following:

LOAD DATA INPATH 
'hdfs://localhost:9000/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES' 
OVERWRITE INTO TABLE `employees_p` PARTITION (date='10-03-2017');

You have no control over the final directory that the partition goes into 
in Hive.
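
To make the relationship between the options and that statement easier to 
see, here is a minimal shell sketch that assembles the same LOAD DATA 
statement from the values in your command. The function name 
build_load_stmt is hypothetical and this is not Sqoop's actual code; it 
only illustrates that --target-dir ends up as the INPATH (the source of 
the load), while the partition's final location is left to Hive.

```shell
#!/bin/sh
# Illustrative only: build_load_stmt is a hypothetical helper, not part of Sqoop.
# $1 = --target-dir (staging path the import writes to)
# $2 = --hive-table
# $3 = --hive-partition-key
# $4 = --hive-partition-value
build_load_stmt() {
  printf "LOAD DATA INPATH '%s' OVERWRITE INTO TABLE \`%s\` PARTITION (%s='%s');\n" \
    "$1" "$2" "$3" "$4"
}

# Values taken from the example in this thread.
build_load_stmt \
  'hdfs://localhost:9000/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES' \
  employees_p date 10-03-2017
```

Note that the target directory appears only as the input path of the 
statement; nothing in it names a destination directory for the partition.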

Hope that makes sense to you. So this is not a bug; it works as expected.

> issue with sqoop hive import with partitions
> 
>
> Key: SQOOP-3150
> URL: https://issues.apache.org/jira/browse/SQOOP-3150
> Project: Sqoop
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 1.4.6
> Environment: Cent-Os
>Reporter: Ankit Kumar
>Assignee: Eric Lin
>  Labels: features
>
> Sqoop Command:
>   sqoop import \
>   ...
>   --hive-import  \
>   --hive-overwrite  \
>   --hive-table employees_p  \
>   --hive-partition-key date  \
>   --hive-partition-value 10-03-2017  \
>   --target-dir ..\
>   -m 1  
>   
>   hive-table script:
>   employees_p is a partitioned table on date(string) column
>   
>   Issue:- 
>   Case1: When  --target-dir 
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES \
>   while running the above sqoop command, it fails with the error "directory 
> already exists".
>   
>   When : --target-dir 
> /user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/anyname 
>   2. Above sqoop command creates a hive partition (date=10-03-2017) and 
> directory as
>   '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'
>   
> Expected Behaviour:- As --hive-partition-key and --hive-partition-value 
> are present in the sqoop command, it should auto-create the partitioned 
> directory inside EMPLOYEES.
> ie. '/user/hdfs/landing/staging/Hive/partitioned/EMPLOYEES/date=10-03-2017'



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SQOOP-3150) issue with sqoop hive import with partitions

2017-04-15 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969934#comment-15969934
 ] 

Eric Lin commented on SQOOP-3150:
-

Will try to reproduce the issue first using the latest Sqoop code.



