[ 
https://issues.apache.org/jira/browse/SQOOP-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283145#comment-13283145
 ] 

Cheolsoo Park commented on SQOOP-489:
-------------------------------------

Thanks for filing a jira Kate.

To fix the issue, I suggest that we exclude the column that is specified by 
partition key from the column definitions if any partition key is specified.

i.e.
{code}
CREATE TABLE IF NOT EXISTS `FOO` (`I` STRING, `J` STRING) PARTITIONED BY (I 
STRING)
{code}

=>

{code}
CREATE TABLE IF NOT EXISTS `FOO` (`J` STRING) PARTITIONED BY (I STRING)
{code}

I also thought about adding the --columns option to hive-create-table, but this 
seems less preferable because this will force the user to specify all the 
columns except partition key in the command to import table. 

Please let me know if anyone has a better suggestion.

Another note that I'd like to add is that in my experiment, the current query 
sometimes works in Hive while it sometimes doesn't. So you might find it work 
for you. Nevertheless, I confirmed with a Hive developer that this should be a 
syntax error, so we should not rely on this Hive behavior.
                
> Cannot define partition keys for Hive tables created through Sqoop
> ------------------------------------------------------------------
>
>                 Key: SQOOP-489
>                 URL: https://issues.apache.org/jira/browse/SQOOP-489
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.1-incubating
>            Reporter: Kathleen Ting
>
> By enabling the table option, Sqoop includes every column in the table in the 
> create table query, and by enabling the hive-partition-key option, Sqoop 
> blindly appends the "partitioned by" clause. Now if you specify one of 
> columns in the table in the hive-partition-key, this will cause a syntax 
> error in Hive.
> For example, if we have a table 'FOO' that has columns 'I' and 'J':
> sqoop create-hive-table --table FOO ...
> will generate the following Hive query:
> CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING)
> Now if we add "--hive-partition-key I" to the command, Sqoop generates the 
> following query:
> CREATE TABLE IF NOT EXISTS `FOO` ( `I` STRING, `J` STRING) PARTITIONED BY (I 
> STRING)
> The problem is that since 'I' is defined twice (once in CRATE TABLE and once 
> in PARTITIONED BY), this is a syntax error in Hive.
> This correct query would be something like:
> CREATE TABLE IF NOT EXISTS `FOO` (`J` STRING) PARTITIONED BY (I STRING)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to