[ https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Benes updated SQOOP-2471:
-------------------------------
    Description: 
Currently, Sqoop import cannot handle any complex type. Hive, on the other 
hand, already supports the following complex types:

 - arrays: ARRAY<data_type>
 - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>

Since it is probably not possible to obtain all the necessary information 
about those types from a generic JDBC database, this feature should use 
external information supplied through the --map-column-java and 
--map-column-hive arguments.

For example, it could look like this:
 --map-column-java item='inventory_item(name text, supplier_id integer, price numeric)'
 --map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
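
To illustrate what consuming such a --map-column-hive value might involve, 
here is a minimal Python sketch that splits a STRUCT type string into its 
fields. The function names split_top_level and parse_hive_struct are 
hypothetical and are not part of Sqoop. The sketch assumes the field syntax 
shown above and does not handle the optional COMMENT clause.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside <...>."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_hive_struct(type_str):
    """Return [(field_name, field_type), ...] for a STRUCT<...> string.

    Hypothetical helper: assumes 'name : type' fields without COMMENT.
    """
    s = type_str.strip()
    if not (s.upper().startswith("STRUCT<") and s.endswith(">")):
        raise ValueError("not a STRUCT type: %r" % type_str)
    inner = s[len("STRUCT<"):-1]
    fields = []
    for part in split_top_level(inner):
        # Split on the first colon only, so nested STRUCT types stay intact.
        name, _, ftype = part.partition(":")
        fields.append((name.strip(), ftype.strip()))
    return fields


fields = parse_hive_struct(
    "STRUCT<name : string, supplier_id : int, price : decimal>")
print(fields)
```

Tracking the angle-bracket depth keeps nested types such as 
ARRAY<data_type> inside a struct field from being split at their inner 
commas.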

If no additional information is provided, a more general type should be 
created where possible.

It should be possible to serialize complex datatype values into strings 
when the Hive target column's type is explicitly set to 'STRING'. 
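
One possible reading of the string fallback is sketched below in Python. 
The issue does not specify a string format, so the choice of JSON here is 
an assumption, and serialize_complex is a hypothetical name rather than a 
Sqoop function.

```python
import json
from decimal import Decimal


def serialize_complex(value):
    """Render an array-like or struct-like value as a JSON string.

    Assumption: JSON is used as the string form. Values that JSON
    cannot encode directly (e.g. Decimal) fall back to str().
    """
    return json.dumps(value, default=str, sort_keys=True)


# An array value and a struct value modeled as a list and a dict.
print(serialize_complex(["a", "b"]))
print(serialize_complex(
    {"name": "widget", "supplier_id": 42, "price": Decimal("1.99")}))
```

Sorting the keys gives a stable string for the same struct value, which 
may help when comparing imported rows.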

  was:
Currently sqoop import is not able to handle any complex type. On the other 
side the hive already has support for the following complex types:

 - arrays: ARRAY<data_type>
 - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
 - maps: MAP<primitive_type, data_type>
 - union: UNIONTYPE<data_type, data_type, ...> 

the most frequent/important is probably the ARRAY type followed by the STRUCT 
type. 

Since it is probably not possible to obtain all necessary information about 
those types from general JDBC database, this feature should somehow use an 
external information provided by arguments --map-column-java and 
--map-column-hive. 

For example it could look like this:
 --map-column-java item='inventory_item(name text, supplier_id integer,price 
numeric)'
 --map-column-hive item='STRUCT<name : string, supplier_id : int, price : 
decimal>'

In case no additional information is provided some more general type should be 
created if possible.

It should be possible to serialize the complex datatypes values into strings 
when the Hive target column's type is explicitly set to 'STRING'. 


> Support arrays and structs datatypes with Sqoop Hcatalog integration
> --------------------------------------------------------------------
>
>                 Key: SQOOP-2471
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2471
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: hive-integration
>    Affects Versions: 1.4.6
>            Reporter: Pavel Benes
>            Priority: Critical
>
> Currently, Sqoop import cannot handle any complex type. Hive, on the other 
> hand, already supports the following complex types:
>  - arrays: ARRAY<data_type>
>  - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
> Since it is probably not possible to obtain all the necessary information 
> about those types from a generic JDBC database, this feature should use 
> external information supplied through the --map-column-java and 
> --map-column-hive arguments.
> For example, it could look like this:
>  --map-column-java item='inventory_item(name text, supplier_id integer, price numeric)'
>  --map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
> If no additional information is provided, a more general type should be 
> created where possible.
> It should be possible to serialize complex datatype values into strings 
> when the Hive target column's type is explicitly set to 'STRING'. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
