[
https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Benes updated SQOOP-2471:
-------------------------------
Description:
Currently, Sqoop import cannot handle any complex types. Hive, on the other
hand, already supports the following complex types:
- arrays: ARRAY<data_type>
- structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
Since it is probably not possible to obtain all the necessary information about
these types from a generic JDBC database, this feature should rely on
external information supplied through the --map-column-java and
--map-column-hive arguments.
For example, it could look like this:
--map-column-java item='inventory_item(name text, supplier_id integer, price numeric)'
--map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
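To make the proposed syntax concrete, here is a minimal sketch (in Python, for brevity) of how a single --map-column-hive value of the form shown above might be split into a column name and a nested type description. The function names are hypothetical and are not part of Sqoop; the sketch assumes one mapping per argument and ignores the optional COMMENT clause.

```python
def split_top_level(text, sep=","):
    """Split on sep, ignoring separators nested inside <...> or (...)."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "<(":
            depth += 1
        elif ch in ">)":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def parse_hive_type(type_str):
    """Turn a Hive type string into a nested (kind, detail) tuple."""
    type_str = type_str.strip()
    upper = type_str.upper()
    if upper.startswith("ARRAY<") and type_str.endswith(">"):
        # Element type is everything between "ARRAY<" and the final ">".
        return ("array", parse_hive_type(type_str[len("ARRAY<"):-1]))
    if upper.startswith("STRUCT<") and type_str.endswith(">"):
        fields = []
        for field in split_top_level(type_str[len("STRUCT<"):-1]):
            name, _, field_type = field.partition(":")
            fields.append((name.strip(), parse_hive_type(field_type)))
        return ("struct", fields)
    # Anything else is treated as a primitive type (assumption of this sketch).
    return ("primitive", type_str.lower())


def parse_column_mapping(arg):
    """Parse one hypothetical 'column=TYPE' mapping value."""
    column, _, type_str = arg.partition("=")
    return column.strip(), parse_hive_type(type_str)
```

With this sketch, the 'item' mapping from the example would yield a struct description with the three fields name, supplier_id and price, and nested types such as ARRAY<STRUCT<...>> are handled recursively.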
If no additional information is provided, a more general type should be
created where possible.
It should also be possible to serialize complex datatype values into strings
when the Hive target column's type is explicitly set to 'STRING'.
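As an illustration of the STRING fallback, the following sketch (again in Python) serializes an already-decoded array or struct value into a single string. The choice of compact JSON as the string format and the function name are assumptions made for this example only; the request above does not prescribe a format.

```python
import json
from decimal import Decimal


def complex_value_to_string(value):
    """Serialize a nested list/dict value to a compact JSON string.

    JSON is an assumed format for this sketch, not something the
    issue specifies.
    """
    def encode(obj):
        # Decimal is not JSON-serializable by default; keep its exact text.
        if isinstance(obj, Decimal):
            return str(obj)
        raise TypeError("unsupported value type: %s" % type(obj).__name__)

    return json.dumps(value, default=encode, separators=(",", ":"))
```

For instance, a struct value for the 'item' column could be passed in as a dict and an array value as a list; both end up as one string that fits a Hive STRING column.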
was:
Currently sqoop import is not able to handle any complex type. On the other
side the hive already has support for the following complex types:
- arrays: ARRAY<data_type>
- structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
- maps: MAP<primitive_type, data_type>
- union: UNIONTYPE<data_type, data_type, ...>
the most frequent/important is probably the ARRAY type followed by the STRUCT
type.
Since it is probably not possible to obtain all necessary information about
those types from general JDBC database, this feature should somehow use an
external information provided by arguments --map-column-java and
--map-column-hive.
For example it could look like this:
--map-column-java item='inventory_item(name text, supplier_id integer,price numeric)'
--map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
In case no additional information is provided some more general type should be
created if possible.
It should be possible to serialize the complex datatypes values into strings
when the Hive target column's type is explicitly set to 'STRING'.
> Support arrays and structs datatypes with Sqoop Hcatalog integration
> --------------------------------------------------------------------
>
> Key: SQOOP-2471
> URL: https://issues.apache.org/jira/browse/SQOOP-2471
> Project: Sqoop
> Issue Type: New Feature
> Components: hive-integration
> Affects Versions: 1.4.6
> Reporter: Pavel Benes
> Priority: Critical