[ https://issues.apache.org/jira/browse/SQOOP-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Benes updated SQOOP-2471:
-------------------------------
Summary: Support arrays and structs datatypes with Sqoop Hcatalog
integration (was: Support complex datatypes with Sqoop Hcatalog integration)
> Support arrays and structs datatypes with Sqoop Hcatalog integration
> --------------------------------------------------------------------
>
> Key: SQOOP-2471
> URL: https://issues.apache.org/jira/browse/SQOOP-2471
> Project: Sqoop
> Issue Type: New Feature
> Components: hive-integration
> Affects Versions: 1.4.6
> Reporter: Pavel Benes
> Priority: Critical
>
> Currently, Sqoop import cannot handle any complex type. Hive, on the other
> hand, already supports the following complex types:
> - arrays: ARRAY<data_type>
> - structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
> - maps: MAP<primitive_type, data_type>
> - union: UNIONTYPE<data_type, data_type, ...>
> The most frequently used and most important is probably ARRAY, followed by
> STRUCT.
> Since it is probably not possible to obtain all the necessary information
> about these types from a generic JDBC database, this feature should rely on
> external information supplied through the --map-column-java and
> --map-column-hive arguments.
> For example, it could look like this:
>   --map-column-java item='inventory_item(name text, supplier_id integer, price numeric)'
>   --map-column-hive item='STRUCT<name : string, supplier_id : int, price : decimal>'
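
To make the proposed --map-column-hive value concrete, here is a minimal sketch of how such a mapping could be split into a column name and a list of struct fields. The helper name parse_struct_mapping is hypothetical and not part of Sqoop. It assumes the shell has already stripped the quotes, and it handles only flat structs of unparameterized primitive types (no nesting, no decimal(10,2)).

```python
import re


def parse_struct_mapping(arg):
    """Split a 'column=STRUCT<...>' mapping into (column, fields).

    Hypothetical helper for illustration only. Flat structs of simple
    primitive types are supported; nested or parameterized types would
    need a real parser.
    """
    column, _, type_spec = arg.partition("=")
    match = re.fullmatch(r"\s*STRUCT\s*<(.*)>\s*", type_spec, flags=re.IGNORECASE)
    if not column.strip() or match is None:
        raise ValueError(f"not a STRUCT mapping: {arg!r}")

    fields = []
    for part in match.group(1).split(","):
        # Each field is written as 'name : type'.
        name, sep, hive_type = part.partition(":")
        if not sep or not name.strip() or not hive_type.strip():
            raise ValueError(f"malformed struct field: {part!r}")
        fields.append((name.strip(), hive_type.strip().lower()))
    return column.strip(), fields
```

Called with the example value above, this would return the column name item and the three (name, type) pairs in declaration order.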
> If no additional information is provided, a more general type should be
> chosen where possible.
> It should also be possible to serialize complex values into strings when the
> Hive target column's type is explicitly set to 'STRING'.
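
The issue does not say which string format the serialized values should use. As one possible reading, the sketch below renders array-like and struct-like values as JSON text. Both the choice of JSON and the helper name to_hive_string are assumptions made for illustration, not anything Sqoop defines.

```python
import json
from decimal import Decimal


def to_hive_string(value):
    """Render an array-like or struct-like value as a JSON string.

    Hypothetical helper; JSON is an assumed format, chosen only because
    it round-trips lists and dicts without extra dependencies.
    """
    def encode(obj):
        # Decimal is not JSON-serializable by default; keep its exact text.
        if isinstance(obj, Decimal):
            return str(obj)
        raise TypeError(f"cannot serialize {type(obj).__name__}")

    # Dict insertion order is preserved, so struct fields keep their order.
    return json.dumps(value, default=encode)
```

A struct such as {"name": "bolt", "supplier_id": 7, "price": Decimal("1.50")} would then land in the STRING column as a single JSON object, with the decimal kept as text to avoid losing precision.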