On Wed, Nov 28, 2012 at 1:29 PM, Brad Cavanagh <brad.cavan...@gmail.com>wrote:

> I have a table defined as:
>
> CREATE TABLE foo (
>   id INT,
>   start_time STRING,
>   name STRING,
>   value STRING,
>   type STRING
> )
>
> The 'name', 'value' and 'type' fields actually describe another object,
> and I'd like to turn these into STRUCTs, something like:
>
> CREATE TABLE bar (
>   id INT,
>   start_time STRING,
>   object STRUCT<name: STRING,
>     value: STRING,
>     type: STRING>
> )
>
> However, I'd also like to create a rollup table containing the most recent
> result for a given 'id' field. There can be multiple different 'object'
> structs for a given combination of 'id' and 'start_time', so I thought I'd
> turn this into an array of structs. My ideal rollup table would look like:
>
> CREATE TABLE rollup (
>   id INT,
>   start_time STRING,
>   objects ARRAY<STRUCT<name: STRING,
>     value: STRING,
>     type: STRING>>
> )
>
> However, I can't do this because the following query fails:
>
> INSERT OVERWRITE TABLE rollup
> SELECT id, start_time, collect_set(object)
> FROM bar
> GROUP BY id, start_time
>
> Here's the error I get:
>
> FAILED: UDFArgumentTypeException Only primitive type arguments are
> accepted but struct<name:string,value:string,type:string> was passed as
> parameter 1.
>
> Is there any way I can do this?
>
> I'm using Hive 0.9.0 as installed using CDH4.1.0:
>
>  % yum info hive.noarch
> Version     : 0.9.0+148
> Release     : 1.cdh4.1.0.p0.27.el6
>
> Cheers,
>  Brad.
>

You are going to have to write your own UDF. collect set could be extended
to support nested complex types but it does not do that right now.

Reply via email to