On Wed, Nov 28, 2012 at 1:29 PM, Brad Cavanagh <brad.cavan...@gmail.com>wrote:
> I have a table defined as: > > CREATE TABLE foo ( > id INT, > start_time STRING, > name STRING, > value STRING, > type STRING > ) > > The 'name', 'value' and 'type' fields actually describe another object, > and I'd like to turn these into STRUCTs, something like: > > CREATE TABLE bar ( > id INT, > start_time STRING, > object STRUCT<name: STRING, > value: STRING, > type: STRING> > ) > > However, I'd also like to create a rollup table containing the most recent > result for a given 'id' field. There can be multiple different 'object' > structs for a given combination of 'id' and 'start_time', so I thought I'd > turn this into an array of structs. My ideal rollup table would look like: > > CREATE TABLE rollup ( > id INT, > start_time STRING, > objects ARRAY<STRUCT<name: STRING, > value: STRING, > type: STRING>> > ) > > However, I can't do this because the following query fails: > > INSERT OVERWRITE TABLE rollup > SELECT id, start_time, collect_set(object) > FROM bar > GROUP BY id, start_time > > Here's the error I get: > > FAILED: UDFArgumentTypeException Only primitive type arguments are > accepted but struct<name:string,value:string,type:string> was passed as > parameter 1. > > Is there any way I can do this? > > I'm using Hive 0.9.0 as installed using CDH4.1.0: > > % yum info hive.noarch > Version : 0.9.0+148 > Release : 1.cdh4.1.0.p0.27.el6 > > Cheers, > Brad. > You are going to have to write your own UDF. collect set could be extended to support nested complex types but it does not do that right now.