Geez, too many mistakes for the day :P I meant the following above
*CREATE TABLE* rollup_new *AS* SELECT id, start_time, collect_set(concat_ws(",", object.name <http://objects.name/>, object.value, object.type)) AS product_details FROM bar GROUP BY id, start_time; The change is in the table creation query, since an INSERT OVERWRITE would expect a struct and get a string instead, thus throwing errors right? On Wed, Sep 3, 2014 at 2:06 PM, Nishant Kelkar <nishant....@gmail.com> wrote: > Sorry, I meant the following in my example: > > INSERT OVERWRITE TABLE rollup > SELECT id, start_time, collect_set(concat_ws(",", object.name > <http://objects.name/>, object.value, object.type)) AS product_details > FROM bar > GROUP BY id, start_time; > > "object" instead of "object*s*", as you have in your example :) > > Best Regards, > Nishant Kelkar > > > On Wed, Sep 3, 2014 at 2:03 PM, Nishant Kelkar <nishant....@gmail.com> > wrote: > >> I don't know of anything like what you want atleast until Hive 0.11. >> >> However, you could try something like this: >> >> INSERT OVERWRITE TABLE rollup >> SELECT id, start_time, collect_set(concat_ws(",", objects.name, >> objects.value, objects.type)) AS product_details >> FROM bar >> GROUP BY id, start_time; >> >> >> It's a bit hacky, but it does the trick. It basically concats all the >> objects you care about in a string, and then collects a set of that. Then, >> when you want to extract a field from the "product_details" array, just do >> this (say you want to extract the very first product's name in each array): >> >> SELECT SPLIT(a[0].product_details)[0] AS first_name FROM rollup a; >> >> >> Hope that helps! >> >> Best Regards, >> Nishant Kelkar >> >> >> >> On Wed, Sep 3, 2014 at 1:47 PM, anusha Mangina <anusha.mang...@gmail.com> >> wrote: >> >>> I have a table defined as: >>> >>> CREATE TABLE foo ( >>> id INT, >>> start_time STRING, >>> name STRING, >>> value STRING, >>> type STRING >>> ) >>> >>> The 'name', 'value' and 'type' fields actually describe another object, and >>> I'd like to turn these into STRUCTs, something like: >>> >>> CREATE TABLE bar ( >>> id INT, >>> start_time STRING, >>> object STRUCT<name: STRING, >>> value: STRING, >>> type: STRING> >>> ) >>> >>> However, I'd also like to create a rollup table containing the most recent >>> result for a given 'id' field. There can be multiple different 'object' >>> structs for a given combination of 'id' and 'start_time', so I thought I'd >>> turn this into an array of structs. My ideal rollup table would look like: >>> >>> CREATE TABLE rollup ( >>> id INT, >>> start_time STRING, >>> objects ARRAY<STRUCT<name: STRING, >>> value: STRING, >>> type: STRING>> >>> ) >>> >>> However, I can't do this because the following query fails: >>> >>> INSERT OVERWRITE TABLE rollup >>> SELECT id, start_time, collect_set(object) >>> FROM bar >>> GROUP BY id, start_time >>> >>> Here's the error I get: >>> >>> FAILED: UDFArgumentTypeException Only primitive type arguments are accepted >>> but struct<name:string,value:string,type:string> was passed as parameter 1. >>> >>> Is there any way I can do this? >>> >>> >> >