insert into originalTable select uniqueId, collect_set(whatever) from explodedTable group by uniqueId
will probably do the trick.

Phil.

On 23 August 2012 17:45, Mike Fleming <m...@obvious.com> wrote:
> I see that hive has a way to take a table and produce multiple rows.
>
> Is there a built in way to do the reverse?
>
> Say I have a table with a unique key and an array. I do this:
>
>> insert into explodedTable select uniqueId, explode(arrayOfThings) from
>> originalTable
>
> Now I have a table with a row for each (uniqueId, element in arrayOfThings).
>
> Is there any way to take the contents of explodedTable and essentially
> produce the original table, reconstructing the arrayOfThings for each
> uniqueId?
>
> It seems, conceptually, that if I "cluster by uniqueId" then a reducer knows
> that it will get all rows for each uniqueId bundled together, so it ought to
> be fairly feasible to simply emit an unexploded row. However, I can't seem
> to find a built-in way to do this.
>
> Mike
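
For reference, the round trip discussed in this thread could be written out
roughly as below. This is only a sketch under assumptions: the column types,
the element column name "thing", and the separate target table rebuiltTable
are hypothetical and do not come from the thread. The explode step uses
LATERAL VIEW because Hive generally does not accept a table-generating
function such as explode() next to other expressions in the same select list.

```sql
-- Hypothetical schemas, assumed for illustration only:
--   originalTable(uniqueId STRING, arrayOfThings ARRAY<STRING>)
--   explodedTable(uniqueId STRING, thing STRING)
--   rebuiltTable (uniqueId STRING, arrayOfThings ARRAY<STRING>)

-- Explode: one output row per (uniqueId, array element).
INSERT INTO TABLE explodedTable
SELECT o.uniqueId, t.thing
FROM originalTable o
LATERAL VIEW explode(o.arrayOfThings) t AS thing;

-- Un-explode: gather the elements back into one array per key.
-- Written to a separate table so the rebuilt rows are not appended
-- to the rows already sitting in originalTable.
INSERT INTO TABLE rebuiltTable
SELECT uniqueId, collect_set(thing)
FROM explodedTable
GROUP BY uniqueId;
```

Some caveats on how closely this reproduces the original table: collect_set()
drops duplicate elements and does not promise the original element order, so
the rebuilt array matches only if the elements were distinct and order does
not matter. Rows whose array was empty or NULL produce no exploded rows and
so do not reappear. Hive 0.13 and later also provide collect_list(), which
keeps duplicates.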