What are the "AS" columns used for in TRANSFORM USING?  All I can find is
http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform#Schema-less_Map-reduce_Scriptsbut
that only mentions what happens when it isn't there.  It seems like it
doesn't really matter what the column names are.  And, more than once I've
used table column names, but in a different order than the table structure,
but they went into the table in the table-structure order, not the AS column
order.  That confused me a lot.

In this example, table "t" is used just to get some literals selected:

create table test2(a string, b string, c string)

# AS column names in a different order
from t insert overwrite table test2 select transform('foo', 'bar', 'baz')
USING '/bin/cat' AS (b, a, c) limit 1
# I expect 'bar', 'foo', 'baz'
select * from test2
> ['foo', 'bar', 'baz']
# but I get this, and am confused
# (even more so when type conversion happens with non-string columns)

# AS column names don't matter at all actually
from t insert overwrite table test2 select transform('foo', 'bar', 'baz')
USING '/bin/cat' AS (x, y, z) limit 1
select * from test2
> ['foo', 'bar', 'baz']


I'd recommend that Hive either support column reordering with the AS
statement, or make it completely optional (although this may be
backwards-incompatible with the docs at the link above).

-- 
Dave Brondsema
Software Engineer
Geeknet

www.geek.net

Reply via email to