Re: support for arrays, maps, structs while writing output of custom reduce script to table

Zheng Shao Mon, 22 Mar 2010 15:27:06 -0700

Great!

This is a bug. Hive field names should be case-insensitive. Can you
open a JIRA for that?


Zheng
On Mon, Mar 22, 2010 at 2:43 PM, Dilip Joseph
<dilip.antony.jos...@gmail.com> wrote:
> Thanks Zheng, that worked.
>
> It appears that the target table's type information is converted to
> lower case before comparison, while the type declared after "AS" is
> not. The following statements, where "userId" is used as a field
> name, failed.
>
> hive> CREATE TABLE SS (
>    >                     a INT,
>    >                     b INT,
>    >                     vals ARRAY<STRUCT<userId:INT, y:STRING>>
>    >                 );
> OK
> Time taken: 0.309 seconds
> hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>    >     INSERT OVERWRITE TABLE SS
>    >     REDUCE *
>    >         USING 'myreduce.py'
>    >         AS
>    >                     (a INT,
>    >                     b INT,
>    >                     vals ARRAY<STRUCT<userId:INT, y:STRING>>
>    >                     )
>    >         ;
> FAILED: Error in semantic analysis: line 2:27 Cannot insert into
> target table because column number/types are different SS: Cannot
> convert column 2 from array<struct<userId:int,y:string>> to
> array<struct<userid:int,y:string>>.
>
> The same queries worked fine after changing "userId" to "userid".
>
> Dilip
>
> On Mon, Mar 22, 2010 at 2:20 PM, Zheng Shao <zsh...@gmail.com> wrote:
>> Starting with Hive 0.5 (probably), you can add type information to the
>> column names after "AS".
>> Note that in the script's output the first-level separator should be
>> TAB, and the second-level separator should be ^B (and then ^C, etc.).
>>
>>> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>>>    INSERT OVERWRITE TABLE SS
>>>    REDUCE *
>>>        USING 'myreduce.py'
>>>        AS
>>>                (a INT, b INT, vals ARRAY<STRUCT<x:INT, y:STRING>>)
>>>        ;
>>
>>
>> On Mon, Mar 22, 2010 at 1:50 PM, Dilip Joseph
>> <dilip.antony.jos...@gmail.com> wrote:
>>> Hello,
>>>
>>> Does Hive currently support arrays, maps, structs while using custom
>>> reduce/map scripts? 'myreduce.py' in the example below produces an
>>> array of structs delimited by \2s and \3s.
>>>
>>> CREATE TABLE SS (
>>>                    a INT,
>>>                    b INT,
>>>                    vals ARRAY<STRUCT<x:INT, y:STRING>>
>>>                );
>>>
>>> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>>>    INSERT OVERWRITE TABLE SS
>>>    REDUCE *
>>>        USING 'myreduce.py'
>>>        AS
>>>                (a,b, vals)
>>>        ;
>>>
>>> However, the query is failing with the following error message, even
>>> before the script is executed:
>>>
>>> FAILED: Error in semantic analysis: line 2:27 Cannot insert into
>>> target table because column number/types are different SS: Cannot
>>> convert column 2 from string to array<struct<x:int,y:string>>.
>>>
>>> I saw a discussion about this in
>>> http://www.mail-archive.com/hive-user@hadoop.apache.org/msg00160.html,
>>> dated over a year ago.  Just wondering if there have been any updates.
>>>
>>> Thanks,
>>>
>>> Dilip
>>>
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> _________________________________________
> Dilip Antony Joseph
> http://www.marydilip.info
>



-- 
Yours,
Zheng
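
The separator rule quoted above (TAB between columns, ^B between array
elements, ^C between struct fields) can be sketched as a small reduce
script. This is only a minimal sketch and not the 'myreduce.py' from the
thread: the input layout (tab-separated a, b, x, y, already sorted so that
rows with the same a and b are adjacent) and the names format_row and
reduce_stream are assumptions made for illustration.

```python
from itertools import groupby

FIELD_SEP = "\t"     # first-level separator: between output columns
ITEM_SEP = "\x02"    # ^B: between elements of the array
STRUCT_SEP = "\x03"  # ^C: between fields of one struct


def format_row(a, b, vals):
    """Serialize one row; vals is a list of (x, y) pairs for the array column."""
    encoded = ITEM_SEP.join(
        STRUCT_SEP.join(str(field) for field in item) for item in vals
    )
    return FIELD_SEP.join([str(a), str(b), encoded])


def reduce_stream(lines):
    """Group consecutive rows sharing (a, b) and yield one output row per group.

    Assumed (hypothetical) input layout: a TAB b TAB x TAB y per line.
    """
    parsed = (line.rstrip("\n").split(FIELD_SEP) for line in lines)
    for (a, b), group in groupby(parsed, key=lambda cols: (cols[0], cols[1])):
        vals = [(int(cols[2]), cols[3]) for cols in group]
        yield format_row(a, b, vals)
```

To use it as a streaming script, one would feed sys.stdin to
reduce_stream and print each yielded line. The query's "AS" clause would
then declare the third column with lower-case field names, for example
vals ARRAY<STRUCT<x:INT, y:STRING>>, to avoid the case mismatch reported
above.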
