Re: support for arrays, maps, structs while writing output of custom reduce script to table

Dilip Joseph Tue, 23 Mar 2010 09:30:30 -0700

Opened JIRA https://issues.apache.org/jira/browse/HIVE-1271


Dilip

On Mon, Mar 22, 2010 at 3:26 PM, Zheng Shao <[email protected]> wrote:
> Great!
>
> This is a bug. Hive field names should be case-insensitive. Can you
> open a JIRA for that?
>
> Zheng
> On Mon, Mar 22, 2010 at 2:43 PM, Dilip Joseph
> <[email protected]> wrote:
>> Thanks Zheng, that worked.
>>
>> It appears that the type information is converted to lower case before
>> comparison.  The following statements, where "userId" is used as a
>> field name, failed.
>>
>> hive> CREATE TABLE SS (
>>    >                     a INT,
>>    >                     b INT,
>>    >                     vals ARRAY<STRUCT<userId:INT, y:STRING>>
>>    >                 );
>> OK
>> Time taken: 0.309 seconds
>> hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>>    >     INSERT OVERWRITE TABLE SS
>>    >     REDUCE *
>>    >         USING 'myreduce.py'
>>    >         AS
>>    >                     (a INT,
>>    >                     b INT,
>>    >                     vals ARRAY<STRUCT<userId:INT, y:STRING>>
>>    >                     )
>>    >         ;
>> FAILED: Error in semantic analysis: line 2:27 Cannot insert into
>> target table because column number/types are different SS: Cannot
>> convert column 2 from array<struct<userId:int,y:string>> to
>> array<struct<userid:int,y:string>>.
>>
>> The same queries worked fine after changing "userId" to "userid".
>>
>> Dilip
>>
>> On Mon, Mar 22, 2010 at 2:20 PM, Zheng Shao <[email protected]> wrote:
>>> Starting with Hive 0.5 (probably), you can add type information to the
>>> column names after "AS".
>>> Note that the first-level separator should be TAB, and the second-level
>>> separator should be ^B (then ^C, and so on).
>>>
>>>> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>>>>    INSERT OVERWRITE TABLE SS
>>>>    REDUCE *
>>>>        USING 'myreduce.py'
>>>>        AS
>>>>                (a INT, b INT, vals ARRAY<STRUCT<x:INT, y:STRING>>)
>>>>        ;
>>>
>>>
>>> On Mon, Mar 22, 2010 at 1:50 PM, Dilip Joseph
>>> <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> Does Hive currently support arrays, maps, structs while using custom
>>>> reduce/map scripts? 'myreduce.py' in the example below produces an
>>>> array of structs delimited by \2s and \3s.
>>>>
>>>> CREATE TABLE SS (
>>>>                    a INT,
>>>>                    b INT,
>>>>                    vals ARRAY<STRUCT<x:INT, y:STRING>>
>>>>                );
>>>>
>>>> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s
>>>>    INSERT OVERWRITE TABLE SS
>>>>    REDUCE *
>>>>        USING 'myreduce.py'
>>>>        AS
>>>>                (a,b, vals)
>>>>        ;
>>>>
>>>> However, the query is failing with the following error message, even
>>>> before the script is executed:
>>>>
>>>> FAILED: Error in semantic analysis: line 2:27 Cannot insert into
>>>> target table because column number/types are different SS: Cannot
>>>> convert column 2 from string to array<struct<x:int,y:string>>.
>>>>
>>>> I saw a discussion about this in
>>>> http://www.mail-archive.com/[email protected]/msg00160.html,
>>>> dated over a year ago.  Just wondering if there have been any updates.
>>>>
>>>> Thanks,
>>>>
>>>> Dilip
>>>>
>>>
>>>
>>>
>>> --
>>> Yours,
>>> Zheng
>>>
>>
>>
>>
>> --
>> _________________________________________
>> Dilip Antony Joseph
>> http://www.marydilip.info
>>
>
>
>
> --
> Yours,
> Zheng
>



-- 
_________________________________________
Dilip Antony Joseph
http://www.marydilip.info
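
The separator convention described in the quoted reply (TAB between columns, ^B between array elements, ^C between struct fields) could be produced by a reduce script roughly as follows. This is only a minimal sketch: the real 'myreduce.py' is not shown in the thread, and the names format_row, COLUMN_SEP, ELEMENT_SEP and FIELD_SEP are hypothetical, chosen here for illustration. It assumes the target column is ARRAY<STRUCT<x:INT, y:STRING>> as in the example table.

```python
import sys

# Separators as described in the thread (assumed nesting order):
COLUMN_SEP = "\t"     # first level: between output columns
ELEMENT_SEP = "\x02"  # second level (^B): between array elements
FIELD_SEP = "\x03"    # third level (^C): between fields of one struct


def format_row(a, b, vals):
    """Encode one output row for a table (a INT, b INT, vals ARRAY<STRUCT<...>>).

    `vals` is a list of tuples, one tuple per struct, e.g. [(10, "foo")].
    Hypothetical helper, not taken from the original script.
    """
    encoded_vals = ELEMENT_SEP.join(
        FIELD_SEP.join(str(field) for field in struct) for struct in vals
    )
    return COLUMN_SEP.join([str(a), str(b), encoded_vals])


if __name__ == "__main__":
    # Emit a single sample row on stdout, where Hive reads the script output.
    sys.stdout.write(format_row(1, 2, [(10, "foo"), (20, "bar")]) + "\n")
```

With the typed AS clause from the thread, each line written this way should line up with the columns a, b and vals; field values that themselves contain TAB, ^B or ^C are not handled in this sketch.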
