1. In the current implementation, explode does have that contract.

2. WHERE is allowed, but there was a bug with predicate push down that caused a 
NPE to be thrown. There's a patch for this at 
https://issues.apache.org/jira/browse/HIVE-1056 or you can add 'set 
hive.optimize.ppd=false;' before you run your queries.

3. Yes, you would need to override this method.

4. If you want to output two columns, check out GenericUDTFExplode2 in contrib 
for an example on how this can be done.


From: Viraj Bhat [mailto:[email protected]]
Sent: Tuesday, June 29, 2010 5:06 PM
To: [email protected]
Subject: RE: Best way to create a view with index from explode ?

Hi Paul,
 Thanks for your inputs. I was looking at the Generic explode UDTF in your 
piggybank in the contrib. directory.
Here are questions:
1)     Does the explode have a contract that if we take in an array of maps we 
return only the maps, not maps and another column (index) with it.
2)     Second do UDTF's have some restriction that I cannot have a WHERE clause 
or select other columns. I want to write something like this:
select  bcookie, Myexplode(info) as (newcol, index) from myoldtable where 
bcookie is not null;
3)     For adding the additional index column, do I need to override the 
following method  [[public void process(Object[] o) throws HiveException ]]
4)     Do you have other examples of writing UDTF's
Any insights would be appreciated.
Viraj

________________________________
From: Paul Yang [mailto:[email protected]]
Sent: Tuesday, June 29, 2010 3:01 PM
To: [email protected]
Subject: RE: Best way to create a view with index from explode ?

Yes, the UDTF explode() is probably the easiest way to get multiple rows from 
an array. Another way would be to write your own custom script and use the 
transform clause. To generate the index of the info, you'll need the field() 
UDF, but there is currently an issue with it:

https://issues.apache.org/jira/browse/HIVE-1385


From: Viraj Bhat [mailto:[email protected]]
Sent: Tuesday, June 29, 2010 2:06 PM
To: [email protected]
Subject: Best way to create a view with index from explode ?

Hi all,
I have an input table which has 3 columns one of which is an array list,

bcookie, id, info [Schema: string, string, arraylist<map<string,string>>]

Here is a sample row

1245, 1, {[myid#id2, myage#100], [myid#id3, myage#101], [myid#id3, myage#102]}

I wanted to explode the arraylist column and create an additional column which 
represents the index of the info

bcookie, id, info, index
1245, 1, [myid#id2, myage#100], 0
1245, 1, [myid#id3, myage#101], 1
1245, 1, [myid#id3, myage#102] , 2

I was thinking of the lines of:
SELECT * FROM srctable LATERAL VIEW explode(info)... ;

Is UDTF the only way to achieve this?
Thanks Viraj

Reply via email to