[ https://issues.apache.org/jira/browse/DATAFU-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984938#comment-13984938 ]
Sam Steingold commented on DATAFU-45:
-------------------------------------

I tried that and got an error:
{code}
my_stage1 = foreach my_in {
  keywords = TOKENIZE(many,' ');
  weight = 1.0/(double)SIZE(keywords);
  generate id, keywords.token as keywords, weight as weight;
};
describe my_stage1;
-- my_stage1: {id: chararray,keywords: {(token: chararray)},weight: double}
dump my_stage1;
(1,{(k),(l),(m)},0.3333333333333333)
(3,{(i),(j)},0.5)
(1,{(i),(k)},0.5)
(3,{(l),(i)},0.5)
(1,{(m)},1.0)
(3,{(m),(i),(k)},0.3333333333333333)
(2,{(l),(k),(i)},0.3333333333333333)
(3,{(j),(m)},0.5)
(2,{(k)},1.0)
(3,{(m),(k)},0.5)
(2,{(k),(l)},0.5)
(3,{(l),(m)},0.5)
my_stage2 = foreach my_stage1 {
  keywords = cross keywords, weight;
  generate id, keywords;
};
describe my_stage2;
-- my_stage2: {id: chararray,keywords: {(keywords::token: chararray,null::weight: double)}}
dump my_stage2;
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias my_stage2. Backend error : java.lang.Double cannot be cast to org.apache.pig.data.Tuple
{code}

> RFE: CartesianProduct
> ---------------------
>
>                 Key: DATAFU-45
>                 URL: https://issues.apache.org/jira/browse/DATAFU-45
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Sam Steingold
>
> Given two bags, produce their [Cartesian product|http://en.wikipedia.org/wiki/Cartesian_product]:
> {code}
> B1: bag{T1}
> B2: bag{T2}
> CartesianProduct(B1,B2): bag{(T1,T2)}
> {code}
> Use case:
> {code}
> toks = TOKENIZE((charray)$0,',');
> kwds = CartesianProduct(toks, {1.0/(double)SIZE(toks)});
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
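
Editorial sketch, not part of the original thread: the cast error in the comment above appears to arise because the nested cross is handed weight, a double scalar, where a bag of tuples is expected. One possible way to pair each token with its weight using only built-in operators is FLATTEN, which repeats the remaining fields of the generate for every tuple in the bag. The alias my_pairs is a name introduced here for illustration, and the schema of my_stage1 is assumed to be exactly the one printed by describe in the comment.

```pig
-- Sketch of a workaround using only built-in Pig operators.
-- Assumes my_stage1 has the schema shown in the comment:
--   {id: chararray, keywords: {(token: chararray)}, weight: double}
-- my_pairs is a hypothetical alias chosen for this example.
my_pairs = foreach my_stage1 generate
    id,
    flatten(keywords) as token,  -- one output row per token in the bag
    weight;                      -- the scalar is repeated on every row
describe my_pairs;
dump my_pairs;
```

With this approach the input row (3,{(i),(j)},0.5) would become the two rows (3,i,0.5) and (3,j,0.5). Note that this yields one relation row per (token, weight) pair rather than a single bag of pairs per input record, so it is not a drop-in equivalent of the requested CartesianProduct UDF; regrouping would be needed if a bag is required.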