[ https://issues.apache.org/jira/browse/PIG-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich reassigned PIG-514:
----------------------------------

    Assignee: Pradeep Kamath

> COUNT returns no results as a result of two filter statements in FOREACH
> ------------------------------------------------------------------------
>
>                 Key: PIG-514
>                 URL: https://issues.apache.org/jira/browse/PIG-514
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>         Attachments: mystudentfile.txt
>
>
> The following sample code, which counts filtered student records inside a
> FOREACH (one filter on record_type == 1 and scores, the other on
> record_type == 0), does not seem to return any results.
> {code}
> mydata = LOAD 'mystudentfile.txt' AS (record_type,name,age,scores,gpa);
> --keep only what we need
> mydata_filtered = FOREACH mydata GENERATE record_type, name, age, scores;
> --group
> mydata_grouped = GROUP mydata_filtered BY (record_type,age);
> myfinaldata = FOREACH mydata_grouped {
>     myfilter1 = FILTER mydata_filtered BY record_type == 1 AND age == scores;
>     myfilter2 = FILTER mydata_filtered BY record_type == 0;
>     GENERATE FLATTEN(group),
>     -- Only this count causes the problem ??
>     COUNT(myfilter1) as col2,
>     SUM(myfilter2.scores) as col3,
>     COUNT(myfilter2) as col4;
> };
> --this set of statements confirms that the count on the filter returns 1
> --mycountdata = FOREACH mydata_grouped
> --{
> --    myfilter1 = FILTER mydata_filtered BY record_type == 1 AND age == scores;
> --    GENERATE
> --    COUNT(myfilter1) as colcount;
> --};
> --dump mycountdata;
> dump myfinaldata;
> {code}
> But if you comment out the {code} COUNT(myfilter1) as col2, {code} line, it
> seems to work, with the following results:
> (0,22,45.0,2L)
> (0,24,133.0,6L)
> (0,25,22.0,1L)
> I have also tried to verify whether the issue is {code} COUNT(myfilter1) as
> col2, {code} returning zero. That does not seem to be the case.
> If the mycountdata block and {code} dump mycountdata; {code} are uncommented,
> it returns:
> (1L)
> (1L)
> I am attaching the tab-separated 'mystudentfile.txt' file used in this Pig
> script. Is this an issue with two filters in the FOREACH followed by a COUNT
> on these filters?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
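
For reference, here is a minimal Python sketch of what the myfinaldata FOREACH in the script above is presumably meant to compute. It is a model of the intended semantics, not Pig code and not a fix. The function name expected_rows and the sample rows are hypothetical (they are not the contents of mystudentfile.txt), and treating SUM over an empty bag as null (None here) is an assumption about Pig's behavior. Because the grouping key includes record_type, one of the two filters is necessarily empty in every group, which may be relevant to the missing output.

```python
from collections import defaultdict


def expected_rows(records):
    """Model the grouped FOREACH: records are (record_type, name, age, scores)."""
    # GROUP mydata_filtered BY (record_type, age)
    groups = defaultdict(list)
    for rec in records:
        record_type, _name, age, _scores = rec
        groups[(record_type, age)].append(rec)

    rows = []
    for (record_type, age), bag in sorted(groups.items()):
        # myfilter1: record_type == 1 AND age == scores
        filter1 = [r for r in bag if r[0] == 1 and r[2] == r[3]]
        # myfilter2: record_type == 0
        filter2 = [r for r in bag if r[0] == 0]
        col2 = len(filter1)
        # Assumption: SUM over an empty bag yields null, modeled as None.
        col3 = sum(r[3] for r in filter2) if filter2 else None
        col4 = len(filter2)
        rows.append((record_type, age, col2, col3, col4))
    return rows


# Hypothetical sample rows, NOT taken from mystudentfile.txt.
sample = [
    (0, "alice", 22, 20.0),
    (0, "bob", 22, 25.0),
    (1, "carol", 30, 30.0),
    (1, "dave", 30, 12.0),
]

for row in expected_rows(sample):
    print(row)
```

Under this model every group yields exactly one output row, with a zero count for whichever filter is empty, so a run that returns no rows at all would not match the expected semantics.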