Hi, I have GROUP and FOREACH statements as below:

    grouped = GROUP filterdata BY (page_name,web_session_id);
    x = foreach grouped {
        distinct_web_cookie_id = DISTINCT filterdata.web_cookie_id;
        distinct_encrypted_customer_id = DISTINCT filterdata.encrypted_customer_id;
        distinct_web_session_id = DISTINCT filterdata.web_session_id;
        distinct_event_time = DISTINCT filterdata.event_time;
        distinct_customer_id = DISTINCT filterdata.customer_id;
        generate flatten(group),
            COUNT_STAR(distinct_web_cookie_id) AS distinct_web_cookie_id,
            COUNT_STAR(distinct_encrypted_customer_id) AS distinct_encrypted_customer_id,
            COUNT_STAR(distinct_customer_id) AS distinct_customer_id,
            COUNT_STAR(distinct_web_session_id) AS distinct_web_session_id,
            COUNT_STAR(filterdata) AS cnt_events;
    };

Now I want to group on Session_id in x and get the sum of cnt_events, so I wrote the commands below:

    grouped2 = GROUP x BY page_name;
    d = foreach grouped2 generate group, COUNT_STAR(cnt_events) tot_events;

When I run "grouped2 = GROUP x BY page_name;", I get the error below:

    [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025: <line 31, column 23> Invalid field projection. Projected field [page_name] does not exist in schema: event_time:chararray.

When I run describe x, the output is:

    x: {event_time: chararray}

I'm not sure whether the schema for the foreach statement is working. How do I solve this problem? Thanks
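
For reference, here is a rough sketch of a variant that might be worth trying. It is not a confirmed fix. It rests on a few assumptions: the schema reported for x matches distinct_event_time, the one nested alias that is never used in the generate, so that alias is dropped here as a guess at the cause. The short nested alias names (cookies, enc_customers, customers, sessions) are hypothetical, chosen only so they no longer share names with the output fields. The explicit AS clause on flatten(group) assumes page_name and web_session_id are the desired output names. The second step uses SUM over x.cnt_events, since a sum rather than a count is what is wanted.

```pig
grouped = GROUP filterdata BY (page_name, web_session_id);

x = FOREACH grouped {
    -- hypothetical alias names, kept distinct from the output field names
    cookies       = DISTINCT filterdata.web_cookie_id;
    enc_customers = DISTINCT filterdata.encrypted_customer_id;
    customers     = DISTINCT filterdata.customer_id;
    sessions      = DISTINCT filterdata.web_session_id;
    -- name the flattened group fields explicitly (assumed names)
    GENERATE FLATTEN(group) AS (page_name, web_session_id),
             COUNT_STAR(cookies)       AS distinct_web_cookie_id,
             COUNT_STAR(enc_customers) AS distinct_encrypted_customer_id,
             COUNT_STAR(customers)     AS distinct_customer_id,
             COUNT_STAR(sessions)      AS distinct_web_session_id,
             COUNT_STAR(filterdata)    AS cnt_events;
};

-- check the schema before grouping again
DESCRIBE x;

grouped2 = GROUP x BY page_name;

-- after GROUP, the rows of x sit in a bag named x, so project x.cnt_events
d = FOREACH grouped2 GENERATE group AS page_name, SUM(x.cnt_events) AS tot_events;
```

If DESCRIBE x still shows only event_time after this change, the schema problem probably comes from somewhere other than the unused nested alias, for example an earlier statement in the script that also assigns to x.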