Which version are you using? I am wondering whether PIG-3466 fixes your error: https://issues.apache.org/jira/browse/PIG-3466
You can reproduce the error only when loading more data, and you also see a random type cast error. My guess is that you ran into the race condition that PIG-3466 fixed, and that your bag is corrupted, which results in the type cast error.

On Mon, Nov 18, 2013 at 6:30 AM, Noam Lavie <[email protected]> wrote:

> Hi,
> I'm trying to run the following Pig script (its main purpose is to read
> inputs that contain info about phone calls; the script is supposed to count
> the different types of calls and the different subscribers that made them):
>
> SET default_parallel 40;
> allFiles = LOAD 'maprfs:///analytics/data/consumers/mapred/facts/done/FACT_VOICE_GE_Analytics9_1/20131114/'
>     USING PigStorage(',');
> allFilesFiltered = FILTER allFiles BY $11 MATCHES '.*On.*' AND $4 > 0;
> datesList = FOREACH allFilesFiltered GENERATE SUBSTRING($0, 0, 10) AS day,
>     $11 AS callType, $4 AS amount, $1 AS subscriberKey;
> datesGroups = GROUP datesList BY (day, callType);
> datesGroupsAmount = foreach datesGroups {
>     unique_seubscriber = DISTINCT datesList.subscriberKey;
>     GENERATE group.day, group.callType, COUNT(datesList),
>         SUM(datesList.amount), COUNT(unique_seubscriber);
> };
> dump datesGroupsAmount;
>
> The problem is with unique_seubscriber: the count and distinct don't work.
> The strange thing is that if I run the script separately on each sub
> folder's input, the run succeeds for each part, but if I give it the whole
> input folder at once, it fails and I get the following error:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open
> iterator for alias datesGroupsAmount
>
> Another error that I get from time to time (if I make small changes in
> the script) is:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open
> iterator for alias datesGroupsAmount. Backend error : java.lang.Boolean
> cannot be cast to org.apache.pig.data.Tuple (maybe there is a connection
> between the two errors?)
>
> Here is the log file:
>
> Pig Stack Trace
> ---------------
> ERROR 1066: Unable to open iterator for alias datesGroupsAmount
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias datesGroupsAmount
>         at org.apache.pig.PigServer.openIterator(PigServer.java:836)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>         at org.apache.pig.Main.run(Main.java:604)
>         at org.apache.pig.Main.main(Main.java:157)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:828)
>         ... 12 more
>
> Any help will be appreciated.
> Thanks,
> Noam
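
For cross-checking the job's output on a small extract, here is a minimal Python sketch of what the quoted script appears to compute per (day, callType): the call count, the summed amount, and the number of distinct subscribers. The column positions ($0, $1, $4, $11) come from the script; the helper make_row and the sample values are hypothetical, and Pig's null handling is ignored, so treat this as a rough reference rather than an exact equivalent.

```python
import re
from collections import defaultdict


def make_row(timestamp, subscriber_key, amount, call_type):
    """Build a 12-column record (hypothetical helper for the sample data)."""
    row = [""] * 12
    row[0], row[1], row[4], row[11] = timestamp, subscriber_key, amount, call_type
    return row


def summarize(rows):
    """Return {(day, call_type): (call_count, total_amount, distinct_subscribers)}."""
    calls = defaultdict(int)
    totals = defaultdict(float)
    subscribers = defaultdict(set)
    for row in rows:
        amount = float(row[4])
        # Same filter as the script: $11 MATCHES '.*On.*' AND $4 > 0
        if re.fullmatch(r".*On.*", row[11]) is None or amount <= 0:
            continue
        key = (row[0][:10], row[11])  # SUBSTRING($0, 0, 10) AS day, $11 AS callType
        calls[key] += 1               # COUNT(datesList)
        totals[key] += amount         # SUM(datesList.amount)
        subscribers[key].add(row[1])  # DISTINCT datesList.subscriberKey
    return {k: (calls[k], totals[k], len(subscribers[k])) for k in calls}


# Hypothetical sample records, not taken from the real input files.
sample = [
    make_row("2013-11-14 10:00:00", "s1", "5", "OnNet"),
    make_row("2013-11-14 11:00:00", "s1", "3", "OnNet"),
    make_row("2013-11-14 12:00:00", "s2", "2", "OnNet"),
    make_row("2013-11-14 13:00:00", "s3", "0", "OnNet"),   # dropped: amount is not > 0
    make_row("2013-11-14 14:00:00", "s3", "4", "OffNet"),  # dropped: no "On" in call type
]
result = summarize(sample)
print(result)
```

If the per-folder runs succeed, comparing their output against a check like this on the same extract may help separate a logic problem in the script from a backend failure such as the one described in PIG-3466.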
