Hey John, I haven't hit that one before, but I have a few hypotheses we could test if you're up for trying out some patches I write.
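
In the meantime, here is a quick sanity check on the "magic" string, as a rough sketch only: the fragment looks like base64, so the snippet below tries each of the four possible alignments and keeps whatever decodes to printable ASCII. The names `FRAGMENT` and `decode_fragment` are made up for this illustration, and nothing here touches Crunch or Hadoop.

```python
import base64

# Fragment copied from the "Split class ... not found" message in the trace.
FRAGMENT = "zdHJpbmcifSx7Im5hbWUiOiJ2YWx1ZSIsInR5cGUiOiJzdHJpbmcifV19fSwiZGVmYXVsdCI"


def decode_fragment(fragment):
    """Return (offset, text) pairs for every base64 alignment of `fragment`
    that decodes to printable ASCII.

    The fragment was cut out of the middle of a longer value, so it need not
    start on a 4-character base64 group boundary. We therefore try skipping
    0 to 3 leading characters and drop any trailing partial group.
    """
    results = []
    for offset in range(4):
        chunk = fragment[offset:]
        chunk = chunk[: len(chunk) - len(chunk) % 4]  # keep whole groups only
        try:
            text = base64.b64decode(chunk, validate=True).decode("ascii")
        except ValueError:
            # Covers both binascii.Error and UnicodeDecodeError.
            continue
        if text.isprintable():
            results.append((offset, text))
    return results


if __name__ == "__main__":
    for offset, text in decode_fragment(FRAGMENT):
        print(offset, text)
```

Only one alignment should survive, and what it prints is a piece of JSON with `"name":"value","type":"string"` in it, i.e. part of an Avro schema. That fits your guess that the bogus class name is a slice of the serialized `crunch.inputs.dir` value rather than anything random.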

J

On Wed, May 22, 2013 at 4:01 PM, John Jensen <[email protected]> wrote:

> I have a curious problem when running a crunch job on (avro) files in a
> fairly large set of directories (just slightly less than 100).
> After running some fraction of the mappers they start failing with the
> exception below. Things work fine with a smaller number of directories.
>
> The magic
> 'zdHJpbmcifSx7Im5hbWUiOiJ2YWx1ZSIsInR5cGUiOiJzdHJpbmcifV19fSwiZGVmYXVsdCI'
> string shows up in the 'crunch.inputs.dir' entry in the job config, so I
> assume it has something to do with deserializing that value, but reading
> through the code I don't see any obvious way how.
>
> Furthermore, the crunch.inputs.dir config entry is just under 1.5M, so
> it would not surprise me if I'm running up against a hadoop limit somewhere.
>
> Has anybody else seen similar issues? (this is 0.5.0, btw).
>
> -- John
>
> java.io.IOException: Split class zdHJp
> bmcifSx7Im5hbWUiOiJ2YWx1ZSIsInR5cGUiOiJzdHJpbmcifV19fSwiZGVmYXVsdCI not found
>     at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:342)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:614)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.ClassNotFoundException: Class zdHJp
> bmcifSx7Im5hbWUiOiJ2YWx1ZSIsInR5cGUiOiJzdHJpbmcifV19fSwiZGVmYXVsdCI not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
>     at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:340)
>     ... 7 more

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
