Hmm, I tested it and the problem does exist in Pig 8. I must have been running a fixed version.
I think the other point stands though...we can make it easier to understand
these sorts of problems.

2011/12/1 Daniel Dai <[email protected]>

> Why the problem not exist in Pig 8?
>
> Daniel
>
> On Tue, Nov 29, 2011 at 10:22 PM, Jonathan Coveney <[email protected]> wrote:
>
> > In pig9, if you have a UDF which specifies its outputschema and that
> > output schema is wrong, then you with high probability will get an
> > exception such as:
> >
> > java.lang.ClassCastException: java.lang.Long cannot be cast to
> > java.lang.Integer
> >     at java.lang.Integer.compareTo(Integer.java:37)
> >
> > Errors like this are rare, but didn't seem to come up in Pig8, but do
> > in Pig9 and the opaque error messages can be hard to read.
> >
> > In this case, there was a UDF that said it was outputting a Long, but
> > was in fact outputting an Int. At some point, it tried to cast it over
> > and failed.
> >
> > That said, I wonder if it might be possible to add a runtime check
> > that checks the output of say the first output of your EvalFunc, and
> > if the type does not match up with the declared OutputSchema, it will
> > give you a warning (I don't think it should fail, but it should at
> > least warn you to aid in debugging). I don't think this would be too
> > hard and would add minimal overhead (compared to the run time of a
> > job). We could optionally add a flag or something for a "strict" mode
> > viz. schema.
> >
> > Related to this, when jobs die in opaque ways, I wonder if there might
> > be a way to give a clearer sense of where in the pipeline it dies? You
> > can check pig.alias and try to figure it out by where in the map or
> > reduce it was, but that's tough. I know that pipelining and
> > optimizations could make this tough, but having a clearer sense of
> > what's going on would help debugging along.
> >
> > Thoughts?
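
To make the runtime-check idea from the quoted message a bit more concrete, here is a rough standalone sketch. It deliberately uses no Pig classes: CheckedFunc, its fields, and the strict flag are hypothetical names made up for illustration, and a plain java.lang.Class stands in for the declared output schema. It only shows the shape of the check (inspect the first non-null output, warn on a mismatch, optionally fail in a "strict" mode), not how it would be wired into Pig.

```java
import java.util.function.Function;

/**
 * Standalone sketch, not Pig code: wraps a function and compares the runtime
 * class of its first non-null output with the type it claims to produce.
 * All names here are hypothetical.
 */
final class CheckedFunc<I, O> implements Function<I, O> {
    private final String name;              // label used in the warning text
    private final Class<?> declaredType;    // stand-in for the declared output schema
    private final Function<I, O> delegate;  // the wrapped "UDF"
    private final boolean strict;           // true: throw on mismatch instead of warning
    private boolean checked = false;        // only the first non-null output is inspected
    private String lastWarning = null;      // kept so callers can read the warning back

    CheckedFunc(String name, Class<?> declaredType,
                Function<I, O> delegate, boolean strict) {
        this.name = name;
        this.declaredType = declaredType;
        this.delegate = delegate;
        this.strict = strict;
    }

    @Override
    public O apply(I input) {
        O out = delegate.apply(input);
        if (!checked && out != null) {
            checked = true; // one check per instance keeps the overhead negligible
            if (!declaredType.isInstance(out)) {
                String msg = name + " declares output type " + declaredType.getName()
                        + " but returned " + out.getClass().getName();
                if (strict) {
                    throw new IllegalStateException(msg);
                }
                lastWarning = msg;
                System.err.println("WARN: " + msg);
            }
        }
        return out;
    }

    /** Returns the mismatch warning, or null if none was recorded. */
    String lastWarning() {
        return lastWarning;
    }
}
```

Wrapping a function that claims Long but returns Integer, e.g. new CheckedFunc<String, Object>("MyUdf", Long.class, s -> Integer.valueOf(s.length()), false), would log a single warning naming both types on the first call and still pass the value through. With strict set to true, the same call throws instead.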
