I ran into this same problem on the IBM JVM... I didn't spend a lot of time trying to fix it because we got new hardware where I could run the Sun JVM. Sorry.
Aaron

On Fri, Jul 16, 2010 at 12:17 PM, Stephen Watt <[email protected]> wrote:
> Hi Folks
>
> This issue occurs on Hive 0.4 and 0.5. I wanted to wait on opening a JIRA
> ticket until I ran it by the community first.
>
> I'm testing Hive 0.5 running on Apache Hadoop 0.20.2, which is using IBM
> Java 6 (32-bit x86 Java SR8, available here:
> https://www.ibm.com/developerworks/java/jdk/linux/download.html).
>
> To recreate this, I'm using the pokes table loaded with data from the
> examples directory, per the tutorial, and I run the following in the Hive
> CLI (bin/hive): select count(1) from pokes;
>
> This works just fine on Sun/Oracle Java 6, but when I change hadoop-env to
> point to IBM Java 6, it fails in the map with the following exception:
>
> Caused by: java.lang.ClassCastException:
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector
> incompatible with
> org.apache.hadoop.hive.serde2.objectinspector.primitive.LongObjectInspector
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCount$GenericUDAFCountEvaluator.merge(GenericUDAFCount.java:104)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:113)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:451)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:591)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:500)
>     ... 14 more
>
> Note: the line number in GenericUDAFCount here is off by 4 because of a
> couple of LOG.info calls I added for debugging purposes. The net of it is
> that it fails when it attempts the following cast in the merge method:
> (LongObjectInspector) inputOI
>
> This is where it gets weird. In Sun Java, this method gets called in the
> reducer. In IBM Java, it gets called in the mapper. If I use EXPLAIN in
> the Hive CLI, the execution plans are identical regardless of which JRE
> Hadoop is using. In Sun Java, the type of inputOI is a bigint derived from
> a single-column schema called _col0 in the reducer (likely the output
> tuple of the count result), and it casts to a Long with no problem. In IBM
> Java, this call happens in the map, and inputOI is derived from what
> appears to be the first column of the pokes table schema, which is an int
> and therefore fails when cast to a Long. The cast appears to be merely
> symptomatic of a difference in the execution plans.
>
> Debugging from this point really requires someone who understands Hive
> execution plans better than I do. Is there anyone who can help with this
> issue? It is really easy to replicate: download the IBM JDK, modify your
> hadoop-env to point to the extracted dir of the IBM JDK, and do a select
> count from any table.
>
> Regards
> Steve Watt
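
For anyone digging into this, here is a minimal standalone sketch of the
failing cast. It is not Hive's actual merge() code, just a demonstration of
the ObjectInspector incompatibility in the trace above; the class name
CountMergeCastDemo is made up for illustration, while the imports and
factory fields come from hive-serde, which must be on the classpath:

    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.LongObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

    public class CountMergeCastDemo {
        public static void main(String[] args) {
            // What the reduce side normally sees: the partial count (_col0),
            // a bigint whose inspector really is a LongObjectInspector.
            ObjectInspector reduceSideOI =
                PrimitiveObjectInspectorFactory.writableLongObjectInspector;
            LongObjectInspector fine = (LongObjectInspector) reduceSideOI; // ok

            // What the map side on IBM Java apparently hands merge(): the
            // inspector of the table's first column, an int.
            ObjectInspector mapSideOI =
                PrimitiveObjectInspectorFactory.writableIntObjectInspector;
            LongObjectInspector boom = (LongObjectInspector) mapSideOI;
            // throws java.lang.ClassCastException:
            // WritableIntObjectInspector incompatible with LongObjectInspector
        }
    }

Running this throws the same ClassCastException the mapper reports, which is
consistent with Steve's read that merge() is being handed the table column's
inspector rather than the partial count's.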
