Hi,

Looks like a bug to me. Can you open a JIRA?

Brock

On Wed, Feb 12, 2014 at 9:25 AM, Remus Rusanu <rem...@microsoft.com> wrote:

> While working on HIVE-5998 I noticed that the ParquetRecordReader returns
> IntWritable for all 'int like' types, in disaccord with the row object
> inspectors. I thought fine, and I worked my way around it. But I see now
> that the issue triggers failures in other places, e.g. in aggregates:
>
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"cint":528534767,"ctinyint":31,"csmallint":4963,"cfloat":31.0,"cdouble":4963.0,"cstring1":"cvLH6Eat2yFsyy7p"}
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>     ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
>     ... 9 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Short
>     at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
>     at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
>     ... 15 more
>
> My test is (I'm writing a test .q from HIVE-5998, but the repro does not
> involve vectorization):
>
> create table if not exists alltypes_parquet (
>   cint int,
>   ctinyint tinyint,
>   csmallint smallint,
>   cfloat float,
>   cdouble double,
>   cstring1 string) stored as parquet;
>
> insert overwrite table alltypes_parquet
>   select cint,
>     ctinyint,
>     csmallint,
>     cfloat,
>     cdouble,
>     cstring1
>   from alltypesorc;
>
> explain select * from alltypes_parquet limit 10;
> select * from alltypes_parquet limit 10;
>
> explain select ctinyint,
>   max(cint),
>   min(csmallint),
>   count(cstring1),
>   avg(cfloat),
>   stddev_pop(cdouble)
>   from alltypes_parquet
>   group by ctinyint;
> select ctinyint,
>   max(cint),
>   min(csmallint),
>   count(cstring1),
>   avg(cfloat),
>   stddev_pop(cdouble)
>   from alltypes_parquet
>   group by ctinyint;
>
> Before opening a Jira, I thought about asking, perhaps this is a known
> issue and some solution is in the books?
>
> Thanks,
> ~Remus

-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
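
The mismatch described in the quoted message can be shown in miniature without any Hive or Hadoop dependency. The sketch below is only an illustration under stated assumptions: `IntBox` is a hypothetical stand-in for `org.apache.hadoop.io.IntWritable`, and `getAsShort` is a hypothetical stand-in for an inspector that, like `JavaShortObjectInspector.get` in the trace, casts its argument to `java.lang.Short`. It is not Hive's code and it does not propose a fix.

```java
public class InspectorMismatchSketch {

    // Hypothetical stand-in for org.apache.hadoop.io.IntWritable:
    // a box around an int, unrelated to java.lang.Short in the type hierarchy.
    static final class IntBox {
        private final int value;

        IntBox(int value) {
            this.value = value;
        }

        int get() {
            return value;
        }
    }

    // Hypothetical stand-in for an inspector that expects java.lang.Short.
    // The unchecked downcast is where a ClassCastException surfaces when the
    // reader hands over a different runtime type than the inspector declares.
    static short getAsShort(Object o) {
        return ((Short) o).shortValue();
    }

    // Returns true when reading the value through the Short-typed
    // stand-in inspector fails with a ClassCastException.
    static boolean strictReadFails(Object o) {
        try {
            getAsShort(o);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // What the reader returns for a smallint column, per the report.
        Object fromReader = new IntBox(4963);
        // What the Short-typed inspector would accept.
        Object expected = Short.valueOf((short) 4963);

        System.out.println("IntBox through Short inspector fails: " + strictReadFails(fromReader));
        System.out.println("Short through Short inspector fails: " + strictReadFails(expected));
    }
}
```

The point of the sketch is only that the declared inspector type and the runtime type of the value must agree; which side should change in Hive is for the JIRA to decide.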