[ https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157146#comment-14157146 ]
Arup Malakar commented on HIVE-7787:
------------------------------------

The exception in the comment above was due to an older version of Parquet on the classpath of the Hadoop cluster I was running. Setting the following got rid of that error:

{{SET mapreduce.job.user.classpath.first=true}}

But I then hit another issue:

{code}
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:300)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:247)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:371)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:652)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:286)
    ... 11 more
Caused by: java.lang.IllegalStateException: Field count must be either 1 or 2: 3
    at org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:38)
    at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
    at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
    at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
    at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
    at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
    at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
    at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:35)
    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:152)
    at parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
    at parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
    at parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:71)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
{code}
It appears that ArrayWritableGroupConverter only accepts a group with one or two fields as the backing structure for an array. Is there a reason for that restriction? Is the following schema not supported?

{code}
enum MyEnumType {
  EnumOne,
  EnumTwo,
  EnumThree
}

struct MyStruct {
  1: optional MyEnumType myEnumType;
  2: optional string field2;
  3: optional string field3;
}

struct outerStruct {
  1: optional list<MyStruct> myStructs
}
{code}
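For context, here is a minimal sketch of what the check at {{ArrayWritableGroupConverter.java:38}} appears to enforce, judging only from the error message above. The class and method names below are illustrative reconstructions, not the actual Hive source: one field seems to be treated as a plain list element and two fields as a map key/value pair, so a three-field struct element like {{MyStruct}} matches neither case.

{code}
// Illustrative sketch only: reconstructs the field-count check implied by
// "Field count must be either 1 or 2: 3" in the trace above. The class and
// method names are assumptions, not the actual Hive code.
public class ArrayConverterFieldCountSketch {

    // One field  -> apparently treated as a plain LIST element (accepted).
    // Two fields -> apparently treated as a MAP key/value pair (accepted).
    // Anything else is rejected, which is what a three-field struct
    // element such as MyStruct triggers.
    static void checkFieldCount(int fieldCount) {
        if (fieldCount < 1 || fieldCount > 2) {
            throw new IllegalStateException(
                "Field count must be either 1 or 2: " + fieldCount);
        }
    }

    public static void main(String[] args) {
        checkFieldCount(1); // e.g. list<string>: single element field, accepted
        checkFieldCount(2); // e.g. map<string,string>: key/value pair, accepted
        try {
            checkFieldCount(3); // list<MyStruct> with three fields: rejected
        } catch (IllegalStateException e) {
            // Prints: Field count must be either 1 or 2: 3
            System.out.println(e.getMessage());
        }
    }
}
{code}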
I can file another JIRA for this issue.


> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -------------------------------------------------------------------------
>
>                 Key: HIVE-7787
>                 URL: https://issues.apache.org/jira/browse/HIVE-7787
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema, Thrift API
>    Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
>         Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>            Reporter: Raymond Lau
>            Priority: Minor
>
> When reading a Parquet file whose original Thrift schema contains a struct with an enum, the following error occurs (full stack trace below):
> {code}
> java.lang.NoSuchFieldError: DECIMAL
> {code}
> Example Thrift Schema:
> {code}
> enum MyEnumType {
>   EnumOne,
>   EnumTwo,
>   EnumThree
> }
>
> struct MyStruct {
>   1: optional MyEnumType myEnumType;
>   2: optional string field2;
>   3: optional string field3;
> }
>
> struct outerStruct {
>   1: optional list<MyStruct> myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array<struct<myenumtype: string, field2: string, field3: string>>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
>   INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
>   OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ;
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>     at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:45)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
>     at org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:32)
>     at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>     at parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>     at parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>     at parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>     at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
>     at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
>     at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
>     ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)