Retried with hive.optimize.sort.dynamic.partition=false. Still seeing the same issue.
Thanks
Suma

On Wed, Jul 30, 2014 at 6:55 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:

> what's the value of the variable hive.optimize.sort.dynamic.partition
>
> can you try disabling it if it is on?
>
>
> On Wed, Jul 30, 2014 at 6:43 PM, Suma Shivaprasad <
> sumasai.shivapra...@gmail.com> wrote:
>
>> Am using Hive 0.13.0 with a Parquet table having 34 columns, created with
>> the following props:
>>
>> CLUSTERED BY (udid) SORTED BY (udid ASC) INTO 256 BUCKETS
>> STORED AS PARQUET
>> TBLPROPERTIES ("parquet.compression"="SNAPPY");
>>
>> The query I am running is:
>>
>> set hive.optimize.bucketmapjoin = true;
>> set hive.optimize.bucketmapjoin.sortedmerge = true;
>> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
>> set hive.mapjoin.smalltable.filesize=200000000;
>> set hive.vectorized.execution.enabled = true;
>>
>> set hive.stats.fetch.column.stats=true;
>> set hive.stats.collect.tablekeys=true;
>> set hive.stats.reliable=true;
>>
>> select sum(rev), sum(adimp)
>> from user_rr_parq rr join user_domain_parq dm on rr.udid = dm.id
>> where dt = '..' and hour = '..'
>> and dm.age.source = '..'
>> and dm.age.id IN ('..')
>> group by rr.udid;
>>
>> Both user_rr_parq and user_domain_parq are clustered and sorted by the
>> same join key.
>>
>> Exception in Mapper logs:
>>
>> 2014-07-30 12:44:08,577 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2014-07-30 12:44:08,579 WARN org.apache.hadoop.mapred.Child: Error running child
>> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
>> Hive Runtime Error while processing row
>> {"udid":"+HkGEOKZopHELKtUDdJzOUPr5yuSHxTHN5iknyzNSjE=","optout":null,"uage":null,"ugender":null,"siteid":null,"handsetid":null,"intversion":null,"intmethod":null,"intfamily":null,"intdirect":null,"intorigin":null,"advid":null,"campgnid":null,"adgrpidbig":null,"ccid":null,"locsrc":null,"adid":null,"adidbig":null,"market":null,"nfr":null,"uidparams":null,"time":null,"disc_uidparams":null,"vldclk":null,"fraudclk":null,"totalburn":null,"pubcpc":null,"cpc":null,"rev":0.0,"adimp":0,"pgimp":null,"mkvalidadreq":null,"mkvalidpgreq":null,"map_uid":null,"dt":"2014-06-01","hour":"00"}
>> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>> at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
>> Error while processing row
>> {"udid":"+HkGEOKZopHELKtUDdJzOUPr5yuSHxTHN5iknyzNSjE=","optout":null,"uage":null,"ugender":null,"siteid":null,"handsetid":null,"intversion":null,"intmethod":null,"intfamily":null,"intdirect":null,"intorigin":null,"advid":null,"campgnid":null,"adgrpidbig":null,"ccid":null,"locsrc":null,"adid":null,"adidbig":null,"market":null,"nfr":null,"uidparams":null,"time":null,"disc_uidparams":null,"vldclk":null,"fraudclk":null,"totalburn":null,"pubcpc":null,"agencycpc":null,"rev":0.0,"adimp":0,"pgimp":null,"mkvalidadreq":null,"mkvalidpgreq":null,"map_uid":null,"dt":"2014-06-01","hour":"00"}
>> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>> ... 8 more
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 29, Size: 5
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.nextHive(SMBMapJoinOperator.java:773)
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.setupContext(SMBMapJoinOperator.java:710)
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.setUpFetchContexts(SMBMapJoinOperator.java:538)
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:248)
>> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>> at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>> ... 9 more
>> Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 29, Size: 5
>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:636)
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.next(SMBMapJoinOperator.java:794)
>> at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.nextHive(SMBMapJoinOperator.java:771)
>> ... 16 more
>> Caused by: java.lang.IndexOutOfBoundsException: Index: 29, Size: 5
>> at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>> at java.util.ArrayList.get(ArrayList.java:322)
>> at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:96)
>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:79)
>> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
>> at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:471)
>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:561)
>> ... 18 more
>>
>> It looks like it is trying to access the column at index 29, whereas only
>> 5 non-null columns are present in the row - which matches the ArrayList
>> size.
>>
>> What could be going wrong here?
>>
>> Thanks
>> Suma
>>
>
>
> --
> Nitin Pawar
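For later readers of this thread: the innermost "Caused by" is a plain ArrayList lookup going out of range inside DataWritableReadSupport.init. The snippet below is a minimal, hypothetical Java illustration of the kind of mismatch the trace and the observation above suggest; the class, variable, and column names are made up, and this is not the actual Hive/Parquet reader code. It only shows how applying an index taken from the full 34-column table schema to a list that holds just the few columns actually materialized produces exactly this exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration (NOT DataWritableReadSupport itself) of an index
// from the full table schema being used against a much shorter column list.
public class ColumnIndexMismatch {
    public static void main(String[] args) {
        // Hypothetical column names: only 5 entries back the list being indexed,
        // matching the "Size: 5" in the reported exception.
        List<String> materializedColumns = new ArrayList<String>(
                Arrays.asList("udid", "rev", "adimp", "dt", "hour"));

        // Hypothetical: 29 is the column's position in the full 34-column
        // table schema, not a position within the shorter list above.
        int schemaPosition = 29;

        // Fails with java.lang.IndexOutOfBoundsException: Index: 29, Size: 5,
        // the same shape as the exception in the mapper logs.
        System.out.println(materializedColumns.get(schemaPosition));
    }
}
```

If that is what is happening here, the column projection handed to the Parquet record reader and the table schema would be disagreeing about column positions, but that is only a guess based on the trace, not a confirmed diagnosis.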