With "HYBRID", can you try setting "hive.orc.cache.use.soft.references=true"? That should help prevent the OOM with the HYBRID strategy.
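As a rough sketch, the suggestion could be tried from a Hive session like this. The property names and the query are taken from this thread. Setting the flag per session is an assumption: if the session-level value is not picked up during split generation, it may need to go into hive-site.xml instead.

```sql
-- Keep the HYBRID split strategy that triggers the OOM in this thread
set hive.exec.orc.split.strategy=HYBRID;
-- Suggested setting: hold the ORC footer cache via soft references
-- (assumption: a session-level "set" is enough for it to take effect)
set hive.orc.cache.use.soft.references=true;

-- Query from the original report
select distinct vehicle_no
from rmd.gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3
where incident_dt = '2999-01-01';
```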
~Rajesh.B

On Wed, Sep 13, 2017 at 2:54 PM, Jay <jayadeep.jayara...@gmail.com> wrote:

> Hi All,
>
> I am running a simple select query as below
>
> select distinct vehicle_no from
> rmd.gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3
> where incident_dt = '2999-01-01';
>
> The table is a 2 level partitioned table as shown below
>
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2010-01-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2011-01-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:35 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2012-01-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2013-01-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-01-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-02-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-03-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-04-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:34 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-05-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:33 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-06-01
> drwx------ - gpadmin hdfs 0 2017-09-12 14:33 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-07-01
>
>
> The ORC files have been created with a rough size of 2 GB and have ZLIB
> compression.
>
> When the hive.exec.orc.split.strategy is set to HYBRID in our HDP 2.6.1
> cluster the MAP phase is stuck in the INITIALIZATION phases and after about
> 20 minutes it fails with OOM.
>
> When I change hive.exec.orc.split.strategy to BI the SQL runs fine without
> any issues.
>
> My question is what parameter controls the memory assigned while Hive/Tez
> generates the splits?
>
> the hive container size is set to 8GB
>
> Thanks,
> Jayadeep
>