With "HYBRID", can you try setting "hive.orc.cache.use.soft.references=true"?
That should help prevent the OOM with the HYBRID strategy.
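
A minimal sketch of what I mean, assuming you run the query from a Beeline or
Hive CLI session and that your cluster allows both properties to be set per
session (if it does not, they would have to go into hive-site.xml instead).
The query is the one from your mail:

```sql
-- Keep the HYBRID split strategy, but let the ORC input format cache
-- file footers through soft references so they can be reclaimed
-- under memory pressure (assumption: settable at session level here).
set hive.exec.orc.split.strategy=HYBRID;
set hive.orc.cache.use.soft.references=true;

select distinct vehicle_no
from rmd.gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3
where incident_dt = '2999-01-01';
```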

~Rajesh.B

On Wed, Sep 13, 2017 at 2:54 PM, Jay <jayadeep.jayara...@gmail.com> wrote:

> Hi All,
>
> I am running the simple SELECT query below:
>
> select distinct vehicle_no
> from rmd.gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3
> where incident_dt = '2999-01-01';
>
> The table is partitioned on two levels, as shown below:
>
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2010-01-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2011-01-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:35 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2012-01-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2013-01-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-01-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-02-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-03-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:36 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-04-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:34 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-05-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:33 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-06-01
> drwx------   - gpadmin hdfs          0 2017-09-12 14:33 /apps/hive/warehouse/rmd.db/gets_dw_eoa_eng_rec_dtl_orc_ext_concat_final_eng3/source_type_cd=ENG3/incident_dt=2014-07-01
>
>
> The ORC files were created at roughly 2 GB each and use ZLIB
> compression.
>
> When hive.exec.orc.split.strategy is set to HYBRID on our HDP 2.6.1
> cluster, the map phase stays stuck in the INITIALIZATION phase and, after
> about 20 minutes, fails with an OOM.
>
> When I change hive.exec.orc.split.strategy to BI, the query runs without
> any issues.
>
> My question is: which parameter controls the memory available while
> Hive/Tez generates the splits?
>
> The Hive container size is set to 8 GB.
>
> Thanks,
> Jayadeep
>
