[jira] [Comment Edited] (HIVE-26383) OOM during join query

2022-07-11 Thread Pravin Sinha (Jira)


[ https://issues.apache.org/jira/browse/HIVE-26383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564958#comment-17564958 ]

Pravin Sinha edited comment on HIVE-26383 at 7/11/22 11:51 AM:
---

[~asolimando]  Haven't tried that, but will check. Earlier, on an older branch 
(branch-3.1 IIRC) which had a similar issue with all-empty tables, trimming did 
help, meaning the issue was no longer reproducible.


was (Author: pkumarsinha):
[~asolimando]  Haven't tried that, but will check. Earlier on all empty tables 
on older branch( branch-3.1 IIRC) which also had similar issue trimming did 
help.

> OOM during join query
> ---------------------
>
> Key: HIVE-26383
> URL: https://issues.apache.org/jira/browse/HIVE-26383
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Priority: Major
>
> {code:java}
> [ERROR] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[innerjoin_cal_with_insert]  Time elapsed: 100.73 s  <<< ERROR!
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>   at java.util.HashMap.newTreeNode(HashMap.java:1784)
>   at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2029)
>   at java.util.HashMap.putVal(HashMap.java:639)
>   at java.util.HashMap.put(HashMap.java:613)
>   at java.util.HashSet.add(HashSet.java:220)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(EstimateUniqueKeys.java:229)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.stats.EstimateUniqueKeys.getUniqueKeys(EstimateUniqueKeys.java:304)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.isKey(HiveRelMdRowCount.java:501)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.analyzeJoinForPKFK(HiveRelMdRowCount.java:302)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:102)
>   at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
>   at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
>   at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1882)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1756)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1233)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addFactorToTree(LoptOptimizeJoinRule.java:927)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createOrdering(LoptOptimizeJoinRule.java:728)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.findBestOrderings(LoptOptimizeJoinRule.java:459)
>   at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.onMatch(LoptOptimizeJoinRule.java:128)
>   at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2468)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2427)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyJoinOrderingTransform(CalcitePlanner.java:2193)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1750)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1605)
> {code}
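The trace shows the memory going into `HashSet.add` inside `EstimateUniqueKeys.getUniqueKeys` while `LoptOptimizeJoinRule` explores join orderings. A toy sketch of the failure mode, assuming the cause is combinatorial growth of candidate unique-key sets across chained joins (this is not Hive's actual code; the class `UniqueKeyBlowup`, the key counts, and the column-offset scheme are all made up for illustration):

{code:java}
import java.util.HashSet;
import java.util.Set;

// Illustration only: if every join input contributes k candidate unique
// keys, a join's candidates are built from the cross product of both
// sides, so chaining n tables yields on the order of k^n key sets --
// each one a freshly allocated HashSet -- before anything is pruned.
public class UniqueKeyBlowup {

    // Combine every left key with every right key; right-side columns
    // are offset so the two sides never overlap, as after a join.
    static Set<Set<Integer>> crossCombine(Set<Set<Integer>> left,
                                          Set<Set<Integer>> right,
                                          int offset) {
        Set<Set<Integer>> out = new HashSet<>();
        for (Set<Integer> l : left) {
            for (Set<Integer> r : right) {
                Set<Integer> key = new HashSet<>(l);
                for (int col : r) key.add(col + offset);
                out.add(key);
            }
        }
        return out;
    }

    // Number of candidate key sets after joining `tables` inputs,
    // each contributing `k` single-column candidate keys.
    static long countCandidateKeys(int tables, int k) {
        Set<Set<Integer>> acc = null;
        int offset = 0;
        for (int t = 0; t < tables; t++) {
            Set<Set<Integer>> input = new HashSet<>();
            for (int c = 0; c < k; c++) input.add(Set.of(c));
            acc = (acc == null) ? input : crossCombine(acc, input, offset);
            offset += k;
        }
        return acc.size();
    }

    public static void main(String[] args) {
        // 8 tables (as in the q file) with 4 candidate keys each:
        // 4^8 = 65536 HashSet instances just for the key estimate,
        // and each set grows wider at every join level.
        System.out.println(countCandidateKeys(8, 4));
    }
}
{code}

With wide Parquet tables like the ones in the reproducer, the per-key sets are large as well, so the allocation churn alone can push the JVM into "GC overhead limit exceeded" territory.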



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-26383) OOM during join query

2022-07-11 Thread Pravin Sinha (Jira)


[ https://issues.apache.org/jira/browse/HIVE-26383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564952#comment-17564952 ]

Pravin Sinha edited comment on HIVE-26383 at 7/11/22 11:33 AM:
---

The _*innerjoin_cal_with_insert.q*_ test file doesn't exist in trunk. The 
content used in the test is as follows:

 
{code:java}
set hive.mapred.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
create database db1;
create table db1.tab1
( csid bigint,
type_n string,
c_c_a_c string,
cn_i_g string,
c_a_k_v string,
i_p_m string,
igd string,
iec string,
cavv string,
ptt string,
dev string,
 vtt string,
apv string,
apv_ENR string,
 mndm string,
aamad string,
pnwp string,
ictch string,
eie string,
saie string,
shipg_addr_ctry_nm string,
 vco_flow_type string)
 stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab2
(
csid bigint,
usr_sid bigint,
hgp_CHNL_TYP_NM string,
USR_PYMT_CHNL_TYP_NM string,
dflt_iso_ctry_cd bigint,
hgp_rjyat_NM string,
hgp_nger_NM string,
hgp_nger_CD string,
PYMT_INSTRMT_pntz string,
ptmp_TYP_CD string,
ptmp_TYP_NM string,
pcoit_Enroll_TS string,
pcoit_USER_giffyui string,
tti_gra_ISO_rjyat_CD string,
tti_gra_rjyat_NM string,
ciic_rjyat_CD string,
ciic_rjyat_NM string,
ciic_nger_NM string,
ciic_nger_CD string)
 stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab3
(
csid bigint,
usr_extrnl_id string,
coreln_id string,
usr_Acct_typ_nm string,
tmpt_bejkl_pntz bigint,
tyui_bejkl_pntz bigint,
tmpt_bejkl_pntz_Intnt_tmpt_NM string,
tyui_bejkl_pntz_Intnt_tyui_NM string,
Intnt_Mrch_Ind string,
tmpt_API_KEY_VAL string,
Intnt_tmpt_NM string,
vasterrgln_TYP_NM string,
tmpt_ISO_rjyat_CD string,
tmpt_rjyat_NM string,
tmpt_nger_CD string,
tmpt_nger_NM string,
ctghi_GUID string,
ctghi_LGL_NM string,
ctghi_TRD_NM string,
ctghi_rjyat_NM string,
ctghi_RGN_NM string,
ctghi_RGN_CD string,
ctghi_rjyat_CD string,
Intnt_Mrch_Version string,
thtjl_gtslmnbprg_gst string,
tmpt_RM string,
tmpt_rjyat_CD string,
tmpt_LGL_NM string,
tmpt_oeiw string)
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab4
(
csid bigint,
pymt_instrmt_extrnl_id string,
paresStatus string,
xid string,
Erica string,
hgty_BIN string,
ISS_gst string,
ISSUER_BID string,
hgty_rjyat_NM string,
hgty_rjyat_CD int,
hgty_nger_NM string,
hgty_nger_CD string,
ISS_HOLDG_BID_gst string,
ISS_HOLDING_BID int,
ptmp_BRND_CD string,
ACCT_NUM string,
RWRD_PGM_ID string,
RWRD_PGM_NM string,
RPIN_RLLUP_CD string,
tti_gra_PSTL_CD string,
tti_gra_PRVNC_CD string,
tkarh string)
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab5
(
csid bigint,
latest_ts string,
Intnt_Month string,
CORELN_ID string,
gtlop_Flag string,
gtlop_Date string,
gtlop_ts string,
krontron_Flag string,
ERROR_giffyui string,
UPDATED_giffyui string,
Abandoned_Flag string,
TOKENIZED_giffyui string,
thtjl_3DS_giffyui string,
thtjl_CANCEL_giffyui string,
thtjl_gftdrs_giffyui string,
krontron_Date string,
krontron_ts string,
stgft_Flag string,
stgft_Date string,
stgft_ts string,
I_ltt_mhy string,
I_ltt_mhy_das string,
 C_ltt_mhy string,
C_ltt_mhy_das string,
S_ltt_mhy string,
S_ltt_mhy_das  string)
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab6
(
csid bigint,
browser_name string,
browser_version string,
browser_vendor string,
OS string,
grtprt_TYP_NM string,
dceavt  string)
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab7
(
csid bigint,
cmpgn_callr_nm string,
cmpgn_callr_typ_nm string,
cmpgn_chnl_nm string,
cmpgn_extrnl_clnt_id string,
cmpgn_flw_nm string,
cmpgn_iso_ctry_cd bigint,
cmpgn_ni_callr_nm string,
cmpgn_ni_campgn_id string,
cmpgn_ni_cmpgn_nm string,
cmpgn_ni_flw_nm string,
cmpgn_ni_chnl_nm string,
cmpgn_ni_plcmnt_id string,
cmpgn_ni_prtflo_nm string,
 cmpgn_ni_seg_nm string,
 cmpgn_ni_site_id string,
cmpgn_rdrct_url_addr string,
cmpgn_usr_agnt_nm string,
cmpgn_clnt_api_key_id string,
CRDNL_vptpspgp_ID string )
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
create table db1.tab8
(
csid bigint,
ino_src_client_name string,
 ino_srci_clnt_id string,
 ino_usr_extrnl_id_typ_nm string,
ino_src_ntwrk_nm string,
 ino_crptgm_typ string,
 ino_client_typ_nm string,
ino_legal_name string,
ino_trade_name string,
pyt_upd_tran_typ_cd string,
 pyt_upd_success_flg  string)
stored as PARQUET TBLPROPERTIES ("parquet.compress"="SNAPPY");
insert into db1.tab1 values (1, '109515', null, 'test', 'test', '2018-01-10 15:03:55.0', '2018-01-10', 109515, null, '45045501', 'id', null, 'test', 'test', '2018-01-10 15:03:55.0', '2018-01-10', 109515, null, '45045501', 'id', null, null);
insert into db1.tab2 values (1, '109515', '11', 'test', '2018-01-10 15:03:55.0', '2018-01-10', 109515, null, '45045501', 'id', null, 'test', 'test', '2018-01-10 15:03:55.0', '2018-01-10', 109515,