[ 
https://issues.apache.org/jira/browse/SPARK-25094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578127#comment-16578127
 ] 

Izek Greenfield edited comment on SPARK-25094 at 8/13/18 1:03 PM:
------------------------------------------------------------------

The code that creates this plan is very complex. 
I will try to reproduce it in simpler code. In the meantime, I have attached 
the generated code so you can see the problem: the code generator does not 
create separate functions and instead inlines the whole plan into the 
processNext method. 
[^generated_code.txt]  

The plan contains 2 DataFrames, one with 80 columns, 10 of which are built 
from `case when` expressions like this: 
CASE
  WHEN (`predefined_hc` IS NOT NULL) THEN '/Predefined_hc/'
  WHEN (`zero_volatility_adj_ind` = 'Y') THEN '/Zero_Haircuts_cases/'
  WHEN (`collateral_allocation_method` = 'FCSM') THEN '/FCSM_Collaterals/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_1Y/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_5Y/'
  WHEN (((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) AND ((`instrument_cqs_st` <= 4) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 4)))) THEN '/Debt/Central_Government_Issuer/Eligible/res_mat_G5/'
  WHEN ((`underlying_type` = 'DEBT') AND (`issuer_type` = 'CGVT')) THEN '/Debt/Central_Government_Issuer/Non_Eligible/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Debt/Other_Issuers/Eligible/res_mat_1Y/'
  WHEN ((((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Debt/Other_Issuers/Eligible/res_mat_5Y/'
  WHEN (((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) THEN '/Debt/Other_Issuers/Eligible/res_mat_G5/'
  WHEN ((`underlying_type` = 'DEBT') AND (`issuer_type` IN ('INST', 'CORP', 'PSE', 'RGLA', 'IO_LISTED'))) THEN '/Debt/Other_Issuers/Non_Eligible/'
  WHEN (((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 12.0D)) THEN '/Securitisation/Eligible/res_mat_1Y/'
  WHEN (((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) AND (`residual_maturity_instrument` <= 60.0D)) THEN '/Securitisation/Eligible/res_mat_5Y/'
  WHEN ((`underlying_type` = 'SECURITISATION') AND ((`instrument_cqs_st` <= 3) OR ((`instrument_cqs_st` = 7) AND (`instrument_cqs_lt` <= 3)))) THEN '/Securitisation/Eligible/res_mat_G5/'
  WHEN (`underlying_type` = 'SECURITISATION') THEN '/Securitisation/Non_Eligible/'
  WHEN ((`underlying_type` IN ('EQUITY', 'MAIN_INDEX_EQUITY', 'COMMODITY', 'NON_ELIGIBLE_SECURITY')) AND (`underlying_type` = 'MAIN_INDEX_EQUITY')) THEN '/Other_Securities/Main_index/'
  WHEN (`underlying_type` IN ('EQUITY', 'MAIN_INDEX_EQUITY', 'COMMODITY', 'NON_ELIGIBLE_SECURITY')) THEN '/Other_Securities/Others/'
  WHEN (`underlying_type` = 'CASH') THEN '/Cash/'
  WHEN (`underlying_type` = 'GOLD') THEN '/Gold/'
  WHEN (`underlying_type` = 'CIU') THEN '/CIU/'
  WHEN true THEN '/Others/'
END AS `108_0___Portfolio_CRD4_Art_224_Volatility_Adjustments_Codes____path_CRD4_Art_224_Volatil`
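
For anyone who wants to attempt a smaller reproduction, here is a rough sketch 
(not the reporter's code) that builds a select list of a similar shape: a 
number of plain columns plus several long `case when` expressions. The helper 
names, the `col_N` column names, the thresholds and the `/bucket_N/` labels 
are all made up for illustration, and whether a plan built this way actually 
hits the 64 KB limit is an assumption that would need to be checked.

```python
# Hypothetical helpers for building a wide select list with long CASE WHEN
# expressions. Column names, thresholds and labels are invented for this sketch.

def build_case_when(column, n_branches, alias):
    """Return a SQL CASE expression with n_branches WHEN clauses on one column."""
    branches = [
        "WHEN (`{col}` <= {limit}) THEN '/bucket_{i}/'".format(
            col=column, limit=i * 10, i=i
        )
        for i in range(1, n_branches + 1)
    ]
    return "CASE {} ELSE '/Others/' END AS `{}`".format(" ".join(branches), alias)


def build_select_list(n_plain, n_case, n_branches):
    """Return n_plain plain column references followed by n_case CASE columns."""
    plain = ["`col_{}`".format(i) for i in range(n_plain)]
    derived = [
        build_case_when("col_{}".format(i % n_plain), n_branches, "path_{}".format(i))
        for i in range(n_case)
    ]
    return plain + derived


if __name__ == "__main__":
    exprs = build_select_list(n_plain=70, n_case=10, n_branches=20)
    print(len(exprs), "expressions; first derived one has", len(exprs[70]), "characters")
    # Given a DataFrame `df` with columns col_0..col_69 (an assumption), the
    # list could be passed to df.selectExpr(*exprs) to approximate the plan shape.
```

The expressions are plain strings, so the same list can be pasted into a SQL 
query or handed to `selectExpr` without any other dependency. 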



> processNext() failed to compile size is over 64kb
> -------------------------------------------------
>
>                 Key: SPARK-25094
>                 URL: https://issues.apache.org/jira/browse/SPARK-25094
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Izek Greenfield
>            Priority: Major
>         Attachments: generated_code.txt
>
>
> I have this tree:
> 2018-08-12T07:14:31,289 WARN  [] org.apache.spark.sql.execution.WholeStageCodegenExec - Whole-stage codegen disabled for plan (id=1):
>  *(1) Project [, ... 10 more fields]
> +- *(1) Filter NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)
>    +- InMemoryTableScan [, ... 11 more fields], [NOT exposure_calc_method#10141 IN (UNSETTLED_TRANSACTIONS,FREE_DELIVERIES)]
>          +- InMemoryRelation [, ... 80 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                +- *(5) SortMergeJoin [unique_id#8506], [unique_id#8722], Inner
>                   :- *(2) Sort [unique_id#8506 ASC NULLS FIRST], false, 0
>                   :  +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                   :     +- *(1) Project [, ... 6 more fields]
>                   :        +- *(1) Filter (((((isnotnull(v#49) && isnotnull(run_id#52)) && (asof_date#48 <=> 17531)) && (run_id#52 = DATA_REG)) && (v#49 = DATA_REG)) && isnotnull(unique_id#39))
>                   :           +- InMemoryTableScan [, ... 6 more fields], [, ... 6 more fields]
>                   :                 +- InMemoryRelation [, ... 6 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                   :                       +- *(1) FileScan csv [,... 6 more fields] , ... 6 more fields
>                   +- *(4) Sort [unique_id#8722 ASC NULLS FIRST], false, 0
>                      +- Exchange(coordinator id: 1456511137) UnknownPartitioning(9), coordinator[target post-shuffle partition size: 67108864]
>                         +- *(3) Project [, ... 74 more fields]
>                            +- *(3) Filter (((isnotnull(v#51) && (asof_date#42 <=> 17531)) && (v#51 = DATA_REG)) && isnotnull(unique_id#54))
>                               +- InMemoryTableScan [, ... 74 more fields], [, ... 4 more fields]
>                                     +- InMemoryRelation [, ... 74 more fields], StorageLevel(memory, deserialized, 1 replicas)
>                                           +- *(1) FileScan csv [,... 74 more fields] , ... 6 more fields
> Compiling "GeneratedClass": Code of method "processNext()V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1" grows beyond 64 KB
> and the generated code failed to compile.


