[ 
https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960001#comment-15960001
 ] 

Liang-Chi Hsieh commented on SPARK-20226:
-----------------------------------------

{{spark.sql.constraintPropagation.enabled}} is a SQL config flag. I am not sure 
whether your local.conf reaches the SQL configuration, or only covers Spark 
settings passed through SparkConf. Can you set this flag explicitly in your 
application through SQLConf?
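
As a rough sketch of what setting the flag explicitly could look like, 
assuming a Spark build that includes this flag and an application that 
creates its own SparkSession (the object name, app name and local master 
below are placeholders, not taken from the reported application):

{code}
import org.apache.spark.sql.SparkSession

object DisableConstraintPropagation {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; replace with however the application
    // actually builds its session.
    val spark = SparkSession.builder()
      .appName("constraint-propagation-check")
      .master("local[*]")
      // SQL flags can be passed when the session is built ...
      .config("spark.sql.constraintPropagation.enabled", "false")
      .getOrCreate()

    // ... or set at runtime on the session's SQL configuration.
    spark.conf.set("spark.sql.constraintPropagation.enabled", "false")

    // Read the value back to confirm the session actually sees it.
    println(spark.conf.get("spark.sql.constraintPropagation.enabled"))

    spark.stop()
  }
}
{code}

With the older SQLContext entry point used in the report, the equivalent 
call would presumably be 
sqlContext.setConf("spark.sql.constraintPropagation.enabled", "false"), 
issued before cacheTable is called.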

> Call to sqlContext.cacheTable takes an incredibly long time in some cases
> -------------------------------------------------------------------------
>
>                 Key: SPARK-20226
>                 URL: https://issues.apache.org/jira/browse/SPARK-20226
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>         Environment: linux or windows
>            Reporter: Barry Becker
>              Labels: cache
>         Attachments: profile_indexer2.PNG, xyzzy.csv
>
>
> I have a case where the call to sqlContext.cacheTable can take an arbitrarily 
> long time depending on the number of columns that are referenced in a 
> withColumn expression applied to a dataframe.
> The dataset is small (20 columns 7861 rows). The sequence to reproduce is the 
> following:
> 1) Add a new column whose expression references 8 to 14 of the columns in 
> the dataset.
>    - If it references 8 columns, the call to cacheTable is fast, about *5 
> seconds*.
>    - If it references 11 columns, it is slow, about *60 seconds*.
>    - If it references 14 columns, it basically *takes forever*. I gave up 
> after 10 minutes or so.
>       The Column expression that is added just concatenates the columns 
> into a single string. If a number is concatenated with a string (or vice 
> versa), the number is first converted to a string.
>       The expression looks something like this:
> {code}
> `Plate` + `State` + `License Type` + `Summons Number` + `Issue Date` + 
> `Violation Time` + `Violation` + `Judgment Entry Date` + `Fine Amount` + 
> `Penalty Amount` + `Interest Amount`
> {code}
>         which we then convert to a Column expression that looks like this:
> {code}
> UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF('Plate, 'State), 'License Type), 
> UDF('Summons Number)), UDF('Issue Date)), 'Violation Time), 'Violation), 
> UDF('Judgment Entry Date)), UDF('Fine Amount)), UDF('Penalty Amount)), 
> UDF('Interest Amount))
> {code}
>        where the UDFs are very simple functions that basically call toString 
> and + as needed.
> 2) Apply a previously saved pipeline that includes some transformers. 
> Here are the steps of the pipeline (extracted from parquet):
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333200603,"sparkVersion":"2.1.0","uid":"strIdx_aeb04d2777cc","paramMap":{"handleInvalid":"skip","outputCol":"State_IDX__","inputCol":"State_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333200837,"sparkVersion":"2.1.0","uid":"strIdx_0164c4c13979","paramMap":{"inputCol":"License Type_CLEANED__","handleInvalid":"skip","outputCol":"License Type_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201068,"sparkVersion":"2.1.0","uid":"strIdx_25b6cbd02751","paramMap":{"inputCol":"Violation_CLEANED__","handleInvalid":"skip","outputCol":"Violation_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201282,"sparkVersion":"2.1.0","uid":"strIdx_aa12df0354d9","paramMap":{"handleInvalid":"skip","inputCol":"County_CLEANED__","outputCol":"County_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201552,"sparkVersion":"2.1.0","uid":"strIdx_babb120f3cc1","paramMap":{"handleInvalid":"skip","outputCol":"Issuing Agency_IDX__","inputCol":"Issuing Agency_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201759,"sparkVersion":"2.1.0","uid":"strIdx_5f2de9d9542d","paramMap":{"handleInvalid":"skip","outputCol":"Violation Status_IDX__","inputCol":"Violation Status_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333201987,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_6f65ca9fa813",
>       "paramMap":{
>         "outputCol":"Summons Number_BINNED__","handleInvalid":"keep","splits":["-Inf",1.386630656E9,3.696078592E9,4.005258752E9,6.045063168E9,8.136507392E9,"Inf"],"inputCol":"Summons Number_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202079,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_f5db4fb8120e",
>     "paramMap":{
>          "splits":["-Inf",1.435215616E9,1.443855616E9,1.447271936E9,1.448222464E9,1.448395264E9,1.448481536E9,1.448827136E9,1.449259264E9,1.449432064E9,1.449518336E9,"Inf"],
>           "handleInvalid":"keep","outputCol":"Issue Date_BINNED__","inputCol":"Issue Date_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202172,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_74568a2a5cfd",
>       "paramMap":{
>         "handleInvalid":"keep","outputCol":"Fine Amount_BINNED__","inputCol":"Fine Amount_CLEANED__","splits":["-Inf",47.5,57.5,62.5,105.0,"Inf"]
>        }
>       }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202269,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_109705dfdbcd",
>       "paramMap":{"splits":["-Inf",0.004999999888241291,"Inf"],"outputCol":"Interest Amount_BINNED__","handleInvalid":"keep","inputCol":"Interest Amount_CLEANED__"}
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202362,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_2b2e3d8a324f",
>       "paramMap":{
>          "handleInvalid":"keep","inputCol":"Reduction Amount_CLEANED__","outputCol":"Reduction Amount_BINNED__",
>          "splits":["-Inf",5.994999885559082,24.0,41.0,57.5,120.0,"Inf"]
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202485,"sparkVersion":"2.1.0",
>      "uid":"bucketizer_4d44c2ebf489",
>      "paramMap":{
>        "splits":["-Inf",18.75,42.5,52.5,57.5,70.0050048828125,75.96499633789062,100.58499908447266,115.4949951171875,125.02000427246094,"Inf"],"handleInvalid":"keep",
>          "outputCol":"Payment Amount_BINNED__","inputCol":"Payment Amount_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202587,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_05a75eeef997",
>       "paramMap":{
>          "handleInvalid":"keep",
>          "splits":["-Inf",32.904998779296875,55.12000274658203,72.5,91.69999694824219,116.05500030517578,125.02999877929688,"Inf"],
>          "outputCol":"Amount Due_BINNED__","inputCol":"Amount Due_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202678,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_64b3ef2f97cf",
>       "paramMap":{"outputCol":"Precinct_BINNED__","handleInvalid":"keep","inputCol":"Precinct_CLEANED__","splits":["-Inf",0.5,23.5,"Inf"]}
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.VectorAssembler","timestamp":1491333202774,"sparkVersion":"2.1.0",
>     "uid":"vecAssembler_932758a8f18e",
>       "paramMap":{
>         "outputCol":"_features_column__",
>         "inputCols":["State_IDX__","License Type_IDX__","Violation_IDX__","County_IDX__","Issuing Agency_IDX__","Violation Status_IDX__","Summons Number_BINNED__","Issue Date_BINNED__","Fine Amount_BINNED__","Interest Amount_BINNED__","Reduction Amount_BINNED__","Payment Amount_BINNED__","Amount Due_BINNED__","Precinct_BINNED__"]
>       }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.classification.NaiveBayesModel","timestamp":1491333202874,"sparkVersion":"2.1.0",
>     "uid":"nb_e4b24f3c08b0",
>       "paramMap":{
>         "probabilityCol":"_class_probability_column__",
>         "labelCol":"Penalty Amount_BINNED__",
>         "predictionCol":"_prediction_column_",
>         "modelType":"multinomial",
>         "featuresCol":"_features_column__",
>         "rawPredictionCol":"rawPrediction",
>         "smoothing":3.518236190922951E-4
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.SQLTransformer","timestamp":1491333203106,"sparkVersion":"2.1.0",
>     "uid":"sql_1ea4c1b5c52e",
>       "paramMap":{"statement":"SELECT *, CAST(_prediction_column_ AS INT) AS `_*_prediction_label_column_*__` FROM __THIS__ /*cutInfo:[10.0,25.0]*/"}
>    }{code}
>    3) Call cacheTable on sqlContext. The actual code used is:
>    {code}
>     val key = "foo"
>     if (sqlContext.tableNames.contains(key))
>       sqlContext.dropTempTable(key)
>     df.createOrReplaceTempView(key)
>     sqlContext.cacheTable(key)  // <-- this takes a very long time
> {code}
> When I step through cacheTable in the debugger (in CacheManager.cacheQuery), 
> I see that the query "planToCache" is very large (see below). 
> I don't know much about query plans. Is this sort of giant nested query plan 
> expected in this case? Is it in any way typical? Does it explain why it takes 
> a very long time to cache? Why would adding just a few more columns to the 
> add column expression result in a plan that takes exponentially longer?
> {code}
> SubqueryAlias foo123, `foo123`
> +- Project [Plate#123, State#124, License Type#125, Summons Number#126, Issue 
> Date#127, Violation Time#128, Violation#129, Judgment Entry Date#130, Fine 
> Amount#131, Penalty Amount#132, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, County#138, Issuing 
> Agency#139, Violation Status#140, columnBasedOnManyCols#141, Penalty Amount 
> (predicted)#2363]
>    +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 33 more fields]
>       +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 33 more fields]
>          +- SubqueryAlias sql_1ea4c1b5c52e_5640c7097aca, 
> `sql_1ea4c1b5c52e_5640c7097aca`
>             +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 32 more fields]
>                +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 31 more fields]
>                   +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 30 more fields]
>                      +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 29 more fields]
>                         +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 28 more fields]
>                            +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 27 more fields]
>                               +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 26 more fields]
>                                  +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 25 more fields]
>                                     +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 24 more fields]
>                                        +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 23 more fields]
>                                           +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 22 more fields]
>                                              +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 21 more fields]
>                                                 +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 20 more fields]
>                                                    +- Filter UDF(Violation 
> Status_CLEANED__#174)
>                                                       +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 19 more fields]
>                                                          +- Filter 
> UDF(Issuing Agency_CLEANED__#173)
>                                                             +- Project 
> [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, License 
> Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons 
> Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, 
> Violation Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, 
> Judgment Entry Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, 
> Fine Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 18 more fields]
>                                                                +- Filter 
> UDF(County_CLEANED__#172)
>                                                                   +- Project 
> [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, License 
> Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons 
> Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, 
> Violation Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, 
> Judgment Entry Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, 
> Fine Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 17 more fields]
>                                                                      +- 
> Filter UDF(Violation_CLEANED__#167)
>                                                                         +- 
> Project [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, 
> License Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, 
> Summons Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, Judgment Entry 
> Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 16 more fields]
>                                                                            +- 
> Filter UDF(License Type_CLEANED__#164)
>                                                                               
> +- Project [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, 
> License Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, 
> Summons Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, Judgment Entry 
> Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 15 more fields]
>                                                                               
>    +- Filter UDF(State_CLEANED__#163)
>                                                                               
>       +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, CASE WHEN 
> isnull(Summons Number#126) THEN NaN ELSE Summons Number#126 END AS Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, CASE WHEN isnull(Interest 
> Amount#133) THEN NaN ELSE Interest Amount#133 END AS Interest 
> Amount_CLEANED__#250, Interest Amount#133, CASE WHEN isnull(Reduction 
> Amount#134) THEN NaN ELSE Reduction Amount#134 END AS Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 14 more fields]
>                                                                               
>          +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number#126, Issue Date#127, CASE WHEN isnull(Issue Date_CLEANED__#165) THEN 
> NaN ELSE Issue Date_CLEANED__#165 END AS Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, CASE WHEN isnull(Judgment 
> Entry Date_CLEANED__#168) THEN NaN ELSE Judgment Entry Date_CLEANED__#168 END 
> AS Judgment Entry Date_CLEANED__#211, Fine Amount#131, CASE WHEN isnull(Fine 
> Amount_CLEANED__#169) THEN NaN ELSE Fine Amount_CLEANED__#169 END AS Fine 
> Amount_CLEANED__#212, Penalty Amount#132, CASE WHEN isnull(Penalty 
> Amount_CLEANED__#170) THEN NaN ELSE Penalty Amount_CLEANED__#170 END AS 
> Penalty Amount_CLEANED__#213, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, ... 9 more fields]
>                                                                               
>             +- Project [Plate#123, UDF(Plate#123) AS Plate_CLEANED__#162, 
> State#124, UDF(State#124) AS State_CLEANED__#163, License Type#125, 
> UDF(License Type#125) AS License Type_CLEANED__#164, Summons Number#126, 
> Issue Date#127, cast(Issue Date#127 as double) AS Issue Date_CLEANED__#165, 
> Violation Time#128, UDF(Violation Time#128) AS Violation Time_CLEANED__#166, 
> Violation#129, UDF(Violation#129) AS Violation_CLEANED__#167, Judgment Entry 
> Date#130, cast(Judgment Entry Date#130 as double) AS Judgment Entry 
> Date_CLEANED__#168, Fine Amount#131, cast(Fine Amount#131 as double) AS Fine 
> Amount_CLEANED__#169, Penalty Amount#132, cast(Penalty Amount#132 as double) 
> AS Penalty Amount_CLEANED__#170, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, ... 9 more fields]
>                                                                               
>                +- Project [Plate#6 AS Plate#123, State#7 AS State#124, 
> License Type#8 AS License Type#125, Summons Number#9 AS Summons Number#126, 
> Issue Date#10 AS Issue Date#127, Violation Time#11 AS Violation Time#128, 
> Violation#12 AS Violation#129, Judgment Entry Date#13 AS Judgment Entry 
> Date#130, Fine Amount#14 AS Fine Amount#131, Penalty Amount#15 AS Penalty 
> Amount#132, Interest Amount#16 AS Interest Amount#133, Reduction Amount#17 AS 
> Reduction Amount#134, Payment Amount#18 AS Payment Amount#135, Amount Due#19 
> AS Amount Due#136, Precinct#20 AS Precinct#137, County#21 AS County#138, 
> Issuing Agency#22 AS Issuing Agency#139, Violation Status#23 AS Violation 
> Status#140, columnBasedOnManyCols#43 AS columnBasedOnManyCols#141]
>                                                                               
>                   +- Project [Plate#6, State#7, License Type#8, Summons 
> Number#9, Issue Date#10, Violation Time#11, Violation#12, Judgment Entry 
> Date#13, Fine Amount#14, Penalty Amount#15, Interest Amount#16, Reduction 
> Amount#17, Payment Amount#18, Amount Due#19, Precinct#20, County#21, Issuing 
> Agency#22, Violation Status#23, 
> cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), License 
> Type#8), UDF(Summons Number#9)), UDF(Issue Date#10)), Violation Time#11), 
> Violation#12), UDF(Judgment Entry Date#13)), UDF(Fine Amount#14)), 
> UDF(Penalty Amount#15)), UDF(Interest Amount#16)) as string) AS 
> columnBasedOnManyCols#43]
>                                                                               
>                      +- Relation[Plate#6,State#7,License Type#8,Summons 
> Number#9,Issue Date#10,Violation Time#11,Violation#12,Judgment Entry 
> Date#13,Fine Amount#14,Penalty Amount#15,Interest Amount#16,Reduction 
> Amount#17,Payment Amount#18,Amount Due#19,Precinct#20,County#21,Issuing 
> Agency#22,Violation Status#23] csv
> {code}        



