[ 
https://issues.apache.org/jira/browse/SPARK-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959024#comment-15959024
 ] 

Barry Becker edited comment on SPARK-20226 at 4/6/17 2:45 PM:
--------------------------------------------------------------

I set spark.sql.constraintPropagation.enabled to false in the job-server 
local.conf and tried again.
It did not help; the run still took about 2 minutes. Oddly, setting it to true 
seemed to make it worse.
I did find something that works, though. If I simply call cache() on the 
dataframe after adding the column (right after step 1 above), it runs very 
quickly. The time spent in cacheTable goes from 60 seconds to 0.5 seconds. I 
don't understand why, though.
I thought calling cache would only help if there was branching, but the 
pipeline is linear, isn't it?
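
A rough way to picture why cache() might matter even without branching. This is 
a toy sketch in plain Python, not Spark internals and not a confirmed diagnosis 
of this issue. The assumption is that stacked projections get merged into one 
expression, and that each pipeline step mentions the previous column twice (as 
the CASE WHEN isnull(x) ... ELSE x steps in the plan do). The names expr_size, 
inline, collapsed_size, build_layers, "added" and "stepN" are all made up for 
the illustration.

```python
# Toy model in plain Python; it does not use or imitate any Spark API.
# Expressions are nested tuples ("op", child, ...) with column names as leaves.

def expr_size(expr):
    """Count the nodes in an expression tree."""
    if isinstance(expr, str):
        return 1
    return 1 + sum(expr_size(child) for child in expr[1:])


def inline(expr, definitions):
    """Substitute each known column name with the expression that defines it."""
    if isinstance(expr, str):
        return definitions.get(expr, expr)
    return (expr[0],) + tuple(inline(child, definitions) for child in expr[1:])


def collapsed_size(layers, cached=frozenset()):
    """Inline a stack of (name, expr) projections and size the last column.

    Names listed in `cached` stay leaves, as if they were already materialized.
    """
    definitions = {}
    full = None
    for name, expr in layers:
        full = inline(expr, definitions)
        if name not in cached:
            definitions[name] = full
    return expr_size(full)


def build_layers(udf_depth=10, pipeline_steps=8):
    """Hypothetical stand-in for 'add a column, then run a linear pipeline'."""
    added = "Plate"
    for _ in range(udf_depth):
        added = ("udf", added, "other")  # nested UDF calls, as in step 1
    layers = [("added", added)]
    prev = "added"
    for i in range(pipeline_steps):
        name = f"step{i}"
        # Each step mentions the previous column twice.
        layers.append((name, ("case_when", ("isnull", prev), prev)))
        prev = name
    return layers


layers = build_layers()
size_without_cache = collapsed_size(layers)
size_with_cache = collapsed_size(layers, cached={"added"})
print(size_without_cache, size_with_cache)  # prints: 5886 766
```

In this model the pipeline is strictly linear, yet the merged expression still 
doubles with every step, and everything upstream of the cached column drops out 
of the count. Whether that is what actually happens in this job is a guess.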

Here is what the query plan looks like in the call to cache the dataframe 
before transforming with the pipeline.
{code}
Project [Plate#6, State#7, License Type#8, Summons Number#9, Issue Date#10, 
Violation Time#11, Violation#12, Judgment Entry Date#13, Fine Amount#14, 
Penalty Amount#15, Interest Amount#16, Reduction Amount#17, Payment Amount#18, 
Amount Due#19, Precinct#20, County#21, Issuing Agency#22, Violation Status#23, 
cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), License 
Type#8), Violation Time#11), Violation#12), UDF(Judgment Entry Date#13)), 
UDF(Issue Date#10)), UDF(Summons Number#9)), UDF(Fine Amount#14)), UDF(Penalty 
Amount#15)), UDF(Interest Amount#16)) as string) AS columnBasedOnManyCols#43]
+- Relation[Plate#6,State#7,License Type#8,Summons Number#9,Issue 
Date#10,Violation Time#11,Violation#12,Judgment Entry Date#13,Fine 
Amount#14,Penalty Amount#15,Interest Amount#16,Reduction Amount#17,Payment 
Amount#18,Amount Due#19,Precinct#20,County#21,Issuing Agency#22,Violation 
Status#23] csv
{code}
Here is how the query plan now looks in the call to cacheTable after 
transforming with the pipeline. It looks fairly similar to what it was before, 
but now it's fast.
{code}
SubqueryAlias foo123, `foo123`
+- Project [Plate#236, State#237, License Type#238, Summons Number#239, Issue 
Date#240, Violation Time#241, Violation#242, Judgment Entry Date#243, Fine 
Amount#244, Penalty Amount#245, Interest Amount#246, Reduction Amount#247, 
Payment Amount#248, Amount Due#249, Precinct#250, County#251, Issuing 
Agency#252, Violation Status#253, columnBasedOnManyCols#254, Penalty Amount 
(predicted)#2476]
   +- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 33 more fields]
      +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
33 more fields]
         +- SubqueryAlias sql_1ea4c1b5c52e_cd062499a688, 
`sql_1ea4c1b5c52e_cd062499a688`
            +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
32 more fields]
               +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
31 more fields]
                  +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
30 more fields]
                     +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
29 more fields]
                        +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
28 more fields]
                           +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
27 more fields]
                              +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
26 more fields]
                                 +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
25 more fields]
                                    +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
24 more fields]
                                       +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 23 more fields]
                                          +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 22 more fields]
                                             +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 21 more fields]
                                                +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 20 more fields]
                                                   +- Filter UDF(Violation 
Status_CLEANED__#287)
                                                      +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 19 more fields]
                                                         +- Filter UDF(Issuing 
Agency_CLEANED__#286)
                                                            +- Project 
[Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, License 
Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons 
Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation Time#241, 
Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment 
Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 18 more fields]
                                                               +- Filter 
UDF(County_CLEANED__#285)
                                                                  +- Project 
[Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, License 
Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons 
Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation Time#241, 
Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment 
Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 17 more fields]
                                                                     +- Filter 
UDF(Violation_CLEANED__#280)
                                                                        +- 
Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 16 more fields]
                                                                           +- 
Filter UDF(License Type_CLEANED__#277)
                                                                              
+- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 15 more fields]
                                                                                
 +- Filter UDF(State_CLEANED__#276)
                                                                                
    +- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, CASE WHEN isnull(Summons 
Number#239) THEN NaN ELSE Summons Number#239 END AS Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, CASE WHEN isnull(Interest Amount#246) 
THEN NaN ELSE Interest Amount#246 END AS Interest Amount_CLEANED__#363, 
Interest Amount#246, CASE WHEN isnull(Reduction Amount#247) THEN NaN ELSE 
Reduction Amount#247 END AS Reduction Amount_CLEANED__#364, Reduction 
Amount#247, ... 14 more fields]
                                                                                
       +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number#239, Issue Date#240, CASE WHEN isnull(Issue Date_CLEANED__#278) THEN NaN 
ELSE Issue Date_CLEANED__#278 END AS Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, CASE WHEN isnull(Judgment Entry Date_CLEANED__#281) 
THEN NaN ELSE Judgment Entry Date_CLEANED__#281 END AS Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, CASE WHEN isnull(Fine 
Amount_CLEANED__#282) THEN NaN ELSE Fine Amount_CLEANED__#282 END AS Fine 
Amount_CLEANED__#325, Penalty Amount#245, CASE WHEN isnull(Penalty 
Amount_CLEANED__#283) THEN NaN ELSE Penalty Amount_CLEANED__#283 END AS Penalty 
Amount_CLEANED__#326, Interest Amount#246, Reduction Amount#247, Payment 
Amount#248, Amount Due#249, Precinct#250, ... 9 more fields]
                                                                                
          +- Project [Plate#236, UDF(Plate#236) AS Plate_CLEANED__#275, 
State#237, UDF(State#237) AS State_CLEANED__#276, License Type#238, UDF(License 
Type#238) AS License Type_CLEANED__#277, Summons Number#239, Issue Date#240, 
cast(Issue Date#240 as double) AS Issue Date_CLEANED__#278, Violation Time#241, 
UDF(Violation Time#241) AS Violation Time_CLEANED__#279, Violation#242, 
UDF(Violation#242) AS Violation_CLEANED__#280, Judgment Entry Date#243, 
cast(Judgment Entry Date#243 as double) AS Judgment Entry Date_CLEANED__#281, 
Fine Amount#244, cast(Fine Amount#244 as double) AS Fine Amount_CLEANED__#282, 
Penalty Amount#245, cast(Penalty Amount#245 as double) AS Penalty 
Amount_CLEANED__#283, Interest Amount#246, Reduction Amount#247, Payment 
Amount#248, Amount Due#249, Precinct#250, ... 9 more fields]
                                                                                
             +- Project [Plate#6 AS Plate#236, State#7 AS State#237, License 
Type#8 AS License Type#238, Summons Number#9 AS Summons Number#239, Issue 
Date#10 AS Issue Date#240, Violation Time#11 AS Violation Time#241, 
Violation#12 AS Violation#242, Judgment Entry Date#13 AS Judgment Entry 
Date#243, Fine Amount#14 AS Fine Amount#244, Penalty Amount#15 AS Penalty 
Amount#245, Interest Amount#16 AS Interest Amount#246, Reduction Amount#17 AS 
Reduction Amount#247, Payment Amount#18 AS Payment Amount#248, Amount Due#19 AS 
Amount Due#249, Precinct#20 AS Precinct#250, County#21 AS County#251, Issuing 
Agency#22 AS Issuing Agency#252, Violation Status#23 AS Violation Status#253, 
columnBasedOnManyCols#43 AS columnBasedOnManyCols#254]
                                                                                
                +- Project [Plate#6, State#7, License Type#8, Summons Number#9, 
Issue Date#10, Violation Time#11, Violation#12, Judgment Entry Date#13, Fine 
Amount#14, Penalty Amount#15, Interest Amount#16, Reduction Amount#17, Payment 
Amount#18, Amount Due#19, Precinct#20, County#21, Issuing Agency#22, Violation 
Status#23, cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), 
License Type#8), Violation Time#11), Violation#12), UDF(Judgment Entry 
Date#13)), UDF(Issue Date#10)), UDF(Summons Number#9)), UDF(Fine Amount#14)), 
UDF(Penalty Amount#15)), UDF(Interest Amount#16)) as string) AS 
columnBasedOnManyCols#43]
                                                                                
                   +- Relation[Plate#6,State#7,License Type#8,Summons 
Number#9,Issue Date#10,Violation Time#11,Violation#12,Judgment Entry 
Date#13,Fine Amount#14,Penalty Amount#15,Interest Amount#16,Reduction 
Amount#17,Payment Amount#18,Amount Due#19,Precinct#20,County#21,Issuing 
Agency#22,Violation Status#23] csv
{code}

Maybe this could be marked resolved. I'm not sure whether something is wrong 
here or whether I just don't understand how Spark caching works. 


was (Author: barrybecker4):
I set spark.sql.constraintPropagation.enabled to false in job-server local.conf 
and tried again.
It did not help. It still took about 2 minutes. Oddly, setting it to true 
seemed to make it worse.
I did find something that did work though. If I simply call cache() on the 
dataframe after the add column (right after step 1 above)
then it runs very quickly. The time spent in cacheTable goes from 60 seconds to 
0.5 seconds. I don't understand why though.
I thought calling cache would only help of there was branching, but the 
pipeline is linear isn't it?

Here is what the query plan looks like in the call to cache the dataframe 
before transforming with the pipeline.
{code}
Project [Plate#6, State#7, License Type#8, Summons Number#9, Issue Date#10, 
Violation Time#11, Violation#12, Judgment Entry Date#13, Fine Amount#14, 
Penalty Amount#15, Interest Amount#16, Reduction Amount#17, Payment Amount#18, 
Amount Due#19, Precinct#20, County#21, Issuing Agency#22, Violation Status#23, 
cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), License 
Type#8), Violation Time#11), Violation#12), UDF(Judgment Entry Date#13)), 
UDF(Issue Date#10)), UDF(Summons Number#9)), UDF(Fine Amount#14)), UDF(Penalty 
Amount#15)), UDF(Interest Amount#16)) as string) AS columnBasedOnManyCols#43]
+- Relation[Plate#6,State#7,License Type#8,Summons Number#9,Issue 
Date#10,Violation Time#11,Violation#12,Judgment Entry Date#13,Fine 
Amount#14,Penalty Amount#15,Interest Amount#16,Reduction Amount#17,Payment 
Amount#18,Amount Due#19,Precinct#20,County#21,Issuing Agency#22,Violation 
Status#23] csv
{code}
Here is how the query plan now looks in the call to cacheTable after 
transforming with the pipeline. Looks fairly similar to what it was before, but 
now its fast.
{code}
SubqueryAlias foo123, `foo123`
+- Project [Plate#236, State#237, License Type#238, Summons Number#239, Issue 
Date#240, Violation Time#241, Violation#242, Judgment Entry Date#243, Fine 
Amount#244, Penalty Amount#245, Interest Amount#246, Reduction Amount#247, 
Payment Amount#248, Amount Due#249, Precinct#250, County#251, Issuing 
Agency#252, Violation Status#253, columnBasedOnManyCols#254, Penalty Amount 
(predicted)#2476]
   +- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 33 more fields]
      +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
33 more fields]
         +- SubqueryAlias sql_1ea4c1b5c52e_cd062499a688, 
`sql_1ea4c1b5c52e_cd062499a688`
            +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
32 more fields]
               +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
31 more fields]
                  +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
30 more fields]
                     +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
29 more fields]
                        +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
28 more fields]
                           +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
27 more fields]
                              +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
26 more fields]
                                 +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
25 more fields]
                                    +- Project [Plate#236, Plate_CLEANED__#275, 
State#237, State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, 
Summons Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, Interest Amount_CLEANED__#363, 
Interest Amount#246, Reduction Amount_CLEANED__#364, Reduction Amount#247, ... 
24 more fields]
                                       +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 23 more fields]
                                          +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 22 more fields]
                                             +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 21 more fields]
                                                +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 20 more fields]
                                                   +- Filter UDF(Violation 
Status_CLEANED__#287)
                                                      +- Project [Plate#236, 
Plate_CLEANED__#275, State#237, State_CLEANED__#276, License Type#238, License 
Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons Number#239, Issue 
Date#240, Issue Date_CLEANED__#323, Violation Time#241, Violation 
Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment Entry 
Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 19 more fields]
                                                         +- Filter UDF(Issuing 
Agency_CLEANED__#286)
                                                            +- Project 
[Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, License 
Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons 
Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation Time#241, 
Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment 
Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 18 more fields]
                                                               +- Filter 
UDF(County_CLEANED__#285)
                                                                  +- Project 
[Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, License 
Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, Summons 
Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation Time#241, 
Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, Judgment 
Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, Fine 
Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 17 more fields]
                                                                     +- Filter 
UDF(Violation_CLEANED__#280)
                                                                        +- 
Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 16 more fields]
                                                                           +- 
Filter UDF(License Type_CLEANED__#277)
                                                                              
+- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, Summons Number_CLEANED__#362, 
Summons Number#239, Issue Date#240, Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, Judgment Entry Date_CLEANED__#324, Fine Amount#244, 
Fine Amount_CLEANED__#325, Penalty Amount#245, Penalty Amount_CLEANED__#326, 
Interest Amount_CLEANED__#363, Interest Amount#246, Reduction 
Amount_CLEANED__#364, Reduction Amount#247, ... 15 more fields]
                                                                                
 +- Filter UDF(State_CLEANED__#276)
                                                                                
    +- Project [Plate#236, Plate_CLEANED__#275, State#237, State_CLEANED__#276, 
License Type#238, License Type_CLEANED__#277, CASE WHEN isnull(Summons 
Number#239) THEN NaN ELSE Summons Number#239 END AS Summons 
Number_CLEANED__#362, Summons Number#239, Issue Date#240, Issue 
Date_CLEANED__#323, Violation Time#241, Violation Time_CLEANED__#279, 
Violation#242, Violation_CLEANED__#280, Judgment Entry Date#243, Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, Fine Amount_CLEANED__#325, Penalty 
Amount#245, Penalty Amount_CLEANED__#326, CASE WHEN isnull(Interest Amount#246) 
THEN NaN ELSE Interest Amount#246 END AS Interest Amount_CLEANED__#363, 
Interest Amount#246, CASE WHEN isnull(Reduction Amount#247) THEN NaN ELSE 
Reduction Amount#247 END AS Reduction Amount_CLEANED__#364, Reduction 
Amount#247, ... 14 more fields]
                                                                                
       +- Project [Plate#236, Plate_CLEANED__#275, State#237, 
State_CLEANED__#276, License Type#238, License Type_CLEANED__#277, Summons 
Number#239, Issue Date#240, CASE WHEN isnull(Issue Date_CLEANED__#278) THEN NaN 
ELSE Issue Date_CLEANED__#278 END AS Issue Date_CLEANED__#323, Violation 
Time#241, Violation Time_CLEANED__#279, Violation#242, Violation_CLEANED__#280, 
Judgment Entry Date#243, CASE WHEN isnull(Judgment Entry Date_CLEANED__#281) 
THEN NaN ELSE Judgment Entry Date_CLEANED__#281 END AS Judgment Entry 
Date_CLEANED__#324, Fine Amount#244, CASE WHEN isnull(Fine 
Amount_CLEANED__#282) THEN NaN ELSE Fine Amount_CLEANED__#282 END AS Fine 
Amount_CLEANED__#325, Penalty Amount#245, CASE WHEN isnull(Penalty 
Amount_CLEANED__#283) THEN NaN ELSE Penalty Amount_CLEANED__#283 END AS Penalty 
Amount_CLEANED__#326, Interest Amount#246, Reduction Amount#247, Payment 
Amount#248, Amount Due#249, Precinct#250, ... 9 more fields]
                                                                                
          +- Project [Plate#236, UDF(Plate#236) AS Plate_CLEANED__#275, 
State#237, UDF(State#237) AS State_CLEANED__#276, License Type#238, UDF(License 
Type#238) AS License Type_CLEANED__#277, Summons Number#239, Issue Date#240, 
cast(Issue Date#240 as double) AS Issue Date_CLEANED__#278, Violation Time#241, 
UDF(Violation Time#241) AS Violation Time_CLEANED__#279, Violation#242, 
UDF(Violation#242) AS Violation_CLEANED__#280, Judgment Entry Date#243, 
cast(Judgment Entry Date#243 as double) AS Judgment Entry Date_CLEANED__#281, 
Fine Amount#244, cast(Fine Amount#244 as double) AS Fine Amount_CLEANED__#282, 
Penalty Amount#245, cast(Penalty Amount#245 as double) AS Penalty 
Amount_CLEANED__#283, Interest Amount#246, Reduction Amount#247, Payment 
Amount#248, Amount Due#249, Precinct#250, ... 9 more fields]
                                                                                
             +- Project [Plate#6 AS Plate#236, State#7 AS State#237, License 
Type#8 AS License Type#238, Summons Number#9 AS Summons Number#239, Issue 
Date#10 AS Issue Date#240, Violation Time#11 AS Violation Time#241, 
Violation#12 AS Violation#242, Judgment Entry Date#13 AS Judgment Entry 
Date#243, Fine Amount#14 AS Fine Amount#244, Penalty Amount#15 AS Penalty 
Amount#245, Interest Amount#16 AS Interest Amount#246, Reduction Amount#17 AS 
Reduction Amount#247, Payment Amount#18 AS Payment Amount#248, Amount Due#19 AS 
Amount Due#249, Precinct#20 AS Precinct#250, County#21 AS County#251, Issuing 
Agency#22 AS Issuing Agency#252, Violation Status#23 AS Violation Status#253, 
columnBasedOnManyCols#43 AS columnBasedOnManyCols#254]
                                                                                
                +- Project [Plate#6, State#7, License Type#8, Summons Number#9, 
Issue Date#10, Violation Time#11, Violation#12, Judgment Entry Date#13, Fine 
Amount#14, Penalty Amount#15, Interest Amount#16, Reduction Amount#17, Payment 
Amount#18, Amount Due#19, Precinct#20, County#21, Issuing Agency#22, Violation 
Status#23, cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), 
License Type#8), Violation Time#11), Violation#12), UDF(Judgment Entry 
Date#13)), UDF(Issue Date#10)), UDF(Summons Number#9)), UDF(Fine Amount#14)), 
UDF(Penalty Amount#15)), UDF(Interest Amount#16)) as string) AS 
columnBasedOnManyCols#43]
                                                                                
                   +- Relation[Plate#6,State#7,License Type#8,Summons 
Number#9,Issue Date#10,Violation Time#11,Violation#12,Judgment Entry 
Date#13,Fine Amount#14,Penalty Amount#15,Interest Amount#16,Reduction 
Amount#17,Payment Amount#18,Amount Due#19,Precinct#20,County#21,Issuing 
Agency#22,Violation Status#23] csv
{code}

Maybe this could be marked resolved. I'm not sure whether something is 
actually wrong here or whether I simply don't understand how Spark caching works.
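
For reference, the workaround described above (calling cache() right after the add-column step) can be sketched roughly as follows. This is a minimal illustration, not the actual job: the in-memory rows, the two UDFs (concatStr, numToStr) and the single withColumn standing in for the saved pipeline are all hypothetical. A cached DataFrame is substituted into later plans as an in-memory relation, so the nested UDF expression is no longer expanded beneath the pipeline's projections. Whether that is the reason for the speed-up seen here is an assumption.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Local session for illustration only; the report ran under job-server.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("cache-after-withColumn")
  .getOrCreate()
import spark.implicits._

// Tiny hypothetical stand-in for the attached xyzzy.csv.
val raw = Seq(
  ("ABC123", "NY", "PAS", 50.0),
  ("XYZ789", "NJ", "COM", 115.0)
).toDF("Plate", "State", "License Type", "Fine Amount")

// Hypothetical stand-ins for the simple toString / + UDFs in the report.
val concatStr = udf((a: String, b: String) => a + b)
val numToStr = udf((d: Double) => d.toString)

// Step 1: add a column built from nested UDF calls.
val derived = concatStr(
  concatStr(concatStr(col("Plate"), col("State")), col("License Type")),
  numToStr(col("Fine Amount")))
val withCol = raw.withColumn("columnBasedOnManyCols", derived)

// Workaround: mark the DataFrame as cached before the pipeline is applied.
val cached = withCol.cache()

// Step 2: the saved pipeline's transform would go here; one extra column
// stands in for it (assumption) so the final plan differs from the cached one.
val transformed = cached.withColumn("State_CLEANED__", col("State"))

// Step 3: register and cache the result, as in the report.
val key = "foo"
transformed.createOrReplaceTempView(key)
spark.sqlContext.cacheTable(key)
```

Calling explain(true) on the DataFrame before and after adding cache() should show whether the nested projection has been replaced by an in-memory scan.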

> Call to sqlContext.cacheTable takes an incredibly long time in some cases
> -------------------------------------------------------------------------
>
>                 Key: SPARK-20226
>                 URL: https://issues.apache.org/jira/browse/SPARK-20226
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>         Environment: linux or windows
>            Reporter: Barry Becker
>              Labels: cache
>         Attachments: profile_indexer2.PNG, xyzzy.csv
>
>
> I have a case where the call to sqlContext.cacheTable can take an arbitrarily 
> long time depending on the number of columns that are referenced in a 
> withColumn expression applied to a dataframe.
> The dataset is small (20 columns 7861 rows). The sequence to reproduce is the 
> following:
> 1) Add a new column that references 8 to 14 of the columns in the dataset.
>    - If it references 8 columns, then the call to cacheTable is fast, about 
> *5 seconds*.
>    - If it references 11 columns, then it is slow, about *60 seconds*.
>    - If it references 14 columns, then it basically *takes forever*; I gave 
> up after 10 minutes or so.
>       The Column expression that is added basically just concatenates the 
> columns into a single string. If a number is concatenated with a string (or 
> vice versa), the number is first converted to a string.
>       The expression looks something like this:
> {code}
> `Plate` + `State` + `License Type` + `Summons Number` + `Issue Date` + 
> `Violation Time` + `Violation` + `Judgment Entry Date` + `Fine Amount` + 
> `Penalty Amount` + `Interest Amount`
> {code}
>         which we then convert to a Column expression that looks like this:
> {code}
> UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF('Plate, 'State), 'License Type), 
> UDF('Summons Number)), UDF('Issue Date)), 'Violation Time), 'Violation), 
> UDF('Judgment Entry Date)), UDF('Fine Amount)), UDF('Penalty Amount)), 
> UDF('Interest Amount))
> {code}
>        where the UDFs are very simple functions that basically call toString 
> and + as needed.
> 2) Apply a previously saved pipeline that includes some transformers. 
> Here are the stages of the pipeline (extracted from parquet):
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333200603,"sparkVersion":"2.1.0","uid":"strIdx_aeb04d2777cc","paramMap":{"handleInvalid":"skip","outputCol":"State_IDX__","inputCol":"State_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333200837,"sparkVersion":"2.1.0","uid":"strIdx_0164c4c13979","paramMap":{"inputCol":"License Type_CLEANED__","handleInvalid":"skip","outputCol":"License Type_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201068,"sparkVersion":"2.1.0","uid":"strIdx_25b6cbd02751","paramMap":{"inputCol":"Violation_CLEANED__","handleInvalid":"skip","outputCol":"Violation_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201282,"sparkVersion":"2.1.0","uid":"strIdx_aa12df0354d9","paramMap":{"handleInvalid":"skip","inputCol":"County_CLEANED__","outputCol":"County_IDX__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201552,"sparkVersion":"2.1.0","uid":"strIdx_babb120f3cc1","paramMap":{"handleInvalid":"skip","outputCol":"Issuing Agency_IDX__","inputCol":"Issuing Agency_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.StringIndexerModel","timestamp":1491333201759,"sparkVersion":"2.1.0","uid":"strIdx_5f2de9d9542d","paramMap":{"handleInvalid":"skip","outputCol":"Violation Status_IDX__","inputCol":"Violation Status_CLEANED__"}}{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333201987,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_6f65ca9fa813",
>       "paramMap":{
>         "outputCol":"Summons Number_BINNED__","handleInvalid":"keep",
>         "splits":["-Inf",1.386630656E9,3.696078592E9,4.005258752E9,6.045063168E9,8.136507392E9,"Inf"],
>         "inputCol":"Summons Number_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202079,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_f5db4fb8120e",
>     "paramMap":{
>          
> "splits":["-Inf",1.435215616E9,1.443855616E9,1.447271936E9,1.448222464E9,1.448395264E9,1.448481536E9,1.448827136E9,1.449259264E9,1.449432064E9,1.449518336E9,"Inf"],
>           "handleInvalid":"keep","outputCol":"Issue Date_BINNED__","inputCol":"Issue Date_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202172,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_74568a2a5cfd",
>       "paramMap":{
>         "handleInvalid":"keep","outputCol":"Fine Amount_BINNED__","inputCol":"Fine Amount_CLEANED__",
>         "splits":["-Inf",47.5,57.5,62.5,105.0,"Inf"]
>        }
>       }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202269,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_109705dfdbcd",
>       "paramMap":{"splits":["-Inf",0.004999999888241291,"Inf"],"outputCol":"Interest Amount_BINNED__","handleInvalid":"keep","inputCol":"Interest Amount_CLEANED__"}
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202362,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_2b2e3d8a324f",
>       "paramMap":{
>          "handleInvalid":"keep","inputCol":"Reduction Amount_CLEANED__","outputCol":"Reduction Amount_BINNED__",
>          "splits":["-Inf",5.994999885559082,24.0,41.0,57.5,120.0,"Inf"]
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202485,"sparkVersion":"2.1.0",
>      "uid":"bucketizer_4d44c2ebf489",
>      "paramMap":{
>        
> "splits":["-Inf",18.75,42.5,52.5,57.5,70.0050048828125,75.96499633789062,100.58499908447266,115.4949951171875,125.02000427246094,"Inf"],"handleInvalid":"keep",
>          "outputCol":"Payment Amount_BINNED__","inputCol":"Payment Amount_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202587,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_05a75eeef997",
>       "paramMap":{
>          "handleInvalid":"keep",
>          
> "splits":["-Inf",32.904998779296875,55.12000274658203,72.5,91.69999694824219,116.05500030517578,125.02999877929688,"Inf"],
>          "outputCol":"Amount Due_BINNED__","inputCol":"Amount Due_CLEANED__"
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.Bucketizer","timestamp":1491333202678,"sparkVersion":"2.1.0",
>     "uid":"bucketizer_64b3ef2f97cf",
>       
> "paramMap":{"outputCol":"Precinct_BINNED__","handleInvalid":"keep","inputCol":"Precinct_CLEANED__","splits":["-Inf",0.5,23.5,"Inf"]}
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.VectorAssembler","timestamp":1491333202774,"sparkVersion":"2.1.0",
>     "uid":"vecAssembler_932758a8f18e",
>       "paramMap":{
>         "outputCol":"_features_column__",
>         "inputCols":["State_IDX__","License Type_IDX__","Violation_IDX__","County_IDX__","Issuing Agency_IDX__","Violation Status_IDX__","Summons Number_BINNED__",
>           "Issue Date_BINNED__","Fine Amount_BINNED__","Interest Amount_BINNED__","Reduction Amount_BINNED__","Payment Amount_BINNED__","Amount Due_BINNED__","Precinct_BINNED__"]
>       }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.classification.NaiveBayesModel","timestamp":1491333202874,"sparkVersion":"2.1.0",
>     "uid":"nb_e4b24f3c08b0",
>       "paramMap":{
>         "probabilityCol":"_class_probability_column__",
>         "labelCol":"Penalty Amount_BINNED__",
>         "predictionCol":"_prediction_column_",
>         "modelType":"multinomial",
>         "featuresCol":"_features_column__",
>         "rawPredictionCol":"rawPrediction",
>         "smoothing":3.518236190922951E-4
>        }
>    }{code}
>  - 
> {code}{"class":"org.apache.spark.ml.feature.SQLTransformer","timestamp":1491333203106,"sparkVersion":"2.1.0",
>     "uid":"sql_1ea4c1b5c52e",
>       "paramMap":{"statement":"SELECT *, CAST(_prediction_column_ AS INT) AS `_*_prediction_label_column_*__` FROM __THIS__ /*cutInfo:[10.0,25.0]*/"}
>    }{code}
>    3) Call cacheTable on sqlContext. The actual code used is:
>    {code}
>     val key = "foo"
>     if (sqlContext.tableNames.contains(key))
>       sqlContext.dropTempTable(key)
>     df.createOrReplaceTempView(key)
>     sqlContext.cacheTable(key)        // <-- this takes a very long time
> {code}
> When I step through cacheTable in the debugger (in CacheManager.cacheQuery), 
> I see that the query "planToCache" is very large (see below). 
> I don't know much about query plans. Is this sort of giant nested query plan 
> expected in this case? Is it in any way typical? Does it explain why it takes 
> a very long time to cache? Why would adding just a few more columns to the 
> add column expression result in a plan that takes exponentially longer?
> {code}
> SubqueryAlias foo123, `foo123`
> +- Project [Plate#123, State#124, License Type#125, Summons Number#126, Issue 
> Date#127, Violation Time#128, Violation#129, Judgment Entry Date#130, Fine 
> Amount#131, Penalty Amount#132, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, County#138, Issuing 
> Agency#139, Violation Status#140, columnBasedOnManyCols#141, Penalty Amount 
> (predicted)#2363]
>    +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 33 more fields]
>       +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 33 more fields]
>          +- SubqueryAlias sql_1ea4c1b5c52e_5640c7097aca, 
> `sql_1ea4c1b5c52e_5640c7097aca`
>             +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 32 more fields]
>                +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 31 more fields]
>                   +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 30 more fields]
>                      +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 29 more fields]
>                         +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 28 more fields]
>                            +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 27 more fields]
>                               +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 26 more fields]
>                                  +- Project [Plate#123, Plate_CLEANED__#162, 
> State#124, State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, 
> Summons Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 25 more fields]
>                                     +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 24 more fields]
>                                        +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 23 more fields]
>                                           +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 22 more fields]
>                                              +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 21 more fields]
>                                                 +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 20 more fields]
>                                                    +- Filter UDF(Violation 
> Status_CLEANED__#174)
>                                                       +- Project [Plate#123, 
> Plate_CLEANED__#162, State#124, State_CLEANED__#163, License Type#125, 
> License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons Number#126, 
> Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, Violation 
> Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, Judgment Entry 
> Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, Fine 
> Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 19 more fields]
>                                                          +- Filter 
> UDF(Issuing Agency_CLEANED__#173)
>                                                             +- Project 
> [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, License 
> Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons 
> Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, 
> Violation Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, 
> Judgment Entry Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, 
> Fine Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 18 more fields]
>                                                                +- Filter 
> UDF(County_CLEANED__#172)
>                                                                   +- Project 
> [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, License 
> Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, Summons 
> Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation Time#128, 
> Violation Time_CLEANED__#166, Violation#129, Violation_CLEANED__#167, 
> Judgment Entry Date#130, Judgment Entry Date_CLEANED__#211, Fine Amount#131, 
> Fine Amount_CLEANED__#212, Penalty Amount#132, Penalty Amount_CLEANED__#213, 
> Interest Amount_CLEANED__#250, Interest Amount#133, Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 17 more fields]
>                                                                      +- 
> Filter UDF(Violation_CLEANED__#167)
>                                                                         +- 
> Project [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, 
> License Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, 
> Summons Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, Judgment Entry 
> Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 16 more fields]
>                                                                            +- 
> Filter UDF(License Type_CLEANED__#164)
>                                                                               
> +- Project [Plate#123, Plate_CLEANED__#162, State#124, State_CLEANED__#163, 
> License Type#125, License Type_CLEANED__#164, Summons Number_CLEANED__#249, 
> Summons Number#126, Issue Date#127, Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, Judgment Entry 
> Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, Interest Amount_CLEANED__#250, 
> Interest Amount#133, Reduction Amount_CLEANED__#251, Reduction Amount#134, 
> ... 15 more fields]
>                                                                               
>    +- Filter UDF(State_CLEANED__#163)
>                                                                               
>       +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, CASE WHEN 
> isnull(Summons Number#126) THEN NaN ELSE Summons Number#126 END AS Summons 
> Number_CLEANED__#249, Summons Number#126, Issue Date#127, Issue 
> Date_CLEANED__#210, Violation Time#128, Violation Time_CLEANED__#166, 
> Violation#129, Violation_CLEANED__#167, Judgment Entry Date#130, Judgment 
> Entry Date_CLEANED__#211, Fine Amount#131, Fine Amount_CLEANED__#212, Penalty 
> Amount#132, Penalty Amount_CLEANED__#213, CASE WHEN isnull(Interest 
> Amount#133) THEN NaN ELSE Interest Amount#133 END AS Interest 
> Amount_CLEANED__#250, Interest Amount#133, CASE WHEN isnull(Reduction 
> Amount#134) THEN NaN ELSE Reduction Amount#134 END AS Reduction 
> Amount_CLEANED__#251, Reduction Amount#134, ... 14 more fields]
>                                                                               
>          +- Project [Plate#123, Plate_CLEANED__#162, State#124, 
> State_CLEANED__#163, License Type#125, License Type_CLEANED__#164, Summons 
> Number#126, Issue Date#127, CASE WHEN isnull(Issue Date_CLEANED__#165) THEN 
> NaN ELSE Issue Date_CLEANED__#165 END AS Issue Date_CLEANED__#210, Violation 
> Time#128, Violation Time_CLEANED__#166, Violation#129, 
> Violation_CLEANED__#167, Judgment Entry Date#130, CASE WHEN isnull(Judgment 
> Entry Date_CLEANED__#168) THEN NaN ELSE Judgment Entry Date_CLEANED__#168 END 
> AS Judgment Entry Date_CLEANED__#211, Fine Amount#131, CASE WHEN isnull(Fine 
> Amount_CLEANED__#169) THEN NaN ELSE Fine Amount_CLEANED__#169 END AS Fine 
> Amount_CLEANED__#212, Penalty Amount#132, CASE WHEN isnull(Penalty 
> Amount_CLEANED__#170) THEN NaN ELSE Penalty Amount_CLEANED__#170 END AS 
> Penalty Amount_CLEANED__#213, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, ... 9 more fields]
>                                                                               
>             +- Project [Plate#123, UDF(Plate#123) AS Plate_CLEANED__#162, 
> State#124, UDF(State#124) AS State_CLEANED__#163, License Type#125, 
> UDF(License Type#125) AS License Type_CLEANED__#164, Summons Number#126, 
> Issue Date#127, cast(Issue Date#127 as double) AS Issue Date_CLEANED__#165, 
> Violation Time#128, UDF(Violation Time#128) AS Violation Time_CLEANED__#166, 
> Violation#129, UDF(Violation#129) AS Violation_CLEANED__#167, Judgment Entry 
> Date#130, cast(Judgment Entry Date#130 as double) AS Judgment Entry 
> Date_CLEANED__#168, Fine Amount#131, cast(Fine Amount#131 as double) AS Fine 
> Amount_CLEANED__#169, Penalty Amount#132, cast(Penalty Amount#132 as double) 
> AS Penalty Amount_CLEANED__#170, Interest Amount#133, Reduction Amount#134, 
> Payment Amount#135, Amount Due#136, Precinct#137, ... 9 more fields]
>                                                                               
>                +- Project [Plate#6 AS Plate#123, State#7 AS State#124, 
> License Type#8 AS License Type#125, Summons Number#9 AS Summons Number#126, 
> Issue Date#10 AS Issue Date#127, Violation Time#11 AS Violation Time#128, 
> Violation#12 AS Violation#129, Judgment Entry Date#13 AS Judgment Entry 
> Date#130, Fine Amount#14 AS Fine Amount#131, Penalty Amount#15 AS Penalty 
> Amount#132, Interest Amount#16 AS Interest Amount#133, Reduction Amount#17 AS 
> Reduction Amount#134, Payment Amount#18 AS Payment Amount#135, Amount Due#19 
> AS Amount Due#136, Precinct#20 AS Precinct#137, County#21 AS County#138, 
> Issuing Agency#22 AS Issuing Agency#139, Violation Status#23 AS Violation 
> Status#140, columnBasedOnManyCols#43 AS columnBasedOnManyCols#141]
>                                                                               
>                   +- Project [Plate#6, State#7, License Type#8, Summons 
> Number#9, Issue Date#10, Violation Time#11, Violation#12, Judgment Entry 
> Date#13, Fine Amount#14, Penalty Amount#15, Interest Amount#16, Reduction 
> Amount#17, Payment Amount#18, Amount Due#19, Precinct#20, County#21, Issuing 
> Agency#22, Violation Status#23, 
> cast(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(UDF(Plate#6, State#7), License 
> Type#8), UDF(Summons Number#9)), UDF(Issue Date#10)), Violation Time#11), 
> Violation#12), UDF(Judgment Entry Date#13)), UDF(Fine Amount#14)), 
> UDF(Penalty Amount#15)), UDF(Interest Amount#16)) as string) AS 
> columnBasedOnManyCols#43]
>                                                                               
>                      +- Relation[Plate#6,State#7,License Type#8,Summons 
> Number#9,Issue Date#10,Violation Time#11,Violation#12,Judgment Entry 
> Date#13,Fine Amount#14,Penalty Amount#15,Interest Amount#16,Reduction 
> Amount#17,Payment Amount#18,Amount Due#19,Precinct#20,County#21,Issuing 
> Agency#22,Violation Status#23] csv
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
