[ https://issues.apache.org/jira/browse/SPARK-21299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074261#comment-16074261 ]
jalendhar Baddam commented on SPARK-21299:
------------------------------------------

policyID,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity
119736,FL,CLAY COUNTY,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,Residential,Masonry,1
448094,FL,CLAY COUNTY,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,Residential,Masonry,3
206893,FL,CLAY COUNTY,190724.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,Residential,Wood,1
333743,FL,CLAY COUNTY,0,79520.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,Residential,Wood,3
172534,FL,CLAY COUNTY,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,Residential,Wood,1
785275,FL,CLAY COUNTY,0,515035.62,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,Residential,Masonry,3
995932,FL,CLAY COUNTY,0,19260000,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,Commercial,Reinforced Concrete,1
223488,FL,CLAY COUNTY,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,Residential,Wood,1
433512,FL,CLAY COUNTY,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,Residential,Wood,1
142071,FL,CLAY COUNTY,705600,705600,705600,705600,705600,1010842.56,14112,35280,0,0,30.100628,-81.703751,Residential,Masonry,1
253816,FL,CLAY COUNTY,831498.3,831498.3,831498.3,831498.3,831498.3,1117791.48,0,0,0,0,30.10216,-81.719444,Residential,Masonry,1
894922,FL,CLAY COUNTY,0,24059.09,0,0,24059.09,33952.19,0,0,0,0,30.095957,-81.695099,Residential,Wood,1

> except is throwing the fallowing exception after perform dropDuplicates on
> the Dataset object
> ---------------------------------------------------------------------------------------------
>
> Key: SPARK-21299
> URL: https://issues.apache.org/jira/browse/SPARK-21299
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 2.1.0
> Environment: spark 2.1.0
> Reporter: jalendhar Baddam
>
> INFO: org.apache.spark.sql.AnalysisException: resolved attribute(s)
> test_customer_CustID#569 missing from
> test_customer_ROW_NUM#589L,test_customer_CustID#590,test_customer_Telephone#598L,test_customer_HouseholdID#593,test_customer_Gender#592,test_customer_Title#599,test_customer_Surname#597,test_customer_Occupation#596,test_customer_DOB#591,test_customer_Initials#595,test_customer_Income#594
> in operator !Filter (cast(test_customer_CustID#569 as double) > cast(1000 as
> double));;
> INFO: Except
> INFO: :- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213,
> test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- Sort [test_customer_ROW_NUM#212L ASC NULLS FIRST], true
> INFO: : +- Project [test_customer_ROW_NUM#212L, test_customer_CustID#213,
> test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47
> INFO: : +- Aggregate [test_customer_ROW_NUM#212L,
> test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222], [test_customer_ROW_NUM#212L,
> test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- Project [test_customer_ROW_NUM#212L,
> test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- Project [test_customer_ROW_NUM#212L,
> test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- Aggregate [test_customer_Gender#215],
> [first(test_customer_ROW_NUM#212L, false) AS test_customer_ROW_NUM#212L,
> first(test_customer_CustID#213, false) AS test_customer_CustID#213,
> first(test_customer_DOB#214, false) AS test_customer_DOB#214,
> test_customer_Gender#215, first(test_customer_HouseholdID#216, false) AS
> test_customer_HouseholdID#216, first(test_customer_Income#217, false) AS
> test_customer_Income#217, first(test_customer_Initials#218, false) AS
> test_customer_Initials#218, first(test_customer_Occupation#219, false) AS
> test_customer_Occupation#219, first(test_customer_Surname#220, false) AS
> test_customer_Surname#220, first(test_customer_Telephone#221L, false) AS
> test_customer_Telephone#221L, first(test_customer_Title#222, false) AS
> test_customer_Title#222]
> INFO: : +- Project [test_customer_ROW_NUM#212L,
> test_customer_CustID#213, test_customer_DOB#214, test_customer_Gender#215,
> test_customer_HouseholdID#216, test_customer_Income#217,
> test_customer_Initials#218, test_customer_Occupation#219,
> test_customer_Surname#220, test_customer_Telephone#221L,
> test_customer_Title#222]
> INFO: : +- Filter (cast(test_customer_CustID#213 as
> double) > cast(1000 as double))
> INFO: : +- Project [ROW_NUM#47L AS
> test_customer_ROW_NUM#212L, CustID#48 AS test_customer_CustID#213, DOB#49 AS
> test_customer_DOB#214, Gender#50 AS test_customer_Gender#215, HouseholdID#51
> AS test_customer_HouseholdID#216, Income#52 AS test_customer_Income#217,
> Initials#53 AS test_customer_Initials#218, Occupation#54 AS
> test_customer_Occupation#219, Surname#55 AS test_customer_Surname#220,
> Telephone#56L AS test_customer_Telephone#221L, Title#57 AS
> test_customer_Title#222]
> INFO: : +- SubqueryAlias customer
> INFO: : +-
> Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57]
> parquet
> INFO: +- Project [test_customer_ROW_NUM#568L, test_customer_CustID#569,
> test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- GlobalLimit 0
> INFO: +- LocalLimit 0
> INFO: +- Sort [test_customer_ROW_NUM#568L ASC NULLS FIRST], true
> INFO: +- Project [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- SubqueryAlias 1922a657-80bd-41a5-8e1f-04a248263e47
> INFO: +- Aggregate [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577], [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- Project [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- Project [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- Project [test_customer_ROW_NUM#568L,
> test_customer_CustID#569, test_customer_DOB#570, test_customer_Gender#592,
> test_customer_HouseholdID#571, test_customer_Income#572,
> test_customer_Initials#573, test_customer_Occupation#574,
> test_customer_Surname#575, test_customer_Telephone#576L,
> test_customer_Title#577]
> INFO: +- Aggregate [test_customer_Gender#592],
> [first(test_customer_ROW_NUM#568L, false) AS test_customer_ROW_NUM#568L,
> first(test_customer_CustID#569, false) AS test_customer_CustID#569,
> first(test_customer_DOB#570, false) AS test_customer_DOB#570,
> test_customer_Gender#592, first(test_customer_HouseholdID#571, false) AS
> test_customer_HouseholdID#571, first(test_customer_Income#572, false) AS
> test_customer_Income#572, first(test_customer_Initials#573, false) AS
> test_customer_Initials#573, first(test_customer_Occupation#574, false) AS
> test_customer_Occupation#574, first(test_customer_Surname#575, false) AS
> test_customer_Surname#575, first(test_customer_Telephone#576L, false) AS
> test_customer_Telephone#576L, first(test_customer_Title#577, false) AS
> test_customer_Title#577]
> INFO: +- Project
> [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570,
> test_customer_Gender#592, test_customer_HouseholdID#571,
> test_customer_Income#572, test_customer_Initials#573,
> test_customer_Occupation#574, test_customer_Surname#575,
> test_customer_Telephone#576L, test_customer_Title#577]
> INFO: +- !Project
> [test_customer_ROW_NUM#568L, test_customer_CustID#569, test_customer_DOB#570,
> test_customer_Gender#592, test_customer_HouseholdID#571,
> test_customer_Income#572, test_customer_Initials#573,
> test_customer_Occupation#574, test_customer_Surname#575,
> test_customer_Telephone#576L, test_customer_Title#577]
> INFO: +- !Filter
> (cast(test_customer_CustID#569 as double) > cast(1000 as double))
> INFO: +- Project [ROW_NUM#47L AS
> test_customer_ROW_NUM#589L, CustID#48 AS test_customer_CustID#590, DOB#49 AS
> test_customer_DOB#591, Gender#50 AS test_customer_Gender#592, HouseholdID#51
> AS test_customer_HouseholdID#593, Income#52 AS test_customer_Income#594,
> Initials#53 AS test_customer_Initials#595, Occupation#54 AS
> test_customer_Occupation#596, Surname#55 AS test_customer_Surname#597,
> Telephone#56L AS test_customer_Telephone#598L, Title#57 AS
> test_customer_Title#599]
> INFO: +- SubqueryAlias customer
> INFO: +-
> Relation[ROW_NUM#47L,CustID#48,DOB#49,Gender#50,HouseholdID#51,Income#52,Initials#53,Occupation#54,Surname#55,Telephone#56L,Title#57]
> parquet
> INFO:
> INFO: at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
> INFO: at
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:57)
> INFO: at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:337)
> INFO: at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:67)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:128)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:127)
> INFO: at scala.collection.immutable.List.foreach(List.scala:381)
> INFO: at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> INFO: at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:67)
> INFO: at
> org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:57)
> INFO: at
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48)
> INFO: at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
> INFO: at
> org.apache.spark.sql.Dataset.withSetOperator(Dataset.scala:2834)
> INFO: at org.apache.spark.sql.Dataset.except(Dataset.scala:1652)