sriram kumar created SPARK-17973:
------------------------------------

             Summary: is there any way to split Dataset into 2 or more based on the given condition
                 Key: SPARK-17973
                 URL: https://issues.apache.org/jira/browse/SPARK-17973
             Project: Spark
          Issue Type: Question
          Components: Java API
            Reporter: sriram kumar
            Priority: Critical

I am not able to split a Dataset exactly by a condition. I have a scenario where I need to split a single Dataset into 4 Datasets, with the rows that match none of the conditions going into a 5th Dataset. Below I am taking some baby steps.

This is my data:

+---------------+-----+--------------+----+----+
|           Name|Class|          Dorm|Room| GPA|
+---------------+-----+--------------+----+----+
|Sally Whittaker| 2018|McCarren House| 312|3.75|
|Belinda Jameson| 2017| Cushing House| 148|3.52|
|     Jeff Smith| 2018|Prescott House|17-D| 3.2|
|    Sandy Allen| 2019|  Oliver House| 108|3.48|
+---------------+-----+--------------+----+----+

Dataset<Row> s1 = s.selectExpr("upper(Name) as Name", "Class");
s1.filter("Class > 2017 and Room > 200").show();

+---------------+-----+
|           Name|Class|
+---------------+-----+
|SALLY WHITTAKER| 2018|
+---------------+-----+

With what code can I then get the remaining data in the Dataset? I tried the one below, but it goes wrong:

s1.filter("!(Class > 2017 and Room > 200)").show();

+---------------+-----+
|           Name|Class|
+---------------+-----+
|BELINDA JAMESON| 2017|
|    SANDY ALLEN| 2019|
+---------------+-----+

I did get to know why it is going wrong, but I still don't have an answer on how to get the rows that were filtered out.
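
The row missing from both results (Jeff Smith) is consistent with SQL three-valued logic: presumably the Room value "17-D" cannot be read as a number, so `Room > 200` is neither true nor false for that row, and both the condition and its negation reject it. The following is a minimal plain-Java sketch of that behaviour, assuming this is indeed the cause. It does not use Spark at all; the class `ThreeValuedSplit` and all of its helpers are hypothetical names invented for illustration, with `null` standing in for an unknown result.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreeValuedSplit {

    /** Hypothetical stand-in for one row; only the columns the condition reads. */
    static final class Student {
        final String name;
        final int classYear;
        final String room;

        Student(String name, int classYear, String room) {
            this.name = name;
            this.classYear = classYear;
            this.room = room;
        }
    }

    /** The four rows from the post. */
    static List<Student> sampleData() {
        List<Student> rows = new ArrayList<>();
        rows.add(new Student("Sally Whittaker", 2018, "312"));
        rows.add(new Student("Belinda Jameson", 2017, "148"));
        rows.add(new Student("Jeff Smith", 2018, "17-D"));
        rows.add(new Student("Sandy Allen", 2019, "108"));
        return rows;
    }

    /** Assumed lenient cast: text that is not a number becomes null (unknown). */
    static Integer toNumber(String text) {
        try {
            return Integer.valueOf(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** SQL-style AND over TRUE, FALSE and null (unknown). */
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
            return false;
        }
        if (a == null || b == null) {
            return null;
        }
        return true;
    }

    /** SQL-style NOT: unknown stays unknown. */
    static Boolean not(Boolean a) {
        return a == null ? null : !a;
    }

    /** The condition from the post: Class > 2017 and Room > 200. */
    static Boolean condition(Student s) {
        Integer room = toNumber(s.room);
        Boolean roomOk = (room == null) ? null : room > 200;
        return and(s.classYear > 2017, roomOk);
    }

    /** Rows for which the condition is TRUE. */
    static List<String> matched() {
        List<String> names = new ArrayList<>();
        for (Student s : sampleData()) {
            if (Boolean.TRUE.equals(condition(s))) {
                names.add(s.name);
            }
        }
        return names;
    }

    /** Rows for which the negated condition is TRUE; unknown rows are lost here. */
    static List<String> negated() {
        List<String> names = new ArrayList<>();
        for (Student s : sampleData()) {
            if (Boolean.TRUE.equals(not(condition(s)))) {
                names.add(s.name);
            }
        }
        return names;
    }

    /** True complement: every row for which the condition is not TRUE. */
    static List<String> rest() {
        List<String> names = new ArrayList<>();
        for (Student s : sampleData()) {
            if (!Boolean.TRUE.equals(condition(s))) {
                names.add(s.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("matched: " + matched());
        System.out.println("negated: " + negated());
        System.out.println("rest:    " + rest());
    }
}
```

In this model `matched()` and `negated()` reproduce the two tables above, while `rest()` also keeps the row whose condition is unknown. If the same reasoning applies to the real Dataset, one option might be to treat an unknown result as false before negating it in the filter string, for example `not coalesce(Class > 2017 and Room > 200, false)`; this is an untested suggestion for this data, not a confirmed fix.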