[ https://issues.apache.org/jira/browse/SPARK-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sriram kumar updated SPARK-17973:
---------------------------------
    Description:
I am not able to split a Dataset exactly by a condition. I have a scenario where I need to split a single Dataset into 4 Datasets, with the rows that match none of them going into a 5th Dataset. Here below I am taking some baby steps.
This is my data.
+---------------+-----+--------------+----+----+
|           Name|Class|          Dorm|Room| GPA|
+---------------+-----+--------------+----+----+
|Sally Whittaker| 2018|McCarren House| 312|3.75|
|Belinda Jameson| 2017| Cushing House| 148|3.52|
|     Jeff Smith| 2018|Prescott House|17-D| 3.2|
|    Sandy Allen| 2019|  Oliver House| 108|3.48|
+---------------+-----+--------------+----+----+
Dataset<Row> s1 = s.selectExpr("upper(Name) as Name" , "Class");
s1.filter("Class > 2017 and Room > 200").show();
+---------------+-----+
|           Name|Class|
+---------------+-----+
|SALLY WHITTAKER| 2018|
+---------------+-----+
With what code can I then get the remaining data in the Dataset?
I tried the one below, but it goes wrong.
s1.filter("!(Class > 2017 and Room > 200)").show();
+---------------+-----+
|           Name|Class|
+---------------+-----+
|BELINDA JAMESON| 2017|
|    SANDY ALLEN| 2019|
+---------------+-----+
I did get to know why it goes wrong, but I do not have an answer for how to get the filtered-out data.

  was:
I am not able to split a Dataset exactly by a condition. I have a scenario where I need to split a single Dataset into 4 Datasets, with the rows that match none of them going into a 5th Dataset. There below I am taking some baby steps.
This is my data.
+---------------+-----+--------------+----+----+
|           Name|Class|          Dorm|Room| GPA|
+---------------+-----+--------------+----+----+
|Sally Whittaker| 2018|McCarren House| 312|3.75|
|Belinda Jameson| 2017| Cushing House| 148|3.52|
|     Jeff Smith| 2018|Prescott House|17-D| 3.2|
|    Sandy Allen| 2019|  Oliver House| 108|3.48|
+---------------+-----+--------------+----+----+
Dataset<Row> s1 = s.selectExpr("upper(Name) as Name" , "Class");
s1.filter("Class > 2017 and Room > 200").show();
+---------------+-----+
|           Name|Class|
+---------------+-----+
|SALLY WHITTAKER| 2018|
+---------------+-----+
With what code can I then get the remaining data in the Dataset?
I tried the one below, but it goes wrong.
s1.filter("!(Class > 2017 and Room > 200)").show();
+---------------+-----+
|           Name|Class|
+---------------+-----+
|BELINDA JAMESON| 2017|
|    SANDY ALLEN| 2019|
+---------------+-----+
I did get to know why it goes wrong, but I do not have an answer for how to get the filtered-out data.


> is there any way to split Dataset into 2 or more based on the given condition
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-17973
>                 URL: https://issues.apache.org/jira/browse/SPARK-17973
>             Project: Spark
>          Issue Type: Question
>          Components: Java API
>            Reporter: sriram kumar
>            Priority: Critical
>
> I am not able to split a Dataset exactly by a condition. I have a scenario
> where I need to split a single Dataset into 4 Datasets, with the rows that
> match none of them going into a 5th Dataset. Here below I am taking some
> baby steps.
> This is my data.
> +---------------+-----+--------------+----+----+
> |           Name|Class|          Dorm|Room| GPA|
> +---------------+-----+--------------+----+----+
> |Sally Whittaker| 2018|McCarren House| 312|3.75|
> |Belinda Jameson| 2017| Cushing House| 148|3.52|
> |     Jeff Smith| 2018|Prescott House|17-D| 3.2|
> |    Sandy Allen| 2019|  Oliver House| 108|3.48|
> +---------------+-----+--------------+----+----+
> Dataset<Row> s1 = s.selectExpr("upper(Name) as Name" , "Class");
> s1.filter("Class > 2017 and Room > 200").show();
> +---------------+-----+
> |           Name|Class|
> +---------------+-----+
> |SALLY WHITTAKER| 2018|
> +---------------+-----+
> With what code can I then get the remaining data in the Dataset?
> I tried the one below, but it goes wrong.
> s1.filter("!(Class > 2017 and Room > 200)").show();
>
> +---------------+-----+
> |           Name|Class|
> +---------------+-----+
> |BELINDA JAMESON| 2017|
> |    SANDY ALLEN| 2019|
> +---------------+-----+
> I did get to know why it goes wrong, but I do not have an answer for how to
> get the filtered-out data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
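
On the question above of how to get the remaining rows: the Jeff Smith row is most likely missing from both outputs because its Room value "17-D" cannot be compared as a number, so the condition evaluates to NULL for that row, and the negation of NULL is still NULL. A filter keeps only rows where the condition is TRUE, so that row falls through both filters. The sketch below is plain Java, not Spark code; it only models that three-valued logic on the four sample rows. The class SplitSketch, the type Student, and the method names are hypothetical, and the Spark filter string mentioned in the comments is an untested assumption about how the same idea might be written with the SQL coalesce function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitSketch {
    // Hypothetical row type holding only the columns the condition uses.
    static final class Student {
        final String name;
        final int classYear;
        final Integer room; // null models a Room such as "17-D" that does not cast to a number

        Student(String name, int classYear, Integer room) {
            this.name = name;
            this.classYear = classYear;
            this.room = room;
        }
    }

    // SQL-style evaluation of "Class > 2017 and Room > 200".
    // Returns TRUE, FALSE, or null (unknown).
    static Boolean condition(Student s) {
        if (s.classYear <= 2017) return Boolean.FALSE; // FALSE AND anything is FALSE
        if (s.room == null) return null;               // TRUE AND NULL is NULL
        return s.room > 200;
    }

    // Rows where the condition is TRUE: what filter(cond) keeps.
    static List<String> matched(List<Student> rows) {
        List<String> out = new ArrayList<>();
        for (Student s : rows) {
            if (Boolean.TRUE.equals(condition(s))) out.add(s.name);
        }
        return out;
    }

    // Rows where the condition is FALSE: what filter("!(cond)") keeps.
    // Rows whose condition is NULL are lost here.
    static List<String> naiveNegation(List<Student> rows) {
        List<String> out = new ArrayList<>();
        for (Student s : rows) {
            if (Boolean.FALSE.equals(condition(s))) out.add(s.name);
        }
        return out;
    }

    // Rows where the condition is FALSE or NULL: the true complement of matched().
    // Assumption: in Spark this might correspond to a filter such as
    // "!coalesce(Class > 2017 and Room > 200, false)".
    static List<String> remaining(List<Student> rows) {
        List<String> out = new ArrayList<>();
        for (Student s : rows) {
            if (!Boolean.TRUE.equals(condition(s))) out.add(s.name);
        }
        return out;
    }

    // The four rows from the question; Room "17-D" is modelled as null.
    static List<Student> sample() {
        return Arrays.asList(
            new Student("Sally Whittaker", 2018, 312),
            new Student("Belinda Jameson", 2017, 148),
            new Student("Jeff Smith", 2018, null),
            new Student("Sandy Allen", 2019, 108));
    }

    public static void main(String[] args) {
        List<Student> rows = sample();
        System.out.println("matched:        " + matched(rows));
        System.out.println("naive negation: " + naiveNegation(rows));
        System.out.println("remaining:      " + remaining(rows));
    }
}
```

The point of the design is that "remaining" is defined as "not TRUE" rather than "FALSE", so every row lands in exactly one of the two groups. Splitting into 4 groups plus a 5th catch-all could follow the same pattern: treat each condition's NULL as FALSE, and put a row in the 5th group when none of the four conditions is TRUE.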