I'm posting again, as the tables did not show up in the email. I have a DataFrame with a few dimensions, for example:

+---+---+---+-----+
|  i|  j|  k|total|
+---+---+---+-----+
|  3|  1|  1|    3|
|  3|  1|  2|    6|
|  3|  1|  3|    9|
|  3|  1|  4|   12|
|  3|  1|  5|   15|
|  3|  1|  6|   18|
|  3|  1|  7|   21|
|  3|  1|  8|   24|
|  3|  1|  9|   27|
|  3|  2|  1|    6|
|  3|  2|  2|   12|
|  3|  2|  3|   18|
|  3|  2|  4|   24|
|  3|  2|  5|   30|
|  3|  2|  6|   36|
|  3|  2|  7|   42|
|  3|  2|  8|   48|
|  3|  2|  9|   54|
|  3|  3|  1|    9|
|  3|  3|  2|   18|
+---+---+---+-----+

I want to build a cube on i, j, k and get a rank based on total per row (per grouping), so that when doing:

df.filter('i===3 && 'j===1).show

I will get:

+---+---+----+-----+----+
|  i|  j|   k|total|rank|
+---+---+----+-----+----+
|  3|  1|null|  135|   1|
|  3|  1|   0|    0|  10|
|  3|  1|   1|    3|   9|
|  3|  1|   2|    6|   8|
|  3|  1|   3|    9|   7|
|  3|  1|   4|   12|   6|
|  3|  1|   5|   15|   5|
|  3|  1|   6|   18|   4|
|  3|  1|   7|   21|   3|
|  3|  1|   8|   24|   2|
|  3|  1|   9|   27|   1|
+---+---+----+-----+----+

So basically, for any grouping combination, I need a separate dense rank list: (i,j,k), (i,j), (i,k), (j,k), (i), (j), (k).

Any ideas?

(In this example, total = i*j*k.)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/ranks-and-cubes-tp27338p27339.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
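
One possible approach to the question above, as a rough sketch and not a definitive answer: cube the DataFrame, tag each output row with its grouping level via grouping_id(), and compute dense_rank() over a window partitioned by that level. This assumes Spark 2.0 or later (for SparkSession and grouping_id). The object name RanksAndCubes, the helper rankedCube, the gid column, and the sample data (k running from 0 to 9, guessed from the expected output) are all made up for illustration. The window is partitioned by (gid, i, j), which is only one reading of "per grouping", chosen because it appears to match the expected table in the post; a different partitioning may be what is actually wanted.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{dense_rank, desc, grouping_id, sum}

object RanksAndCubes {
  // Cube on (i, j, k), then dense-rank rows by total, highest first.
  // Assumption: ranks are computed separately per grouping level (gid)
  // and per (i, j) pair, so rows from different grouping sets never mix.
  def rankedCube(df: DataFrame): DataFrame = {
    val cubed = df.cube("i", "j", "k")
      .agg(sum("total").as("total"), grouping_id().as("gid"))
    val w = Window.partitionBy("gid", "i", "j").orderBy(desc("total"))
    cubed.withColumn("rank", dense_rank().over(w))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ranks-and-cubes")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with total = i * j * k, as in the post.
    val df = (for (j <- 1 to 3; k <- 0 to 9) yield (3, j, k, 3 * j * k))
      .toDF("i", "j", "k", "total")

    rankedCube(df)
      .filter($"i" === 3 && $"j" === 1)
      .orderBy($"gid".desc, $"k") // show the (i, j) subtotal row first
      .drop("gid")
      .show()

    spark.stop()
  }
}
```

With this partitioning, the (i, j) subtotal row is alone in its partition and so gets rank 1, while the k rows are ranked against each other within the same i and j. To rank subtotals against each other instead (for example, all (i, j) pairs by total), partition by gid alone or by gid plus whichever parent columns apply.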