I don't think an inner join will solve my problem. *For each row in* paramsDataset, I need to filter myDataset, and then I need to run a bunch of calculations on the filtered myDataset.
Say, for example, paramsDataset has three employee age ranges (20-30, 30-50, 50-60) and two regions (USA, Canada). myDataset has all employee information for three years, such as the days a person came to work, the days they took off, etc. I need to calculate the average number of days an employee worked per age range for each region, the average number of days off per age range, and so on.
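
To make the requirement concrete, here is a minimal plain-Python sketch of the per-row filter-and-aggregate logic described above. It is not Spark code and only illustrates the intended computation. The field names (age_lo, age_hi, region, days_worked, days_off), the sample rows, and the choice to treat age ranges as half-open [lo, hi) are all assumptions made for illustration, not part of the real datasets.

```python
from statistics import mean

# Hypothetical stand-in for paramsDataset: one row per (age range, region).
# Ranges are treated as half-open [lo, hi) so that 30 and 50 fall into
# exactly one range; the original description does not specify this.
AGE_RANGES = [(20, 30), (30, 50), (50, 60)]
REGIONS = ["USA", "Canada"]
params_dataset = [
    {"age_lo": lo, "age_hi": hi, "region": region}
    for (lo, hi) in AGE_RANGES
    for region in REGIONS
]

# Hypothetical stand-in for myDataset: one row per employee (made-up values).
my_dataset = [
    {"age": 25, "region": "USA", "days_worked": 200, "days_off": 20},
    {"age": 28, "region": "USA", "days_worked": 220, "days_off": 10},
    {"age": 35, "region": "Canada", "days_worked": 210, "days_off": 15},
    {"age": 55, "region": "USA", "days_worked": 190, "days_off": 30},
]


def summarize(params_rows, employee_rows):
    """For each params row, filter the employees and compute averages."""
    results = []
    for p in params_rows:
        # Step 1: filter myDataset using the current params row.
        matched = [
            e for e in employee_rows
            if p["age_lo"] <= e["age"] < p["age_hi"]
            and e["region"] == p["region"]
        ]
        if not matched:
            continue  # no employees in this age range and region
        # Step 2: run the calculations on the filtered rows.
        results.append({
            "age_lo": p["age_lo"],
            "age_hi": p["age_hi"],
            "region": p["region"],
            "count": len(matched),
            "avg_days_worked": mean(e["days_worked"] for e in matched),
            "avg_days_off": mean(e["days_off"] for e in matched),
        })
    return results


results = summarize(params_dataset, my_dataset)
for row in results:
    print(row)
```

Each output row corresponds to one paramsDataset row that matched at least one employee, which is the shape of result I am after.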