Hello, I want to know what the cons and performance impacts are of using a var inside a class object in an RDD.
Here is an example: Animal is a huge class with a large number of val fields (more than 600), but we frequently have to update the age (just one variable) after some computation. What is the best way to do it?

    case class Animal(age: Int, name: String) {
      var animalAge: Int = age
      val animalName: String = name
      val ......
    }

    val animalRdd = sc.parallelize(List(Animal(1, "XYZ"), Animal(2, "ABC")))
    ...
    ...
    animalRdd.map(ani => {
      if (ani.yearChange()) ani.animalAge += 1
      ani
    })

Is it advisable to use a var in this case? Or should I use ani.copy(animalAge=2), which will reallocate the memory for the whole animal? Please advise which is the best way to handle such cases.

Regards,
Hemalatha
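
For reference, here is a minimal sketch of the two options side by side. It uses plain Scala collections instead of an RDD so that it runs without a Spark cluster. The names MutableAnimal, ImmutableAnimal and AnimalSketch are hypothetical two-field stand-ins for the real class, and yearChange here is an invented placeholder predicate, since the real rule is not shown above.

```scala
// Option 1: a mutable field that is updated in place.
class MutableAnimal(age: Int, val animalName: String) extends Serializable {
  var animalAge: Int = age
}

// Option 2: an immutable case class; copy() allocates one new instance.
final case class ImmutableAnimal(animalAge: Int, animalName: String)

object AnimalSketch {
  // Placeholder for the real yearChange() logic (an assumption).
  def yearChange(name: String): Boolean = name == "ABC"

  // Option 1: the map mutates the very objects held by the input collection.
  val mutableInput: List[MutableAnimal] =
    List(new MutableAnimal(1, "XYZ"), new MutableAnimal(2, "ABC"))
  val mutableOutput: List[MutableAnimal] = mutableInput.map { ani =>
    if (yearChange(ani.animalName)) ani.animalAge += 1
    ani
  }

  // Option 2: the input is left untouched; only changed animals are copied.
  val immutableInput: List[ImmutableAnimal] =
    List(ImmutableAnimal(1, "XYZ"), ImmutableAnimal(2, "ABC"))
  val immutableOutput: List[ImmutableAnimal] = immutableInput.map { ani =>
    if (yearChange(ani.animalName)) ani.copy(animalAge = ani.animalAge + 1)
    else ani
  }

  def main(args: Array[String]): Unit = {
    println(mutableInput(1).animalAge)    // 3: the input itself was changed
    println(immutableInput(1).animalAge)  // 2: the input is unchanged
    println(immutableOutput(1).animalAge) // 3
  }
}
```

Two things this sketch is meant to show. First, copy() on a case class is a shallow copy: it allocates one new object and reuses the references to the existing field values, so it does not duplicate the data behind the other fields. Second, the in-place version changes the input objects as a side effect. With an RDD, this may matter if the parent RDD is cached as deserialized objects or if a partition is recomputed, because the increment could then be visible to, or applied again by, other computations on the same data.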