Hi,

I would like to call 'map' on a JavaPairRDD from Scala, and I am not sure how 
to write the function to pass to it. I am using a third-party library that uses 
Spark for geospatial computations, and it returns some of its results 
through the Java API. I'd welcome a hint on how to write a function for 'map' 
that JavaPairRDD will accept.

Here's the type of the value the library returns:
org.apache.spark.api.java.JavaPairRDD[com.vividsolutions.jts.geom.Polygon,java.util.HashSet[com.vividsolutions.jts.geom.Polygon]]
 = org.apache.spark.api.java.JavaPairRDD

Normally I would write something like this:

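import java.util.HashSet
import scala.collection.JavaConverters._  // provides .asScala
import com.vividsolutions.jts.geom.Polygon
def calculate_intersection(polygon: Polygon, hashSet: HashSet[Polygon]) = {
  (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
}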

javapairrdd.map(calculate_intersection)


... but the compiler complains that it's not a Java Function.

My first thought was to implement the interface, i.e.:


// Function2 takes three type parameters: the two argument types and the result type.
class PairRDDWrapper extends org.apache.spark.api.java.function.Function2[
    Polygon, HashSet[Polygon], (Polygon, scala.collection.mutable.Set[Double])] {
  override def call(polygon: Polygon, hashSet: HashSet[Polygon]): (Polygon, scala.collection.mutable.Set[Double]) = {
    (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
  }
}




I am not sure, though, how to use it, or whether it makes any sense in the first 
place. It should be simple; my Java / Scala is just a little rusty.


Cheers,
Lucas
