Pat, I don't seem to find any such Spark-specific code in cf. The cf code itself is engine agnostic, though of course you need some engine to use it. It's similar to the distributed decomposition code in math-scala: it needs some engine to run on, but the code itself is engine agnostic and lives in math-scala. Am I missing something basic here?
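
To illustrate, the algebra in that file is written entirely against the math-scala DSL. A minimal sketch of the kind of expression it uses (assuming the DrmLike type and the RLikeDrmOps operators from math-scala; names recalled from memory, not copied from the file):

    import org.apache.mahout.math.drm._
    import org.apache.mahout.math.drm.RLikeDrmOps._

    // DrmLike is the logical distributed matrix type from math-scala.
    // A'A here only builds a logical plan; the physical execution is
    // supplied by whichever engine backs the matrix at runtime
    // (Spark, H2O, ...).
    def cooccurrenceCore(drmA: DrmLike[Int]): DrmLike[Int] =
      drmA.t %*% drmA

Nothing like that needs a SparkContext at compile time; as far as I can tell only the I/O side does.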
On Thu, Jun 19, 2014 at 11:47 AM, Pat Ferrel <[email protected]> wrote:

> Actually it has several Spark deps, like having a SparkContext, a SparkConf,
> and an rdd for file I/O.
> Please look before you vote. I've been waving this flag for a while: I/O is
> not engine neutral.
>
>
> On Jun 19, 2014, at 11:41 AM, Sebastian Schelter <[email protected]> wrote:
>
> Hi Anand,
>
> Yes, this should not contain anything Spark-specific. +1 for moving it.
>
> --sebastian
>
>
> On 06/19/2014 08:38 PM, Anand Avati wrote:
> > Hi Pat and others,
> > I see that cf/CooccurrenceAnalysis.scala is currently under spark. Is there
> > a specific reason? I see that the code itself is completely Spark agnostic.
> > I tried moving the code under
> > math-scala/src/main/scala/org/apache/mahout/math/cf/ with the following
> > trivial patch:
> >
> > diff --git a/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala b/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
> > index ee44f90..bd20956 100644
> > --- a/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
> > +++ b/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
> > @@ -22,7 +22,6 @@ import scalabindings._
> >  import RLikeOps._
> >  import drm._
> >  import RLikeDrmOps._
> > -import org.apache.mahout.sparkbindings._
> >  import scala.collection.JavaConversions._
> >  import org.apache.mahout.math.stats.LogLikelihood
> >
> > and it seems to work just fine. From what I see, this should work just fine
> > on H2O as well with no changes. Why give up generality and make it Spark
> > specific?
> >
> > Thanks
