Not sure if the previous mail got through; I'm in a car.

No Spark deps in cf/cooccurrence, so it can be moved. The deps are in the I/O code in ItemSimilarityJob, the subject of the PR just before your first email. Sorry for the confusion.

Sent from my iPhone

> On Jun 19, 2014, at 12:06 PM, Anand Avati <[email protected]> wrote:
>
> Pat,
> I don't seem to find such Spark-specific code in cf; the cf code itself is
> engine agnostic. But of course you need some engine to use it, similar to
> the distributed decomposition stuff in math-scala: that needs some engine to
> run, but the code itself is engine agnostic and lives in math-scala. Am I
> missing something basic here?
>
>
>> On Thu, Jun 19, 2014 at 11:47 AM, Pat Ferrel <[email protected]> wrote:
>>
>> Actually it has several Spark deps, like having a SparkContext, a SparkConf,
>> and an RDD for file I/O.
>> Please look before you vote. I've been waving this flag for a while: I/O is
>> not engine neutral.
>>
>>
>> On Jun 19, 2014, at 11:41 AM, Sebastian Schelter <[email protected]> wrote:
>>
>> Hi Anand,
>>
>> Yes, this should not contain anything Spark-specific. +1 for moving it.
>>
>> --sebastian
>>
>>
>>
>>> On 06/19/2014 08:38 PM, Anand Avati wrote:
>>> Hi Pat and others,
>>> I see that cf/CooccurrenceAnalysis.scala is currently under spark. Is there
>>> a specific reason? I see that the code itself is completely Spark agnostic.
>>> I tried moving the code under
>>> math-scala/src/main/scala/org/apache/mahout/math/cf/ with the following
>>> trivial patch:
>>>
>>> diff --git a/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala b/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
>>> index ee44f90..bd20956 100644
>>> --- a/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
>>> +++ b/spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
>>> @@ -22,7 +22,6 @@ import scalabindings._
>>>  import RLikeOps._
>>>  import drm._
>>>  import RLikeDrmOps._
>>> -import org.apache.mahout.sparkbindings._
>>>  import scala.collection.JavaConversions._
>>>  import org.apache.mahout.math.stats.LogLikelihood
>>>
>>> and it seems to work just fine. From what I see, this should work just fine
>>> on H2O as well with no changes. Why give up generality and make it Spark
>>> specific?
>>>
>>> Thanks
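For anyone following along, the split under discussion looks roughly like the sketch below. This is a minimal illustration, not the actual Mahout code: the cooccurrences function and the drmDfsRead call are assumptions made up for this example (mahoutSparkContext is a real sparkbindings helper, but check its signature against your Mahout version).

// Engine-agnostic part: can live in math-scala. It is written purely
// against the DRM algebra, so any backend (Spark, H2O, ...) can run it.
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._

// Hypothetical signature: A'A gives raw co-occurrence counts.
// Note that no Spark types appear anywhere in this half.
def cooccurrences(drmA: DrmLike[Int]): DrmLike[Int] = drmA.t %*% drmA

// Engine-specific part: stays in the spark module (this is Pat's point
// about ItemSimilarityJob). Building a context and reading files into a
// DRM needs Spark underneath.
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "cooccurrence")
val drmA = drmDfsRead("hdfs:///path/to/interactions")  // assumed I/O helper; backed by an RDD
val counts = cooccurrences(drmA.asInstanceOf[DrmLike[Int]])

The trivial patch above works precisely because CooccurrenceAnalysis only ever touches the first half of this split; the SparkContext/SparkConf/RDD dependencies all sit in the I/O half.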
