Maybe you can refer to the sliding method of RDD, but right now it is a private MLlib method.
Look at org.apache.spark.mllib.rdd.RDDFunctions.
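For context, the sliding method in org.apache.spark.mllib.rdd.RDDFunctions groups consecutive elements into fixed-size windows, the same shape as Scala's standard collection `sliding`. A minimal local sketch of that windowing behavior, using plain Scala collections rather than a real RDD (no Spark dependency, so the distributed partition-boundary handling is not shown):

```scala
// Sliding windows over a sequence: each window holds `size`
// consecutive elements, stepping forward one element at a time.
val data = Seq(1, 2, 3, 4, 5)

// Scala standard-library sliding; MLlib's private
// RDDFunctions.sliding yields windows of the same shape,
// but computed across RDD partitions.
val windows = data.sliding(3).toSeq

// Three windows: (1,2,3), (2,3,4), (3,4,5).
windows.foreach(println)
```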
2014-08-26 12:59 GMT+08:00 Vida Ha v...@databricks.com:
Can you paste the code? It's unclear to me how/when the out-of-memory error is occurring without seeing the code.
It sounds like you are adding the same key to every element and then joining, in order to accomplish a full cartesian join? I can imagine that doing it that way would blow up somewhere. There is a cartesian() method that may do this more efficiently.
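RDD.cartesian(other) returns every pair (a, b) with a drawn from one RDD and b from the other, so the add-a-dummy-key-and-join trick is unnecessary. A plain-Scala sketch of the pairs cartesian produces, with local collections standing in for the two RDDs just to show the semantics:

```scala
// Local stand-ins for two RDDs.
val left  = Seq("a", "b")
val right = Seq(1, 2, 3)

// Equivalent of leftRDD.cartesian(rightRDD):
// every (left, right) combination.
val pairs = for {
  l <- left
  r <- right
} yield (l, r)

// 2 x 3 = 6 pairs in total.
assert(pairs.size == left.size * right.size)
```

Note that on real RDDs cartesian still materializes |left| × |right| pairs, so it only avoids the skew of funneling everything through a single join key; the output size itself is unchanged.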
However, if your data set is large, this sort of
Hello everyone,
I am transplanting a clustering algorithm to the Spark platform, and I have met a problem that has been confusing me for a long time; can someone help me?
I have a PairRDD[Integer, Integer] named patternRDD, in which the key
represents a number and the value stores information about the key. And I
Can you paste the code? It's unclear to me how/when the out-of-memory error is occurring without seeing the code.
On Sun, Aug 24, 2014 at 11:37 PM, Gefei Li gefeili.2...@gmail.com wrote: