k.
The machine and Spark settings are as follows.
1) Six machines, each with 32 GB of physical memory.
2) Spark settings
- spark.executor.memory=16g
- spark.closure.serializer=org.apache.spark.serializer.KryoSerializer
- spark.rdd.compress=true
- spark.shuffle.memoryFraction=
I got an answer to the mail I posted to the ML.
--- Summary ---
cache() is lazy, so you can call `RDD.count()` explicitly to load the data
into memory.
---
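For reference, here is a minimal sketch of that advice, not the exact code I ran. The toy ratings, the app name, the local master and the ALS parameters are all assumptions made up for illustration; only `userFeatures`, `productFeatures`, `getStorageLevel` and `count()` come from the discussion above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.storage.StorageLevel

object CacheModelFeatures {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and a made-up app name, for illustration only.
    val conf = new SparkConf().setAppName("CacheModelFeatures").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumption: tiny hypothetical ratings instead of the real data set.
    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 5.0), Rating(1, 20, 1.0),
      Rating(2, 10, 4.0), Rating(2, 30, 2.0)))

    // Assumption: arbitrary rank, iteration count and lambda.
    val model = ALS.train(ratings, 5, 5, 0.01)

    // Mark the feature RDDs for caching only if no storage level is set yet;
    // changing an already assigned storage level is not allowed.
    if (model.userFeatures.getStorageLevel == StorageLevel.NONE) model.userFeatures.cache()
    if (model.productFeatures.getStorageLevel == StorageLevel.NONE) model.productFeatures.cache()

    // cache() is lazy: nothing is stored until an action runs.
    // count() is an action, so it forces both RDDs to be computed and stored.
    val userCount = model.userFeatures.count()
    val productCount = model.productFeatures.count()
    println(s"users=$userCount products=$productCount")

    // The storage level only says how the RDD is configured to be stored,
    // not whether its partitions have already been materialized.
    println(model.userFeatures.getStorageLevel.description)

    sc.stop()
  }
}
```

This would explain why the storage level already showed a memory level before the RDDs were actually in memory: the level is a setting, and the data is loaded only when an action such as `count()` runs.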
I tried it: the two RDDs were cached and the speed improved.
Thank you.
such
> as `model.userFeatures.getStorageLevel()`.
I printed the return value of getStorageLevel() for "userFeatures" and
"productFeatures";
both were "Memory Deserialized 1x Replicated".
I think both variables were configured to be cached,
but didn't cach
Hello.
I am writing a collaborative filtering program using Spark,
but I have trouble with its calculation speed.
I want to implement a recommendation program using ALS (MLlib),
which runs as a separate process from Spark.
But access to the MatrixFactorizationModel object on HDFS is slow,
so I want to cache it, b