e
> much difference in the run times.
>
> -s
>
>
> On 29.09.2016 22:17, Arnau Sanchez wrote:
> > --input ratings --output spark-itemsimilarity --maxSimilaritiesPerItem 10
ritiesPerItem 10 --master yarn-client |& tee
spark-itemsimilarity.out
Thanks!
On Thu, 29 Sep 2016 19:46:03 +0200 Arnau Sanchez <pyar...@gmail.com> wrote:
> Hi Sebastian,
>
> That's weird, it works here. Anyway, a Dropbox link:
>
> https://www.dropbox.com/sh/ex0d74sc
nd create a model every week and need 4
> r3.8xlarge to do it in 1 hour, you only pay 1/168th of what you would for a
> permanent cluster. This brings the cost to a quite reasonable range. You are
> very unlikely to need machines that large anyway but you could afford it if
> you only
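
The 1/168th figure quoted above follows from running the cluster for one hour out of the 168 hours in a week. A minimal sketch of that arithmetic, assuming billing is purely proportional to running time (the function name is hypothetical and no real prices are used):

```python
from fractions import Fraction

HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def on_demand_cost_fraction(hours_per_run, runs_per_week=1):
    """Fraction of a permanent cluster's weekly cost that is paid when the
    same machines run only while the job is executing.

    Assumes cost scales linearly with running hours (an illustrative
    simplification, not an actual pricing model).
    """
    return Fraction(hours_per_run) * runs_per_week / HOURS_PER_WEEK


# One 1-hour model build per week, as in the quoted message.
print(on_demand_cost_fraction(1))  # 1/168
```

The number of machines cancels out of the ratio, so the same fraction applies whether the job needs one instance or four.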
On Sun, 25 Sep 2016 09:01:43 -0700 Pat Ferrel wrote:
> AWS EMR is usually not very well suited for Spark.
What infrastructure would you recommend? Some EC2 instances provide lots of
memory (though maybe not at the most competitive price: r3.8xlarge, 244 GB
RAM).
My
I've been using the Mahout itemsimilarity job for a while, with good results. I
read that the new spark-itemsimilarity job is typically faster, by a factor of
10, so I wanted to give it a try. I must be doing something wrong because, with
the same EMR infrastructure, the spark job is slower