I will increase memory for the job... that will also fix it, right?
On Apr 10, 2015 12:43 PM, Reza Zadeh r...@databricks.com wrote:
You should pull in this PR: https://github.com/apache/spark/pull/5364
It should resolve that. It is in master.
Best,
Reza
Depends... The heartbeat timeouts you are seeing happen due to GC pressure (probably
due to full GC). If you increase the memory too much, the GCs may be less
frequent, but the full GCs may take longer. Try increasing the following conf:
spark.executor.heartbeatInterval
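As a sketch of how such a setting is usually applied (via spark-defaults.conf or --conf on spark-submit) — the values below are illustrative assumptions, not recommendations from this thread, and spark.network.timeout is an additional conf often raised alongside the heartbeat interval, not one named in the snippet; note that older Spark versions may expect plain millisecond numbers rather than unit suffixes:

```
# spark-defaults.conf -- illustrative values only
spark.executor.heartbeatInterval   30s
spark.network.timeout              300s
```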
On Fri, Apr 10, 2015 at 8:32 AM, Debasish Das debasish.da...@gmail.com
wrote:
Hi,
I am benchmarking row vs col similarity flow on 60M x 10M matrices...
Details are in
MapReduce impl and the Spark DSL impl, per your preference.
From: Andrew Musselman andrew.mussel...@gmail.com
To: Reza Zadeh r...@databricks.com
Cc: user user@spark.apache.org
Sent: Saturday, January 17, 2015 11:29 AM
Subject: Re: Row similarities
Thanks Reza, interesting approach. I think what I actually want is to
calculate pair-wise distance, on second thought. Is there a pattern for
that?
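The "pattern" Andrew asks about is not answered in the snippets above; as a hedged illustration (plain Python, not code from the thread), the brute-force version of pair-wise distance between all rows looks like the sketch below. In Spark the analogous pattern would be a cartesian join on an indexed RDD, which is O(n^2) in the number of rows and only feasible at modest scale:

```python
from itertools import combinations
import math

def pairwise_distances(rows):
    """Brute-force pair-wise Euclidean distance between all row vectors.

    Returns a dict mapping index pairs (i, j) with i < j to distances.
    """
    dists = {}
    for (i, u), (j, v) in combinations(enumerate(rows), 2):
        dists[(i, j)] = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return dists

rows = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
d = pairwise_distances(rows)
print(d[(0, 1)])  # 5.0 (the classic 3-4-5 triangle)
```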
On Jan 16, 2015, at 9:53 PM, Reza Zadeh r...@databricks.com wrote:
You can use K-means https://spark.apache.org/docs/latest/mllib-clustering.html with a
suitably large k. Each cluster should correspond to rows that are similar
to one another.
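A minimal sketch of the idea Reza describes. This is a toy pure-Python Lloyd's k-means, not MLlib's KMeans API; rows that land in the same cluster are treated as "similar" to one another:

```python
import math
import random

def kmeans(rows, k, iters=20, seed=0):
    """Tiny Lloyd's k-means: returns k lists of rows (the clusters)."""
    rnd = random.Random(seed)
    centroids = rnd.sample(rows, k)          # pick k distinct rows as initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for r in rows:
            # assign each row to its nearest centroid (squared Euclidean distance)
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])))
            clusters[i].append(r)
        # recompute centroids as cluster means; keep old centroid if cluster is empty
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return clusters

rows = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]]
clusters = kmeans(rows, 2)   # the two near-origin rows end up together
```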
On Fri, Jan 16, 2015 at 5:18 PM, Andrew Musselman
andrew.mussel...@gmail.com wrote:
What's a good way to calculate similarities between all vector-rows in a
matrix or RDD[Vector]?
I'm seeing RowMatrix has a columnSimilarities method but I'm not sure I'm
going down a good path to transpose a matrix in order to run that.
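As a pure-Python illustration of why the transpose route works (an assumption about the intent, not code from the thread): the cosine similarities between the rows of a matrix are exactly the column similarities of its transpose — the catch in Spark is that materializing the transpose of a large RowMatrix is itself the expensive step:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def column_similarities(mat):
    """Cosine similarity between every pair of columns, as {(i, j): sim}."""
    cols = list(zip(*mat))
    return {(i, j): cosine(cols[i], cols[j])
            for i in range(len(cols)) for j in range(i + 1, len(cols))}

mat = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0]]
transpose = [list(row) for row in zip(*mat)]
row_sims = column_similarities(transpose)   # similarities between the ROWS of mat
print(row_sims[(0, 1)])  # 0.5
```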