sorry i meant to say SPARK-18980
On Sat, Jan 21, 2017 at 1:48 AM, Koert Kuipers wrote:
> found it :) SPARK-1890
> thanks cloud-fan
>
> On Sat, Jan 21, 2017 at 1:46 AM, Koert Kuipers wrote:
>
>> trying to replicate this in spark itself, i can for v2.1.0 but not for
>> master. i guess it has been
In downstream stages the labels & features are generally expected to be
doubles, so it's easier to use a double.
On Sat, Jan 21, 2017 at 5:32 PM Shiyuan wrote:
> Hi Spark,
> StringIndexer uses double instead of int for indexing
> http://spark.apache.org/docs/latest/ml-features.html#stringindexe
Hi Spark,
StringIndexer uses double instead of int for indexing:
http://spark.apache.org/docs/latest/ml-features.html#stringindexer. What's
the rationale for using double to index? Would it be more appropriate to
use int to index (which would be consistent with other places like Vector.sparse)?
Shiyuan
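As an illustration of the point above, here is a minimal sketch of StringIndexer; the column names ("category", "categoryIndex") and the toy data are made up for the example. The indexed output column comes back as a DoubleType, which is what downstream stages generally expect for labels and features:

import org.apache.spark.ml.feature.StringIndexer
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("stringindexer-demo").getOrCreate()
import spark.implicits._

// toy input frame with a single string column
val df = Seq("a", "b", "a", "c").toDF("category")

// StringIndexer maps each distinct string to a double index (0.0, 1.0, ...)
val indexer = new StringIndexer()
  .setInputCol("category")
  .setOutputCol("categoryIndex")

val indexed = indexer.fit(df).transform(df)

// the schema shows categoryIndex as a double, not an int
indexed.printSchema()
indexed.show()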
i noticed when doing maven deploy for spark (for an in-house release) that it
tries to upload certain artifacts multiple times. for example it tried to
upload the spark-network-common tests jar twice.
our in-house repo doesn't appreciate this for releases. it will refuse the
second time.
also it makes no s
I wouldn't say that Executors are dumb, but there are some pretty clear
divisions of concepts and responsibilities across the different pieces of
the Spark architecture. A Job is a concept that is completely unknown to an
Executor, which deals instead with just the Tasks that it is given. So you
a
No
Thank you.
Daniel
On 20 Jan 2017, at 23:28, kant kodali <kanth...@gmail.com> wrote:
Hi,
I am running spark standalone with no storage. When I use spark-submit to
submit my job I get the following Exception, and I wonder if this is something
to worry about?
java.io.IOException: HAD
Executors are "dumb", i.e. they execute TaskRunners for tasks and...that's it.
Your logic should be on the driver that can intercept events
and...trigger cleanup.
I don't think there's another way to do it.
Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spa
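To make the driver-side approach concrete, here is a minimal sketch, assuming the goal is to run some cleanup when a job finishes; the listener class name and the cleanup body are hypothetical, not anything from the thread:

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// runs on the driver, where a Job is a known concept;
// executors only ever see the individual tasks they are given
class CleanupListener extends SparkListener {
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    // hypothetical cleanup hook; replace with whatever resources you actually manage
    println(s"Job ${jobEnd.jobId} finished, running cleanup")
  }
}

// register the listener on the driver's SparkContext:
//   sc.addSparkListener(new CleanupListener())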
I am working with datasets on the order of 200 GB using 286 cores divided
across 143 executors. Each executor has 32 GB (which makes every core 15
GB). And I am using Spark 1.6.
I would like to tune spark.locality.wait. Can anyone give me a
range for the values of spark.locality.wait that
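For what it's worth, here is a minimal sketch of how the setting is applied, assuming a plain SparkConf; the values shown (e.g. 10s) are placeholders for illustration, not recommendations, and the default for spark.locality.wait is 3s:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("locality-wait-demo")
  // overall wait before scheduling falls back to a less-local level; default is 3s
  .set("spark.locality.wait", "10s")
  // optional per-level overrides; they fall back to spark.locality.wait if unset
  .set("spark.locality.wait.node", "10s")
  .set("spark.locality.wait.rack", "5s")

val sc = new SparkContext(conf)

The same settings can also be passed at submit time, e.g. --conf spark.locality.wait=10s on spark-submit.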