Hello,
In MLlib with Spark 1.4, I was able to evaluate a model by loading it and
calling `predict` on a vector of features. I would train on Spark but use
my model in my own workflow.
In `spark.ml` it seems the only way to evaluate is to use `transform`,
which only takes a DataFrame. To build a DataFrame
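For what it's worth, a minimal sketch of the DataFrame-wrapping step, spark-shell style (assumes Spark 1.6 with `sc` and `sqlContext` in scope; the `model` variable in the comment is hypothetical, standing in for a fitted spark.ml model loaded elsewhere):

```scala
// Wrap a single feature vector in a one-row DataFrame so that a
// spark.ml model's transform() can score it. "features" is the default
// input column name for spark.ml estimators.
import org.apache.spark.mllib.linalg.Vectors

val single = sqlContext.createDataFrame(
  Seq(Tuple1(Vectors.dense(0.5, 1.2, 3.4)))
).toDF("features")

// With a fitted model loaded elsewhere (hypothetical variable `model`):
// val scored = model.transform(single)
// scored.select("prediction").first()
```

The per-call DataFrame construction adds overhead compared to the old `predict(Vector)` path, but it keeps the pipeline's feature-handling metadata intact.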
In Spark 1.6, if I do this (the column name contains a dot but is not a
nested column):

df = df.withColumn("raw.hourOfDay", df.col("`raw.hourOfDay`"))

I get:

scala> df = df.withColumn("raw.hourOfDay", df.col("`raw.hourOfDay`"))
org.apache.spark.sql.AnalysisException: cannot
resolve 'raw.minOfDay' given input colu
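As a side note, a small sketch of how dotted names behave, spark-shell style (Spark 1.6 with `sqlContext` in scope assumed; the rename-based workaround is my assumption of a practical way around this, not a confirmed fix for the `withColumn` case itself):

```scala
// A dot in a column name is parsed as struct-field access (a.b), so a
// flat name containing a dot must be backtick-quoted wherever it goes
// through the parser.
val df = sqlContext.createDataFrame(Seq((10, "x"))).toDF("raw.hourOfDay", "label")

// Backticks tell the analyzer this is one flat column name:
df.select(df.col("`raw.hourOfDay`")).show()

// Workaround sketch: rename to a dot-free name up front, then work normally.
val clean = df.withColumnRenamed("raw.hourOfDay", "raw_hourOfDay")
clean.printSchema()
```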
Hello,
When I used to submit a job with Spark 1.4, it would return a job ID and a
status such as RUNNING or FAILED. I just upgraded to 1.6 and there is no
status returned by spark-submit. Is there a way to get this information
back? When I submit a job I want to know which one it is.
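One option I'm aware of (assuming it fits your setup): Spark 1.6 ships a programmatic launcher, `SparkLauncher`, whose `startApplication` returns a `SparkAppHandle` that reports the application ID and state transitions. The jar path and main class below are placeholders:

```scala
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

// Launch the application and subscribe to state transitions instead of
// scraping spark-submit's console output.
val handle: SparkAppHandle = new SparkLauncher()
  .setAppResource("/path/to/my-app.jar") // placeholder path
  .setMainClass("com.example.MyApp")     // placeholder class
  .setMaster("local[*]")
  .startApplication(new SparkAppHandle.Listener {
    override def stateChanged(h: SparkAppHandle): Unit =
      println(s"state=${h.getState} appId=${h.getAppId}")
    override def infoChanged(h: SparkAppHandle): Unit = ()
  })
```

For standalone or Mesos cluster mode, `spark-submit --status <submissionId>` can also query the state of an earlier submission.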
uffice.
>
> My $0.02,
> dean
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://p
Hi,
I'm working on a Spark Streaming application and I would like to know what
the best storage for checkpointing is.
For testing purposes we are using NFS shared between the workers, the
master and the driver program (in client mode),
but we have some issues with the CheckpointWriter (1 thread
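For context, a minimal sketch of where the checkpoint path plugs in (the HDFS URL is a placeholder; whether NFS suffices depends on your fault-tolerance needs, but a fault-tolerant distributed filesystem such as HDFS or S3 is the usual recommendation for production):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("checkpoint-demo").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(10))

// The checkpoint location is just a Hadoop-compatible path; metadata and
// RDD checkpoints are written there by the CheckpointWriter.
ssc.checkpoint("hdfs://namenode:8020/spark/checkpoints") // placeholder URL
```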
It did the job.
Thanks. :)
On 19 August 2014 at 10:20, Sean Owen wrote:
> In that case, why not collectAsMap() and have the whole result as a
> simple Map in memory? Then lookups are trivial. RDDs aren't
> distributed maps.
>
> On Tue, Aug 19, 2014 at 9:17 AM, Emmanue
> It will never be efficient like a database lookup since this is
> implemented by scanning through all of the data. There is no index or
> anything.
>
> On Tue, Aug 19, 2014 at 8:43 AM, Emmanuel Castanier
> wrote:
>> Hi all,
>>
>> I'm a total newbie on Spark,
this:
myRdd.filter(t => t._1.equals(param))
If I do a collect to get the only "tuple", it takes about 12 seconds to
execute; I imagine that's because Spark is meant to be used differently...
Best regards,
Emmanuel
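Sketching Sean's suggestion against the filter-based lookup, spark-shell style (`sc` assumed in scope; this only applies when the pair RDD fits in driver memory):

```scala
// Full scan: filter() examines every element on every query, which is
// why a single lookup over the RDD takes seconds.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("c", 3)))
val viaFilter = pairs.filter(t => t._1 == "b").collect() // Array((b,2))

// One-time collect into a plain Map; after that, lookups are in-memory.
val asMap: scala.collection.Map[String, Int] = pairs.collectAsMap()
val hit = asMap.get("b") // Some(2)
```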