Hi,
I don't know why I receive the following message:
WARN KMeans: The input data is not directly cached, which may hurt
performance if its parent RDDs are also uncached.
when I try to use Spark KMeans:
df_Part = assembler.transform(df_Part)
df_Part.cache()
while (k <= max_cluster) and (wssse > seuilStop):
Hi,
Is there any way to convert a Spark DataFrame into a NumPy ndarray without
using the toPandas operation?
Example:
C1   C2   C3    C4
0.7  3.0  1000  10954
0.9  4.2  1200  12345
I want to get this output:
[(0.7, 3.0, 1000L, 10954),(0.9, 4.2, 1200L, 12345)],
dtype=[('C1', '
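
One possible approach, offered as a minimal sketch rather than a definitive answer: the rows returned by `df.collect()` behave like tuples, so they can be passed to `numpy.array` together with a structured dtype. The snippet below uses a plain list of tuples as a stand-in for `df.collect()` so that it runs without a Spark session. The helper name `rows_to_structured_array` and the field types `f8`/`i8` are assumptions, since the dtype in the message above is cut off. Note that `collect()` brings every row to the driver, so this only suits data that fits in driver memory.

```python
import numpy as np


def rows_to_structured_array(rows, dtype):
    """Build a NumPy structured array from an iterable of row tuples.

    Hypothetical helper: `rows` is expected to be what df.collect() returns
    (PySpark Row objects are tuples), or any list of plain tuples.
    """
    # Structured dtypes require real tuples, so normalise each row first.
    return np.array([tuple(r) for r in rows], dtype=dtype)


# Stand-in for df.collect(), using the example values from the question.
rows = [(0.7, 3.0, 1000, 10954), (0.9, 4.2, 1200, 12345)]

# Assumed field types: floats for C1/C2, 64-bit integers for C3/C4.
dtype = [("C1", "f8"), ("C2", "f8"), ("C3", "i8"), ("C4", "i8")]

arr = rows_to_structured_array(rows, dtype)
print(arr)
print(arr.dtype.names)
```

With a real DataFrame, `rows` would be replaced by `df.collect()` and the dtype would be chosen to match the DataFrame's schema.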
Here you can find more information about the code of my class
"RandomForestRegression.java":
http://spark.apache.org/docs/latest/mllib-ensembles.html#regression
2016-08-11 10:18 GMT+02:00 Zakaria Hili :
> Hi,
>
> I recognize that spark can't save generated model
Hi,
I have noticed that Spark can't save the generated model on HDFS (I used random
forest regression and linear regression for this test).
It saves only the data directory, as you can see in the picture below:
[image: Inline image 1]
but to load a model I will need some data from metadata di
Hi,
I create a dataframe using a schema, but when I try to create a model, I
receive this error:
requirement failed: Column features must be of type
org.apache.spark.mllib.linalg.VectorUDT@f71b0bce but was actually
ArrayType(StringType,true)
Here is the relevant piece of code:
SQLContext sqlContext = SQL
Hi,
I am a newbie in Spark and I want to ask you a simple question.
I have a JavaDStream which contains data selected from a SQL database,
something like (id, user, score, ...),
and I want to convert the JavaDStream to a DataFrame.
How can I do this in Java?
Thank you
I want to use Spark Streaming to read data from an RDBMS database like MySQL,
but I don't know how to do this using JavaStreamingContext:
JavaStreamingContext jssc = new JavaStreamingContext(conf,
        Durations.milliseconds(500));
DataFrame df = jssc. ??
I searched on the internet but I didn't find anythi